public inbox for gcc-patches@gcc.gnu.org
* [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target.
@ 2023-08-31  9:08 Chenghui Pan
  2023-08-31  9:08 ` [PATCH v6 1/4] LoongArch: Add Loongson SX base instruction support Chenghui Pan
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Chenghui Pan @ 2023-08-31  9:08 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua, Chenghui Pan

This is an update of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628303.html

Changes since last version of patch set:
- "dg-skip-if"-related Changes of the g++.dg/torture/vshuf* testcases are reverted.
  (Replaced by __builtin_shuffle fix)
- Add fix of __builtin_shuffle() for Loongson SX/ASX (Implemeted by adding
  vand/xvand insn in front of shuffle operation). There's no significant performance
  impact in current state.
- Rebased on the top of Yang Yujie's latest target configuration interface patch set
  (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628772.html).
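
For reference, a minimal sketch of the kind of code that exercises
__builtin_shuffle on a 128-bit vector type (illustration only, not part of
the patch; it uses the generic GCC vector extensions, not the new intrinsic
headers):

  typedef int v4si __attribute__ ((vector_size (16)));

  /* A two-operand shuffle with a runtime mask.  Per the note above, a
     vand insn (xvand for 256-bit vectors) is now emitted in front of the
     shuffle so that only the lane-selecting bits of the mask take effect.  */
  v4si
  shuffle_var (v4si a, v4si b, v4si mask)
  {
    return __builtin_shuffle (a, b, mask);
  }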

Brief history of patch set:
v1 -> v2:
- Reduce usage of "unspec" in RTL templates.
- Add support of ADDR_REG_REG in LSX and LASX.
- Constraint docs are added to gcc/doc/md.texi and the comment block in
  constraints.md.
- Code related to vecarg is removed.
- Testsuites for LSX and LASX are added in v2 (because of the size limitation
  of the mailing list, these patches are not shown).
- Adjust the loongarch_expand_vector_init () function to reduce the number of
  emitted instructions.
- Some minor implementation changes to RTL templates.

v2 -> v3:
- Revert vabsd/xvabsd RTL templates to the unspec implementation.
- Resolve a warning in gcc/config/loongarch/loongarch.cc when bootstrapping
  with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
- Remove redundant definitions in lasxintrin.h.
- Refine commit info.

v3 -> v4:
- Code simplification.
- Testsuite patches are split from this patch set again and will be
  submitted independently in the future.

v4 -> v5:
- Regression test fix (pr54346.c).
- Combine the RTL template implementations of the vilvh/xvilvh insns.
- Add dg-skip-if for loongarch*-*-* to the vshuf tests inside g++.dg/torture
  (reverted in this version).

Lulu Cheng (4):
  LoongArch: Add Loongson SX base instruction support.
  LoongArch: Add Loongson SX directive builtin function support.
  LoongArch: Add Loongson ASX base instruction support.
  LoongArch: Add Loongson ASX directive builtin function support.
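
As a usage sketch (not taken from the series itself), a simple loop such as
the one below is the sort of code the base instruction support lets the
vectorizer target when building with the -ftree-vectorize/-mlasx style flags
quoted in the v2 -> v3 notes above:

  /* Sketch only: with the new vector modes and patterns available, the
     tree vectorizer can turn this scalar loop into LSX/LASX vector adds.  */
  void
  vec_add (int *restrict a, const int *restrict b, const int *restrict c,
	   int n)
  {
    for (int i = 0; i < n; i++)
      a[i] = b[i] + c[i];
  }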

 gcc/config.gcc                                |    2 +-
 gcc/config/loongarch/constraints.md           |  131 +-
 gcc/config/loongarch/genopts/loongarch.opt.in |    4 +
 gcc/config/loongarch/lasx.md                  | 5104 ++++++++++++++++
 gcc/config/loongarch/lasxintrin.h             | 5338 +++++++++++++++++
 gcc/config/loongarch/loongarch-builtins.cc    | 2686 ++++++++-
 gcc/config/loongarch/loongarch-ftypes.def     |  666 +-
 gcc/config/loongarch/loongarch-modes.def      |   39 +
 gcc/config/loongarch/loongarch-protos.h       |   35 +
 gcc/config/loongarch/loongarch.cc             | 4751 ++++++++++++++-
 gcc/config/loongarch/loongarch.h              |  117 +-
 gcc/config/loongarch/loongarch.md             |   56 +-
 gcc/config/loongarch/loongarch.opt            |    4 +
 gcc/config/loongarch/lsx.md                   | 4467 ++++++++++++++
 gcc/config/loongarch/lsxintrin.h              | 5181 ++++++++++++++++
 gcc/config/loongarch/predicates.md            |  333 +-
 gcc/doc/md.texi                               |   11 +
 17 files changed, 28645 insertions(+), 280 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md
 create mode 100644 gcc/config/loongarch/lasxintrin.h
 create mode 100644 gcc/config/loongarch/lsx.md
 create mode 100644 gcc/config/loongarch/lsxintrin.h

-- 
2.36.0



* [PATCH v6 1/4] LoongArch: Add Loongson SX base instruction support.
  2023-08-31  9:08 [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
@ 2023-08-31  9:08 ` Chenghui Pan
  2023-08-31  9:08 ` [PATCH v6 2/4] LoongArch: Add Loongson SX directive builtin function support Chenghui Pan
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Chenghui Pan @ 2023-08-31  9:08 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config/loongarch/constraints.md (M): Add Loongson LSX base instruction support.
	(N): Ditto.
	(O): Ditto.
	(P): Ditto.
	(R): Ditto.
	(S): Ditto.
	(YG): Ditto.
	(YA): Ditto.
	(YB): Ditto.
	(Yb): Ditto.
	(Yh): Ditto.
	(Yw): Ditto.
	(YI): Ditto.
	(YC): Ditto.
	(YZ): Ditto.
	(Unv5): Ditto.
	(Uuv5): Ditto.
	(Usv5): Ditto.
	(Uuv6): Ditto.
	(Urv8): Ditto.
	* config/loongarch/genopts/loongarch.opt.in: Ditto.
	* config/loongarch/loongarch-builtins.cc (loongarch_gen_const_int_vector): Ditto.
	* config/loongarch/loongarch-modes.def (VECTOR_MODES): Ditto.
	(VECTOR_MODE): Ditto.
	(INT_MODE): Ditto.
	* config/loongarch/loongarch-protos.h (loongarch_split_move_insn_p): Ditto.
	(loongarch_split_move_insn): Ditto.
	(loongarch_split_128bit_move): Ditto.
	(loongarch_split_128bit_move_p): Ditto.
	(loongarch_split_lsx_copy_d): Ditto.
	(loongarch_split_lsx_insert_d): Ditto.
	(loongarch_split_lsx_fill_d): Ditto.
	(loongarch_expand_vec_cmp): Ditto.
	(loongarch_const_vector_same_val_p): Ditto.
	(loongarch_const_vector_same_bytes_p): Ditto.
	(loongarch_const_vector_same_int_p): Ditto.
	(loongarch_const_vector_shuffle_set_p): Ditto.
	(loongarch_const_vector_bitimm_set_p): Ditto.
	(loongarch_const_vector_bitimm_clr_p): Ditto.
	(loongarch_lsx_vec_parallel_const_half): Ditto.
	(loongarch_gen_const_int_vector): Ditto.
	(loongarch_lsx_output_division): Ditto.
	(loongarch_expand_vector_init): Ditto.
	(loongarch_expand_vec_unpack): Ditto.
	(loongarch_expand_vec_perm): Ditto.
	(loongarch_expand_vector_extract): Ditto.
	(loongarch_expand_vector_reduc): Ditto.
	(loongarch_ldst_scaled_shift): Ditto.
	(loongarch_expand_vec_cond_expr): Ditto.
	(loongarch_expand_vec_cond_mask_expr): Ditto.
	(loongarch_builtin_vectorized_function): Ditto.
	(loongarch_gen_const_int_vector_shuffle): Ditto.
	(loongarch_build_signbit_mask): Ditto.
	* config/loongarch/loongarch.cc (loongarch_pass_aggregate_num_fpr): Ditto.
	(loongarch_setup_incoming_varargs): Ditto.
	(loongarch_emit_move): Ditto.
	(loongarch_const_vector_bitimm_set_p): Ditto.
	(loongarch_const_vector_bitimm_clr_p): Ditto.
	(loongarch_const_vector_same_val_p): Ditto.
	(loongarch_const_vector_same_bytes_p): Ditto.
	(loongarch_const_vector_same_int_p): Ditto.
	(loongarch_const_vector_shuffle_set_p): Ditto.
	(loongarch_symbol_insns): Ditto.
	(loongarch_cannot_force_const_mem): Ditto.
	(loongarch_valid_offset_p): Ditto.
	(loongarch_valid_index_p): Ditto.
	(loongarch_classify_address): Ditto.
	(loongarch_address_insns): Ditto.
	(loongarch_ldst_scaled_shift): Ditto.
	(loongarch_const_insns): Ditto.
	(loongarch_split_move_insn_p): Ditto.
	(loongarch_subword_at_byte): Ditto.
	(loongarch_legitimize_move): Ditto.
	(loongarch_builtin_vectorization_cost): Ditto.
	(loongarch_split_move_p): Ditto.
	(loongarch_split_move): Ditto.
	(loongarch_split_move_insn): Ditto.
	(loongarch_output_move_index_float): Ditto.
	(loongarch_split_128bit_move_p): Ditto.
	(loongarch_split_128bit_move): Ditto.
	(loongarch_split_lsx_copy_d): Ditto.
	(loongarch_split_lsx_insert_d): Ditto.
	(loongarch_split_lsx_fill_d): Ditto.
	(loongarch_output_move): Ditto.
	(loongarch_extend_comparands): Ditto.
	(loongarch_print_operand_reloc): Ditto.
	(loongarch_print_operand): Ditto.
	(loongarch_hard_regno_mode_ok_uncached): Ditto.
	(loongarch_hard_regno_call_part_clobbered): Ditto.
	(loongarch_hard_regno_nregs): Ditto.
	(loongarch_class_max_nregs): Ditto.
	(loongarch_can_change_mode_class): Ditto.
	(loongarch_mode_ok_for_mov_fmt_p): Ditto.
	(loongarch_secondary_reload): Ditto.
	(loongarch_vector_mode_supported_p): Ditto.
	(loongarch_preferred_simd_mode): Ditto.
	(loongarch_autovectorize_vector_modes): Ditto.
	(loongarch_lsx_output_division): Ditto.
	(loongarch_option_override_internal): Ditto.
	(loongarch_hard_regno_caller_save_mode): Ditto.
	(MAX_VECT_LEN): Ditto.
	(loongarch_spill_class): Ditto.
	(struct expand_vec_perm_d): Ditto.
	(loongarch_promote_function_mode): Ditto.
	(loongarch_expand_vselect): Ditto.
	(loongarch_starting_frame_offset): Ditto.
	(loongarch_expand_vselect_vconcat): Ditto.
	(TARGET_ASM_ALIGNED_DI_OP): Ditto.
	(TARGET_OPTION_OVERRIDE): Ditto.
	(TARGET_LEGITIMIZE_ADDRESS): Ditto.
	(TARGET_ASM_SELECT_RTX_SECTION): Ditto.
	(TARGET_ASM_FUNCTION_RODATA_SECTION): Ditto.
	(loongarch_expand_lsx_shuffle): Ditto.
	(TARGET_SCHED_INIT): Ditto.
	(TARGET_SCHED_REORDER): Ditto.
	(TARGET_SCHED_REORDER2): Ditto.
	(TARGET_SCHED_VARIABLE_ISSUE): Ditto.
	(TARGET_SCHED_ADJUST_COST): Ditto.
	(TARGET_SCHED_ISSUE_RATE): Ditto.
	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Ditto.
	(TARGET_FUNCTION_OK_FOR_SIBCALL): Ditto.
	(TARGET_VALID_POINTER_MODE): Ditto.
	(TARGET_REGISTER_MOVE_COST): Ditto.
	(TARGET_MEMORY_MOVE_COST): Ditto.
	(TARGET_RTX_COSTS): Ditto.
	(TARGET_ADDRESS_COST): Ditto.
	(TARGET_IN_SMALL_DATA_P): Ditto.
	(TARGET_PREFERRED_RELOAD_CLASS): Ditto.
	(TARGET_ASM_FILE_START_FILE_DIRECTIVE): Ditto.
	(TARGET_EXPAND_BUILTIN_VA_START): Ditto.
	(loongarch_expand_vec_perm): Ditto.
	(TARGET_PROMOTE_FUNCTION_MODE): Ditto.
	(TARGET_RETURN_IN_MEMORY): Ditto.
	(TARGET_FUNCTION_VALUE): Ditto.
	(TARGET_LIBCALL_VALUE): Ditto.
	(loongarch_try_expand_lsx_vshuf_const): Ditto.
	(TARGET_ASM_OUTPUT_MI_THUNK): Ditto.
	(TARGET_ASM_CAN_OUTPUT_MI_THUNK): Ditto.
	(TARGET_PRINT_OPERAND): Ditto.
	(TARGET_PRINT_OPERAND_ADDRESS): Ditto.
	(TARGET_PRINT_OPERAND_PUNCT_VALID_P): Ditto.
	(TARGET_SETUP_INCOMING_VARARGS): Ditto.
	(TARGET_STRICT_ARGUMENT_NAMING): Ditto.
	(TARGET_MUST_PASS_IN_STACK): Ditto.
	(TARGET_PASS_BY_REFERENCE): Ditto.
	(TARGET_ARG_PARTIAL_BYTES): Ditto.
	(TARGET_FUNCTION_ARG): Ditto.
	(TARGET_FUNCTION_ARG_ADVANCE): Ditto.
	(TARGET_FUNCTION_ARG_BOUNDARY): Ditto.
	(TARGET_SCALAR_MODE_SUPPORTED_P): Ditto.
	(TARGET_INIT_BUILTINS): Ditto.
	(loongarch_expand_vec_perm_const_1): Ditto.
	(loongarch_expand_vec_perm_const_2): Ditto.
	(loongarch_vectorize_vec_perm_const): Ditto.
	(loongarch_cpu_sched_reassociation_width): Ditto.
	(loongarch_sched_reassociation_width): Ditto.
	(loongarch_expand_vector_extract): Ditto.
	(emit_reduc_half): Ditto.
	(loongarch_expand_vector_reduc): Ditto.
	(loongarch_expand_vec_unpack): Ditto.
	(loongarch_lsx_vec_parallel_const_half): Ditto.
	(loongarch_constant_elt_p): Ditto.
	(loongarch_gen_const_int_vector_shuffle): Ditto.
	(loongarch_expand_vector_init): Ditto.
	(loongarch_expand_lsx_cmp): Ditto.
	(loongarch_expand_vec_cond_expr): Ditto.
	(loongarch_expand_vec_cond_mask_expr): Ditto.
	(loongarch_expand_vec_cmp): Ditto.
	(loongarch_case_values_threshold): Ditto.
	(loongarch_build_const_vector): Ditto.
	(loongarch_build_signbit_mask): Ditto.
	(loongarch_builtin_support_vector_misalignment): Ditto.
	(TARGET_ASM_ALIGNED_HI_OP): Ditto.
	(TARGET_ASM_ALIGNED_SI_OP): Ditto.
	(TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Ditto.
	(TARGET_VECTOR_MODE_SUPPORTED_P): Ditto.
	(TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
	(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.
	(TARGET_VECTORIZE_VEC_PERM_CONST): Ditto.
	(TARGET_SCHED_REASSOCIATION_WIDTH): Ditto.
	(TARGET_CASE_VALUES_THRESHOLD): Ditto.
	(TARGET_HARD_REGNO_CALL_PART_CLOBBERED): Ditto.
	(TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): Ditto.
	* config/loongarch/loongarch.h (TARGET_SUPPORTS_WIDE_INT): Ditto.
	(UNITS_PER_LSX_REG): Ditto.
	(BITS_PER_LSX_REG): Ditto.
	(BIGGEST_ALIGNMENT): Ditto.
	(LSX_REG_FIRST): Ditto.
	(LSX_REG_LAST): Ditto.
	(LSX_REG_NUM): Ditto.
	(LSX_REG_P): Ditto.
	(LSX_REG_RTX_P): Ditto.
	(IMM13_OPERAND): Ditto.
	(LSX_SUPPORTED_MODE_P): Ditto.
	* config/loongarch/loongarch.md (unknown,add,sub,not,nor,and,or,xor): Ditto.
	(unknown,add,sub,not,nor,and,or,xor,simd_add): Ditto.
	(unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC): Ditto.
	(mode" ): Ditto.
	(DF): Ditto.
	(SF): Ditto.
	(sf): Ditto.
	(DI): Ditto.
	(SI): Ditto.
	* config/loongarch/loongarch.opt: Ditto.
	* config/loongarch/predicates.md (const_lsx_branch_operand): Ditto.
	(const_uimm3_operand): Ditto.
	(const_8_to_11_operand): Ditto.
	(const_12_to_15_operand): Ditto.
	(const_uimm4_operand): Ditto.
	(const_uimm6_operand): Ditto.
	(const_uimm7_operand): Ditto.
	(const_uimm8_operand): Ditto.
	(const_imm5_operand): Ditto.
	(const_imm10_operand): Ditto.
	(const_imm13_operand): Ditto.
	(reg_imm10_operand): Ditto.
	(aq8b_operand): Ditto.
	(aq8h_operand): Ditto.
	(aq8w_operand): Ditto.
	(aq8d_operand): Ditto.
	(aq10b_operand): Ditto.
	(aq10h_operand): Ditto.
	(aq10w_operand): Ditto.
	(aq10d_operand): Ditto.
	(aq12b_operand): Ditto.
	(aq12h_operand): Ditto.
	(aq12w_operand): Ditto.
	(aq12d_operand): Ditto.
	(const_m1_operand): Ditto.
	(reg_or_m1_operand): Ditto.
	(const_exp_2_operand): Ditto.
	(const_exp_4_operand): Ditto.
	(const_exp_8_operand): Ditto.
	(const_exp_16_operand): Ditto.
	(const_exp_32_operand): Ditto.
	(const_0_or_1_operand): Ditto.
	(const_0_to_3_operand): Ditto.
	(const_0_to_7_operand): Ditto.
	(const_2_or_3_operand): Ditto.
	(const_4_to_7_operand): Ditto.
	(const_8_to_15_operand): Ditto.
	(const_16_to_31_operand): Ditto.
	(qi_mask_operand): Ditto.
	(hi_mask_operand): Ditto.
	(si_mask_operand): Ditto.
	(d_operand): Ditto.
	(db4_operand): Ditto.
	(db7_operand): Ditto.
	(db8_operand): Ditto.
	(ib3_operand): Ditto.
	(sb4_operand): Ditto.
	(sb5_operand): Ditto.
	(sb8_operand): Ditto.
	(sd8_operand): Ditto.
	(ub4_operand): Ditto.
	(ub8_operand): Ditto.
	(uh4_operand): Ditto.
	(uw4_operand): Ditto.
	(uw5_operand): Ditto.
	(uw6_operand): Ditto.
	(uw8_operand): Ditto.
	(addiur2_operand): Ditto.
	(addiusp_operand): Ditto.
	(andi16_operand): Ditto.
	(movep_src_register): Ditto.
	(movep_src_operand): Ditto.
	(fcc_reload_operand): Ditto.
	(muldiv_target_operand): Ditto.
	(const_vector_same_val_operand): Ditto.
	(const_vector_same_simm5_operand): Ditto.
	(const_vector_same_uimm5_operand): Ditto.
	(const_vector_same_ximm5_operand): Ditto.
	(const_vector_same_uimm6_operand): Ditto.
	(par_const_vector_shf_set_operand): Ditto.
	(reg_or_vector_same_val_operand): Ditto.
	(reg_or_vector_same_simm5_operand): Ditto.
	(reg_or_vector_same_uimm5_operand): Ditto.
	(reg_or_vector_same_ximm5_operand): Ditto.
	(reg_or_vector_same_uimm6_operand): Ditto.
	* doc/md.texi: Ditto.
	* config/loongarch/lsx.md: New file.
---
 gcc/config/loongarch/constraints.md           |  131 +-
 gcc/config/loongarch/genopts/loongarch.opt.in |    4 +
 gcc/config/loongarch/loongarch-builtins.cc    |   10 +
 gcc/config/loongarch/loongarch-modes.def      |   38 +
 gcc/config/loongarch/loongarch-protos.h       |   31 +
 gcc/config/loongarch/loongarch.cc             | 2226 +++++++-
 gcc/config/loongarch/loongarch.h              |   65 +-
 gcc/config/loongarch/loongarch.md             |   44 +-
 gcc/config/loongarch/loongarch.opt            |    4 +
 gcc/config/loongarch/lsx.md                   | 4467 +++++++++++++++++
 gcc/config/loongarch/predicates.md            |  333 +-
 gcc/doc/md.texi                               |   11 +
 12 files changed, 7181 insertions(+), 183 deletions(-)
 create mode 100644 gcc/config/loongarch/lsx.md

diff --git a/gcc/config/loongarch/constraints.md b/gcc/config/loongarch/constraints.md
index 7a38cd07ae9..39505e45efe 100644
--- a/gcc/config/loongarch/constraints.md
+++ b/gcc/config/loongarch/constraints.md
@@ -76,12 +76,13 @@
 ;;     "Le"
 ;;	 "A signed 32-bit constant can be expressed as Lb + I, but not a
 ;;	  single Lb or I."
-;; "M" <-----unused
-;; "N" <-----unused
-;; "O" <-----unused
-;; "P" <-----unused
+;; "M" "A constant that cannot be loaded using @code{lui}, @code{addiu}
+;;	or @code{ori}."
+;; "N" "A constant in the range -65535 to -1 (inclusive)."
+;; "O" "A signed 15-bit constant."
+;; "P" "A constant in the range 1 to 65535 (inclusive)."
 ;; "Q" <-----unused
-;; "R" <-----unused
+;; "R" "An address that can be used in a non-macro load or store."
 ;; "S" <-----unused
 ;; "T" <-----unused
 ;; "U" <-----unused
@@ -214,6 +215,63 @@ (define_constraint "Le"
   (and (match_code "const_int")
        (match_test "loongarch_addu16i_imm12_operand_p (ival, SImode)")))
 
+(define_constraint "M"
+  "A constant that cannot be loaded using @code{lui}, @code{addiu}
+   or @code{ori}."
+  (and (match_code "const_int")
+       (not (match_test "IMM12_OPERAND (ival)"))
+       (not (match_test "IMM12_OPERAND_UNSIGNED (ival)"))
+       (not (match_test "LU12I_OPERAND (ival)"))))
+
+(define_constraint "N"
+  "A constant in the range -65535 to -1 (inclusive)."
+  (and (match_code "const_int")
+       (match_test "ival >= -0xffff && ival < 0")))
+
+(define_constraint "O"
+  "A signed 15-bit constant."
+  (and (match_code "const_int")
+       (match_test "ival >= -0x4000 && ival < 0x4000")))
+
+(define_constraint "P"
+  "A constant in the range 1 to 65535 (inclusive)."
+  (and (match_code "const_int")
+       (match_test "ival > 0 && ival < 0x10000")))
+
+;; General constraints
+
+(define_memory_constraint "R"
+  "An address that can be used in a non-macro load or store."
+  (and (match_code "mem")
+       (match_test "loongarch_address_insns (XEXP (op, 0), mode, false) == 1")))
+(define_constraint "S"
+  "@internal
+   A constant call address."
+  (and (match_operand 0 "call_insn_operand")
+       (match_test "CONSTANT_P (op)")))
+
+(define_constraint "YG"
+  "@internal
+   A vector zero."
+  (and (match_code "const_vector")
+       (match_test "op == CONST0_RTX (mode)")))
+
+(define_constraint "YA"
+  "@internal
+   An unsigned 6-bit constant."
+  (and (match_code "const_int")
+       (match_test "UIMM6_OPERAND (ival)")))
+
+(define_constraint "YB"
+  "@internal
+   A signed 10-bit constant."
+  (and (match_code "const_int")
+       (match_test "IMM10_OPERAND (ival)")))
+
+(define_constraint "Yb"
+   "@internal"
+   (match_operand 0 "qi_mask_operand"))
+
 (define_constraint "Yd"
   "@internal
    A constant @code{move_operand} that can be safely loaded using
@@ -221,10 +279,73 @@ (define_constraint "Yd"
   (and (match_operand 0 "move_operand")
        (match_test "CONSTANT_P (op)")))
 
+(define_constraint "Yh"
+   "@internal"
+    (match_operand 0 "hi_mask_operand"))
+
+(define_constraint "Yw"
+   "@internal"
+    (match_operand 0 "si_mask_operand"))
+
 (define_constraint "Yx"
    "@internal"
    (match_operand 0 "low_bitmask_operand"))
 
+(define_constraint "YI"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [-512,511]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, -512, 511)")))
+
+(define_constraint "YC"
+  "@internal
+   A replicated vector const in which the replicated value has a single
+   bit set."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_bitimm_set_p (op, mode)")))
+
+(define_constraint "YZ"
+  "@internal
+   A replicated vector const in which the replicated value has a single
+   bit clear."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_bitimm_clr_p (op, mode)")))
+
+(define_constraint "Unv5"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [-31,0]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, -31, 0)")))
+
+(define_constraint "Uuv5"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [0,31]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, 0, 31)")))
+
+(define_constraint "Usv5"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [-16,15]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, -16, 15)")))
+
+(define_constraint "Uuv6"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [0,63]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, 0, 63)")))
+
+(define_constraint "Urv8"
+  "@internal
+   A replicated vector const with replicated byte values as well as elements"
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_bytes_p (op, mode)")))
+
 (define_memory_constraint "ZC"
   "A memory operand whose address is formed by a base register and offset
    that is suitable for use in instructions with the same addressing mode
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
index f3567b27935..78a27dac806 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -146,6 +146,10 @@ mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST	Set the cost of branches to roughly COST instructions.
 
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) IntegerRange(1, 5)
+mmemvec-cost=COST      Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index b929f224dfa..ebe70a986c3 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "fold-const.h"
 #include "expr.h"
 #include "langhooks.h"
+#include "emit-rtl.h"
 
 /* Macros to create an enumeration identifier for a function prototype.  */
 #define LARCH_FTYPE_NAME1(A, B) LARCH_##A##_FTYPE_##B
@@ -297,6 +298,15 @@ loongarch_prepare_builtin_arg (struct expand_operand *op, tree exp,
   create_input_operand (op, value, TYPE_MODE (TREE_TYPE (arg)));
 }
 
+/* Return a const_int vector of VAL with mode MODE.  */
+
+rtx
+loongarch_gen_const_int_vector (machine_mode mode, HOST_WIDE_INT val)
+{
+  rtx c = gen_int_mode (val, GET_MODE_INNER (mode));
+  return gen_const_vec_duplicate (mode, c);
+}
+
 /* Expand instruction ICODE as part of a built-in function sequence.
    Use the first NOPS elements of OPS as the instruction's operands.
    HAS_TARGET_P is true if operand 0 is a target; it is false if the
diff --git a/gcc/config/loongarch/loongarch-modes.def b/gcc/config/loongarch/loongarch-modes.def
index 8082ce993a5..6f57b60525d 100644
--- a/gcc/config/loongarch/loongarch-modes.def
+++ b/gcc/config/loongarch/loongarch-modes.def
@@ -23,3 +23,41 @@ FLOAT_MODE (TF, 16, ieee_quad_format);
 
 /* For floating point conditions in FCC registers.  */
 CC_MODE (FCC);
+
+/* Vector modes.  */
+VECTOR_MODES (INT, 4);	      /* V4QI  V2HI      */
+VECTOR_MODES (INT, 8);	      /* V8QI  V4HI V2SI */
+VECTOR_MODES (FLOAT, 8);      /*       V4HF V2SF */
+
+/* For LARCH LSX 128 bits.  */
+VECTOR_MODES (INT, 16);	      /* V16QI V8HI V4SI V2DI */
+VECTOR_MODES (FLOAT, 16);     /*	    V4SF V2DF */
+
+VECTOR_MODES (INT, 32);	      /* V32QI V16HI V8SI V4DI */
+VECTOR_MODES (FLOAT, 32);     /*	     V8SF V4DF */
+
+/* Double-sized vector modes for vec_concat.  */
+/* VECTOR_MODE (INT, QI, 32);	  V32QI	*/
+/* VECTOR_MODE (INT, HI, 16);	  V16HI	*/
+/* VECTOR_MODE (INT, SI, 8);	  V8SI	*/
+/* VECTOR_MODE (INT, DI, 4);	  V4DI	*/
+/* VECTOR_MODE (FLOAT, SF, 8);	  V8SF	*/
+/* VECTOR_MODE (FLOAT, DF, 4);	  V4DF	*/
+
+VECTOR_MODE (INT, QI, 64);    /* V64QI	*/
+VECTOR_MODE (INT, HI, 32);    /* V32HI	*/
+VECTOR_MODE (INT, SI, 16);    /* V16SI	*/
+VECTOR_MODE (INT, DI, 8);     /* V8DI */
+VECTOR_MODE (FLOAT, SF, 16);  /* V16SF	*/
+VECTOR_MODE (FLOAT, DF, 8);   /* V8DF */
+
+VECTOR_MODES (FRACT, 4);	/* V4QQ  V2HQ */
+VECTOR_MODES (UFRACT, 4);	/* V4UQQ V2UHQ */
+VECTOR_MODES (ACCUM, 4);	/*       V2HA */
+VECTOR_MODES (UACCUM, 4);	/*       V2UHA */
+
+INT_MODE (OI, 32);
+
+/* Keep the OI modes from confusing the compiler into thinking
+   that these modes could actually be used for computation.  They are
+   only holders for vectors during data movement.  */
diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
index b71b188507a..fc33527cdcf 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -85,10 +85,18 @@ extern bool loongarch_split_move_p (rtx, rtx);
 extern void loongarch_split_move (rtx, rtx, rtx);
 extern bool loongarch_addu16i_imm12_operand_p (HOST_WIDE_INT, machine_mode);
 extern void loongarch_split_plus_constant (rtx *, machine_mode);
+extern bool loongarch_split_move_insn_p (rtx, rtx);
+extern void loongarch_split_move_insn (rtx, rtx, rtx);
+extern void loongarch_split_128bit_move (rtx, rtx);
+extern bool loongarch_split_128bit_move_p (rtx, rtx);
+extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx (*)(rtx, rtx, rtx));
+extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
+extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
 extern bool loongarch_cfun_has_cprestore_slot_p (void);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
+extern bool loongarch_expand_vec_cmp (rtx *);
 extern void loongarch_expand_conditional_branch (rtx *);
 extern void loongarch_expand_conditional_move (rtx *);
 extern void loongarch_expand_conditional_trap (rtx);
@@ -110,6 +118,15 @@ extern bool loongarch_small_data_pattern_p (rtx);
 extern rtx loongarch_rewrite_small_data (rtx);
 extern rtx loongarch_return_addr (int, rtx);
 
+extern bool loongarch_const_vector_same_val_p (rtx, machine_mode);
+extern bool loongarch_const_vector_same_bytes_p (rtx, machine_mode);
+extern bool loongarch_const_vector_same_int_p (rtx, machine_mode, HOST_WIDE_INT,
+					  HOST_WIDE_INT);
+extern bool loongarch_const_vector_shuffle_set_p (rtx, machine_mode);
+extern bool loongarch_const_vector_bitimm_set_p (rtx, machine_mode);
+extern bool loongarch_const_vector_bitimm_clr_p (rtx, machine_mode);
+extern rtx loongarch_lsx_vec_parallel_const_half (machine_mode, bool);
+extern rtx loongarch_gen_const_int_vector (machine_mode, HOST_WIDE_INT);
 extern enum reg_class loongarch_secondary_reload_class (enum reg_class,
 							machine_mode,
 							rtx, bool);
@@ -129,6 +146,7 @@ extern const char *loongarch_output_equal_conditional_branch (rtx_insn *,
 							      rtx *,
 							      bool);
 extern const char *loongarch_output_division (const char *, rtx *);
+extern const char *loongarch_lsx_output_division (const char *, rtx *);
 extern const char *loongarch_output_probe_stack_range (rtx, rtx, rtx);
 extern bool loongarch_hard_regno_rename_ok (unsigned int, unsigned int);
 extern int loongarch_dspalu_bypass_p (rtx, rtx);
@@ -156,6 +174,13 @@ union loongarch_gen_fn_ptrs
 extern void loongarch_expand_atomic_qihi (union loongarch_gen_fn_ptrs,
 					  rtx, rtx, rtx, rtx, rtx);
 
+extern void loongarch_expand_vector_init (rtx, rtx);
+extern void loongarch_expand_vec_unpack (rtx op[2], bool, bool);
+extern void loongarch_expand_vec_perm (rtx, rtx, rtx, rtx);
+extern void loongarch_expand_vector_extract (rtx, rtx, int);
+extern void loongarch_expand_vector_reduc (rtx (*)(rtx, rtx, rtx), rtx, rtx);
+
+extern int loongarch_ldst_scaled_shift (machine_mode);
 extern bool loongarch_signed_immediate_p (unsigned HOST_WIDE_INT, int, int);
 extern bool loongarch_unsigned_immediate_p (unsigned HOST_WIDE_INT, int, int);
 extern bool loongarch_12bit_offset_address_p (rtx, machine_mode);
@@ -171,6 +196,9 @@ extern bool loongarch_split_symbol_type (enum loongarch_symbol_type);
 typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
 
 extern void loongarch_register_frame_header_opt (void);
+extern void loongarch_expand_vec_cond_expr (machine_mode, machine_mode, rtx *);
+extern void loongarch_expand_vec_cond_mask_expr (machine_mode, machine_mode,
+						 rtx *);
 
 /* Routines implemented in loongarch-c.c.  */
 void loongarch_cpu_cpp_builtins (cpp_reader *);
@@ -180,6 +208,9 @@ extern void loongarch_atomic_assign_expand_fenv (tree *, tree *, tree *);
 extern tree loongarch_builtin_decl (unsigned int, bool);
 extern rtx loongarch_expand_builtin (tree, rtx, rtx subtarget ATTRIBUTE_UNUSED,
 				     machine_mode, int);
+extern tree loongarch_builtin_vectorized_function (unsigned int, tree, tree);
+extern rtx loongarch_gen_const_int_vector_shuffle (machine_mode, int);
 extern tree loongarch_build_builtin_va_list (void);
 
+extern rtx loongarch_build_signbit_mask (machine_mode, bool, bool);
 #endif /* ! GCC_LOONGARCH_PROTOS_H */
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 7f696e55f91..dd42816ac27 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -432,7 +432,7 @@ loongarch_flatten_aggregate_argument (const_tree type,
 
 static unsigned
 loongarch_pass_aggregate_num_fpr (const_tree type,
-					loongarch_aggregate_field fields[2])
+				  loongarch_aggregate_field fields[2])
 {
   int n = loongarch_flatten_aggregate_argument (type, fields);
 
@@ -773,7 +773,7 @@ loongarch_setup_incoming_varargs (cumulative_args_t cum,
     {
       rtx ptr = plus_constant (Pmode, virtual_incoming_args_rtx,
 			       REG_PARM_STACK_SPACE (cfun->decl)
-				 - gp_saved * UNITS_PER_WORD);
+			       - gp_saved * UNITS_PER_WORD);
       rtx mem = gen_frame_mem (BLKmode, ptr);
       set_mem_alias_set (mem, get_varargs_alias_set ());
 
@@ -1049,7 +1049,7 @@ rtx
 loongarch_emit_move (rtx dest, rtx src)
 {
   return (can_create_pseudo_p () ? emit_move_insn (dest, src)
-				 : emit_move_insn_1 (dest, src));
+	  : emit_move_insn_1 (dest, src));
 }
 
 /* Save register REG to MEM.  Make the instruction frame-related.  */
@@ -1675,6 +1675,140 @@ loongarch_symbol_binds_local_p (const_rtx x)
     return false;
 }
 
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same bit set.  */
+
+bool
+loongarch_const_vector_bitimm_set_p (rtx op, machine_mode mode)
+{
+  if (GET_CODE (op) == CONST_VECTOR && op != CONST0_RTX (mode))
+    {
+      unsigned HOST_WIDE_INT val = UINTVAL (CONST_VECTOR_ELT (op, 0));
+      int vlog2 = exact_log2 (val & GET_MODE_MASK (GET_MODE_INNER (mode)));
+
+      if (vlog2 != -1)
+	{
+	  gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT);
+	  gcc_assert (vlog2 >= 0 && vlog2 <= GET_MODE_UNIT_BITSIZE (mode) - 1);
+	  return loongarch_const_vector_same_val_p (op, mode);
+	}
+    }
+
+  return false;
+}
+
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same bit clear.  */
+
+bool
+loongarch_const_vector_bitimm_clr_p (rtx op, machine_mode mode)
+{
+  if (GET_CODE (op) == CONST_VECTOR && op != CONSTM1_RTX (mode))
+    {
+      unsigned HOST_WIDE_INT val = ~UINTVAL (CONST_VECTOR_ELT (op, 0));
+      int vlog2 = exact_log2 (val & GET_MODE_MASK (GET_MODE_INNER (mode)));
+
+      if (vlog2 != -1)
+	{
+	  gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT);
+	  gcc_assert (vlog2 >= 0 && vlog2 <= GET_MODE_UNIT_BITSIZE (mode) - 1);
+	  return loongarch_const_vector_same_val_p (op, mode);
+	}
+    }
+
+  return false;
+}
+
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same value.  */
+
+bool
+loongarch_const_vector_same_val_p (rtx op, machine_mode mode)
+{
+  int i, nunits = GET_MODE_NUNITS (mode);
+  rtx first;
+
+  if (GET_CODE (op) != CONST_VECTOR || GET_MODE (op) != mode)
+    return false;
+
+  first = CONST_VECTOR_ELT (op, 0);
+  for (i = 1; i < nunits; i++)
+    if (!rtx_equal_p (first, CONST_VECTOR_ELT (op, i)))
+      return false;
+
+  return true;
+}
+
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same value as well as replicated bytes in the value.
+*/
+
+bool
+loongarch_const_vector_same_bytes_p (rtx op, machine_mode mode)
+{
+  int i, bytes;
+  HOST_WIDE_INT val, first_byte;
+  rtx first;
+
+  if (!loongarch_const_vector_same_val_p (op, mode))
+    return false;
+
+  first = CONST_VECTOR_ELT (op, 0);
+  bytes = GET_MODE_UNIT_SIZE (mode);
+  val = INTVAL (first);
+  first_byte = val & 0xff;
+  for (i = 1; i < bytes; i++)
+    {
+      val >>= 8;
+      if ((val & 0xff) != first_byte)
+	return false;
+    }
+
+  return true;
+}
+
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same integer value in the range [LOW, HIGH].  */
+
+bool
+loongarch_const_vector_same_int_p (rtx op, machine_mode mode, HOST_WIDE_INT low,
+				   HOST_WIDE_INT high)
+{
+  HOST_WIDE_INT value;
+  rtx elem0;
+
+  if (!loongarch_const_vector_same_val_p (op, mode))
+    return false;
+
+  elem0 = CONST_VECTOR_ELT (op, 0);
+  if (!CONST_INT_P (elem0))
+    return false;
+
+  value = INTVAL (elem0);
+  return (value >= low && value <= high);
+}
+
+/* Return true if OP is a constant vector with repeated 4-element sets
+   in mode MODE.  */
+
+bool
+loongarch_const_vector_shuffle_set_p (rtx op, machine_mode mode)
+{
+  int nunits = GET_MODE_NUNITS (mode);
+  int nsets = nunits / 4;
+  int set = 0;
+  int i, j;
+
+  /* Check if we have the same 4-element sets.  */
+  for (j = 0; j < nsets; j++, set = 4 * j)
+    for (i = 0; i < 4; i++)
+      if ((INTVAL (XVECEXP (op, 0, i))
+	   != (INTVAL (XVECEXP (op, 0, set + i)) - set))
+	  || !IN_RANGE (INTVAL (XVECEXP (op, 0, set + i)), 0, set + 3))
+	return false;
+  return true;
+}
+
 /* Return true if rtx constants of mode MODE should be put into a small
    data section.  */
 
@@ -1792,6 +1926,11 @@ loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_type *symbol_type)
 static int
 loongarch_symbol_insns (enum loongarch_symbol_type type, machine_mode mode)
 {
+  /* LSX LD.* and ST.* cannot support loading symbols via an immediate
+     operand.  */
+  if (LSX_SUPPORTED_MODE_P (mode))
+    return 0;
+
   switch (type)
     {
     case SYMBOL_GOT_DISP:
@@ -1838,7 +1977,8 @@ loongarch_cannot_force_const_mem (machine_mode mode, rtx x)
      references, reload will consider forcing C into memory and using
      one of the instruction's memory alternatives.  Returning false
      here will force it to use an input reload instead.  */
-  if (CONST_INT_P (x) && loongarch_legitimate_constant_p (mode, x))
+  if ((CONST_INT_P (x) || GET_CODE (x) == CONST_VECTOR)
+      && loongarch_legitimate_constant_p (mode, x))
     return true;
 
   split_const (x, &base, &offset);
@@ -1915,6 +2055,12 @@ loongarch_valid_offset_p (rtx x, machine_mode mode)
       && !IMM12_OPERAND (INTVAL (x) + GET_MODE_SIZE (mode) - UNITS_PER_WORD))
     return false;
 
+  /* LSX LD.* and ST.* supports 10-bit signed offsets.  */
+  if (LSX_SUPPORTED_MODE_P (mode)
+      && !loongarch_signed_immediate_p (INTVAL (x), 10,
+					loongarch_ldst_scaled_shift (mode)))
+    return false;
+
   return true;
 }
 
@@ -1999,7 +2145,7 @@ loongarch_valid_lo_sum_p (enum loongarch_symbol_type symbol_type,
 
 static bool
 loongarch_valid_index_p (struct loongarch_address_info *info, rtx x,
-			  machine_mode mode, bool strict_p)
+			 machine_mode mode, bool strict_p)
 {
   rtx index;
 
@@ -2052,7 +2198,7 @@ loongarch_classify_address (struct loongarch_address_info *info, rtx x,
 	}
 
       if (loongarch_valid_base_register_p (XEXP (x, 1), mode, strict_p)
-	 && loongarch_valid_index_p (info, XEXP (x, 0), mode, strict_p))
+	  && loongarch_valid_index_p (info, XEXP (x, 0), mode, strict_p))
 	{
 	  info->reg = XEXP (x, 1);
 	  return true;
@@ -2128,6 +2274,7 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p)
 {
   struct loongarch_address_info addr;
   int factor;
+  bool lsx_p = !might_split_p && LSX_SUPPORTED_MODE_P (mode);
 
   if (!loongarch_classify_address (&addr, x, mode, false))
     return 0;
@@ -2145,15 +2292,29 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p)
     switch (addr.type)
       {
       case ADDRESS_REG:
+	if (lsx_p)
+	  {
+	    /* LSX LD.* and ST.* supports 10-bit signed offsets.  */
+	    if (loongarch_signed_immediate_p (INTVAL (addr.offset), 10,
+					      loongarch_ldst_scaled_shift (mode)))
+	      return 1;
+	    else
+	      return 0;
+	  }
+	return factor;
+
       case ADDRESS_REG_REG:
-      case ADDRESS_CONST_INT:
 	return factor;
 
+      case ADDRESS_CONST_INT:
+	return lsx_p ? 0 : factor;
+
       case ADDRESS_LO_SUM:
 	return factor + 1;
 
       case ADDRESS_SYMBOLIC:
-	return factor * loongarch_symbol_insns (addr.symbol_type, mode);
+	return lsx_p ? 0
+	  : factor * loongarch_symbol_insns (addr.symbol_type, mode);
       }
   return 0;
 }
@@ -2179,6 +2340,19 @@ loongarch_signed_immediate_p (unsigned HOST_WIDE_INT x, int bits,
   return loongarch_unsigned_immediate_p (x, bits, shift);
 }
 
+/* Return the scale shift that applied to LSX LD/ST address offset.  */
+
+int
+loongarch_ldst_scaled_shift (machine_mode mode)
+{
+  int shift = exact_log2 (GET_MODE_UNIT_SIZE (mode));
+
+  if (shift < 0 || shift > 8)
+    gcc_unreachable ();
+
+  return shift;
+}
+
 /* Return true if X is a legitimate address with a 12-bit offset
    or addr.type is ADDRESS_LO_SUM.
    MODE is the mode of the value being accessed.  */
@@ -2246,6 +2420,9 @@ loongarch_const_insns (rtx x)
       return loongarch_integer_cost (INTVAL (x));
 
     case CONST_VECTOR:
+      if (LSX_SUPPORTED_MODE_P (GET_MODE (x))
+	  && loongarch_const_vector_same_int_p (x, GET_MODE (x), -512, 511))
+	return 1;
       /* Fall through.  */
     case CONST_DOUBLE:
       return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
@@ -2280,7 +2457,7 @@ loongarch_const_insns (rtx x)
     case SYMBOL_REF:
     case LABEL_REF:
       return loongarch_symbol_insns (
-	loongarch_classify_symbol (x), MAX_MACHINE_MODE);
+		loongarch_classify_symbol (x), MAX_MACHINE_MODE);
 
     default:
       return 0;
@@ -2302,7 +2479,26 @@ loongarch_split_const_insns (rtx x)
   return low + high;
 }
 
-static bool loongarch_split_move_insn_p (rtx dest, rtx src);
+bool loongarch_split_move_insn_p (rtx dest, rtx src);
+/* Return one word of 128-bit value OP, taking into account the fixed
+   endianness of certain registers.  BYTE selects from the byte address.  */
+
+rtx
+loongarch_subword_at_byte (rtx op, unsigned int byte)
+{
+  machine_mode mode;
+
+  mode = GET_MODE (op);
+  if (mode == VOIDmode)
+    mode = TImode;
+
+  gcc_assert (!FP_REG_RTX_P (op));
+
+  if (MEM_P (op))
+    return loongarch_rewrite_small_data (adjust_address (op, word_mode, byte));
+
+  return simplify_gen_subreg (word_mode, op, mode, byte);
+}
 
 /* Return the number of instructions needed to implement INSN,
    given that it loads from or stores to MEM.  */
@@ -3063,9 +3259,10 @@ loongarch_legitimize_move (machine_mode mode, rtx dest, rtx src)
 
   /* Both src and dest are non-registers;  one special case is supported where
      the source is (const_int 0) and the store can source the zero register.
-     */
+     LSX is never able to source the zero register directly in
+     memory operations.  */
   if (!register_operand (dest, mode) && !register_operand (src, mode)
-      && !const_0_operand (src, mode))
+      && (!const_0_operand (src, mode) || LSX_SUPPORTED_MODE_P (mode)))
     {
       loongarch_emit_move (dest, force_reg (mode, src));
       return true;
@@ -3637,6 +3834,54 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code,
     }
 }
 
+/* Vectorizer cost model implementation.  */
+
+/* Implement targetm.vectorize.builtin_vectorization_cost.  */
+
+static int
+loongarch_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
+				      tree vectype,
+				      int misalign ATTRIBUTE_UNUSED)
+{
+  unsigned elements;
+
+  switch (type_of_cost)
+    {
+      case scalar_stmt:
+      case scalar_load:
+      case vector_stmt:
+      case vector_load:
+      case vec_to_scalar:
+      case scalar_to_vec:
+      case cond_branch_not_taken:
+      case vec_promote_demote:
+      case scalar_store:
+      case vector_store:
+	return 1;
+
+      case vec_perm:
+	return 1;
+
+      case unaligned_load:
+      case vector_gather_load:
+	return 2;
+
+      case unaligned_store:
+      case vector_scatter_store:
+	return 10;
+
+      case cond_branch_taken:
+	return 3;
+
+      case vec_construct:
+	elements = TYPE_VECTOR_SUBPARTS (vectype);
+	return elements / 2 + 1;
+
+      default:
+	gcc_unreachable ();
+    }
+}
+
 /* Implement TARGET_ADDRESS_COST.  */
 
 static int
@@ -3691,6 +3936,11 @@ loongarch_split_move_p (rtx dest, rtx src)
       if (FP_REG_RTX_P (src) && MEM_P (dest))
 	return false;
     }
+
+  /* Check if LSX moves need splitting.  */
+  if (LSX_SUPPORTED_MODE_P (GET_MODE (dest)))
+    return loongarch_split_128bit_move_p (dest, src);
+
   /* Otherwise split all multiword moves.  */
   return size > UNITS_PER_WORD;
 }
@@ -3704,7 +3954,9 @@ loongarch_split_move (rtx dest, rtx src, rtx insn_)
   rtx low_dest;
 
   gcc_checking_assert (loongarch_split_move_p (dest, src));
-  if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src))
+  if (LSX_SUPPORTED_MODE_P (GET_MODE (dest)))
+    loongarch_split_128bit_move (dest, src);
+  else if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src))
     {
       if (!TARGET_64BIT && GET_MODE (dest) == DImode)
 	emit_insn (gen_move_doubleword_fprdi (dest, src));
@@ -3808,12 +4060,21 @@ loongarch_split_plus_constant (rtx *op, machine_mode mode)
 
 /* Return true if a move from SRC to DEST in INSN should be split.  */
 
-static bool
+bool
 loongarch_split_move_insn_p (rtx dest, rtx src)
 {
   return loongarch_split_move_p (dest, src);
 }
 
+/* Split a move from SRC to DEST in INSN, given that
+   loongarch_split_move_insn_p holds.  */
+
+void
+loongarch_split_move_insn (rtx dest, rtx src, rtx insn)
+{
+  loongarch_split_move (dest, src, insn);
+}
+
 /* Implement TARGET_CONSTANT_ALIGNMENT.  */
 
 static HOST_WIDE_INT
@@ -3860,7 +4121,7 @@ const char *
 loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
 {
   int index = exact_log2 (GET_MODE_SIZE (mode));
-  if (!IN_RANGE (index, 2, 3))
+  if (!IN_RANGE (index, 2, 4))
     return NULL;
 
   struct loongarch_address_info info;
@@ -3869,20 +4130,216 @@ loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
       || !loongarch_legitimate_address_p (mode, x, false))
     return NULL;
 
-  const char *const insn[][2] =
+  const char *const insn[][3] =
     {
 	{
 	  "fstx.s\t%1,%0",
-	  "fstx.d\t%1,%0"
+	  "fstx.d\t%1,%0",
+	  "vstx\t%w1,%0"
 	},
 	{
 	  "fldx.s\t%0,%1",
-	  "fldx.d\t%0,%1"
-	},
+	  "fldx.d\t%0,%1",
+	  "vldx\t%w0,%1"
+	}
     };
 
   return insn[ldr][index-2];
 }
+/* Return true if a 128-bit move from SRC to DEST should be split.  */
+
+bool
+loongarch_split_128bit_move_p (rtx dest, rtx src)
+{
+  /* LSX-to-LSX moves can be done in a single instruction.  */
+  if (FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
+    return false;
+
+  /* Check for LSX loads and stores.  */
+  if (FP_REG_RTX_P (dest) && MEM_P (src))
+    return false;
+  if (FP_REG_RTX_P (src) && MEM_P (dest))
+    return false;
+
+  /* Check for LSX set to an immediate const vector with valid replicated
+     element.  */
+  if (FP_REG_RTX_P (dest)
+      && loongarch_const_vector_same_int_p (src, GET_MODE (src), -512, 511))
+    return false;
+
+  /* Check for LSX load zero immediate.  */
+  if (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))
+    return false;
+
+  return true;
+}
+
+/* Split a 128-bit move from SRC to DEST.  */
+
+void
+loongarch_split_128bit_move (rtx dest, rtx src)
+{
+  int byte, index;
+  rtx low_dest, low_src, d, s;
+
+  if (FP_REG_RTX_P (dest))
+    {
+      gcc_assert (!MEM_P (src));
+
+      rtx new_dest = dest;
+      if (!TARGET_64BIT)
+	{
+	  if (GET_MODE (dest) != V4SImode)
+	    new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
+	}
+      else
+	{
+	  if (GET_MODE (dest) != V2DImode)
+	    new_dest = simplify_gen_subreg (V2DImode, dest, GET_MODE (dest), 0);
+	}
+
+      for (byte = 0, index = 0; byte < GET_MODE_SIZE (TImode);
+	   byte += UNITS_PER_WORD, index++)
+	{
+	  s = loongarch_subword_at_byte (src, byte);
+	  if (!TARGET_64BIT)
+	    emit_insn (gen_lsx_vinsgr2vr_w (new_dest, s, new_dest,
+					    GEN_INT (1 << index)));
+	  else
+	    emit_insn (gen_lsx_vinsgr2vr_d (new_dest, s, new_dest,
+					    GEN_INT (1 << index)));
+	}
+    }
+  else if (FP_REG_RTX_P (src))
+    {
+      gcc_assert (!MEM_P (dest));
+
+      rtx new_src = src;
+      if (!TARGET_64BIT)
+	{
+	  if (GET_MODE (src) != V4SImode)
+	    new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0);
+	}
+      else
+	{
+	  if (GET_MODE (src) != V2DImode)
+	    new_src = simplify_gen_subreg (V2DImode, src, GET_MODE (src), 0);
+	}
+
+      for (byte = 0, index = 0; byte < GET_MODE_SIZE (TImode);
+	   byte += UNITS_PER_WORD, index++)
+	{
+	  d = loongarch_subword_at_byte (dest, byte);
+	  if (!TARGET_64BIT)
+	    emit_insn (gen_lsx_vpickve2gr_w (d, new_src, GEN_INT (index)));
+	  else
+	    emit_insn (gen_lsx_vpickve2gr_d (d, new_src, GEN_INT (index)));
+	}
+    }
+  else
+    {
+      low_dest = loongarch_subword_at_byte (dest, 0);
+      low_src = loongarch_subword_at_byte (src, 0);
+      gcc_assert (REG_P (low_dest) && REG_P (low_src));
+      /* Make sure the source register is not written before reading.  */
+      if (REGNO (low_dest) <= REGNO (low_src))
+	{
+	  for (byte = 0; byte < GET_MODE_SIZE (TImode);
+	       byte += UNITS_PER_WORD)
+	    {
+	      d = loongarch_subword_at_byte (dest, byte);
+	      s = loongarch_subword_at_byte (src, byte);
+	      loongarch_emit_move (d, s);
+	    }
+	}
+      else
+	{
+	  for (byte = GET_MODE_SIZE (TImode) - UNITS_PER_WORD; byte >= 0;
+	       byte -= UNITS_PER_WORD)
+	    {
+	      d = loongarch_subword_at_byte (dest, byte);
+	      s = loongarch_subword_at_byte (src, byte);
+	      loongarch_emit_move (d, s);
+	    }
+	}
+    }
+}
+
+
+/* Split a COPY_S.D with operands DEST, SRC and INDEX.  GEN is a function
+   used to generate subregs.  */
+
+void
+loongarch_split_lsx_copy_d (rtx dest, rtx src, rtx index,
+			    rtx (*gen_fn)(rtx, rtx, rtx))
+{
+  gcc_assert ((GET_MODE (src) == V2DImode && GET_MODE (dest) == DImode)
+	      || (GET_MODE (src) == V2DFmode && GET_MODE (dest) == DFmode));
+
+  /* Note that low is always from the lower index, and high is always
+     from the higher index.  */
+  rtx low = loongarch_subword (dest, false);
+  rtx high = loongarch_subword (dest, true);
+  rtx new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0);
+
+  emit_insn (gen_fn (low, new_src, GEN_INT (INTVAL (index) * 2)));
+  emit_insn (gen_fn (high, new_src, GEN_INT (INTVAL (index) * 2 + 1)));
+}
+
+/* Split a INSERT.D with operand DEST, SRC1.INDEX and SRC2.  */
+
+void
+loongarch_split_lsx_insert_d (rtx dest, rtx src1, rtx index, rtx src2)
+{
+  int i;
+  gcc_assert (GET_MODE (dest) == GET_MODE (src1));
+  gcc_assert ((GET_MODE (dest) == V2DImode
+	       && (GET_MODE (src2) == DImode || src2 == const0_rtx))
+	      || (GET_MODE (dest) == V2DFmode && GET_MODE (src2) == DFmode));
+
+  /* Note that low is always from the lower index, and high is always
+     from the higher index.  */
+  rtx low = loongarch_subword (src2, false);
+  rtx high = loongarch_subword (src2, true);
+  rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
+  rtx new_src1 = simplify_gen_subreg (V4SImode, src1, GET_MODE (src1), 0);
+  i = exact_log2 (INTVAL (index));
+  gcc_assert (i != -1);
+
+  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, low, new_src1,
+				  GEN_INT (1 << (i * 2))));
+  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest,
+				  GEN_INT (1 << (i * 2 + 1))));
+}
+
+/* Split FILL.D.  */
+
+void
+loongarch_split_lsx_fill_d (rtx dest, rtx src)
+{
+  gcc_assert ((GET_MODE (dest) == V2DImode
+	       && (GET_MODE (src) == DImode || src == const0_rtx))
+	      || (GET_MODE (dest) == V2DFmode && GET_MODE (src) == DFmode));
+
+  /* Note that low is always from the lower index, and high is always
+     from the higher index.  */
+  rtx low, high;
+  if (src == const0_rtx)
+    {
+      low = src;
+      high = src;
+    }
+  else
+    {
+      low = loongarch_subword (src, false);
+      high = loongarch_subword (src, true);
+    }
+  rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
+  emit_insn (gen_lsx_vreplgr2vr_w (new_dest, low));
+  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest, GEN_INT (1 << 1)));
+  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest, GEN_INT (1 << 3)));
+}
+
 
 /* Return the appropriate instructions to move SRC into DEST.  Assume
    that SRC is operand 1 and DEST is operand 0.  */
@@ -3894,10 +4351,25 @@ loongarch_output_move (rtx dest, rtx src)
   enum rtx_code src_code = GET_CODE (src);
   machine_mode mode = GET_MODE (dest);
   bool dbl_p = (GET_MODE_SIZE (mode) == 8);
+  bool lsx_p = LSX_SUPPORTED_MODE_P (mode);
 
   if (loongarch_split_move_p (dest, src))
     return "#";
 
+  if ((lsx_p)
+      && dest_code == REG && FP_REG_P (REGNO (dest))
+      && src_code == CONST_VECTOR
+      && CONST_INT_P (CONST_VECTOR_ELT (src, 0)))
+    {
+      gcc_assert (loongarch_const_vector_same_int_p (src, mode, -512, 511));
+      switch (GET_MODE_SIZE (mode))
+	{
+	case 16:
+	  return "vrepli.%v0\t%w0,%E1";
+	default: gcc_unreachable ();
+	}
+    }
+
   if ((src_code == REG && GP_REG_P (REGNO (src)))
       || (src == CONST0_RTX (mode)))
     {
@@ -3907,7 +4379,21 @@ loongarch_output_move (rtx dest, rtx src)
 	    return "or\t%0,%z1,$r0";
 
 	  if (FP_REG_P (REGNO (dest)))
-	    return dbl_p ? "movgr2fr.d\t%0,%z1" : "movgr2fr.w\t%0,%z1";
+	    {
+	      if (lsx_p)
+		{
+		  gcc_assert (src == CONST0_RTX (GET_MODE (src)));
+		  switch (GET_MODE_SIZE (mode))
+		    {
+		    case 16:
+		      return "vrepli.b\t%w0,0";
+		    default:
+		      gcc_unreachable ();
+		    }
+		}
+
+	      return dbl_p ? "movgr2fr.d\t%0,%z1" : "movgr2fr.w\t%0,%z1";
+	    }
 	}
       if (dest_code == MEM)
 	{
@@ -3949,7 +4435,10 @@ loongarch_output_move (rtx dest, rtx src)
     {
       if (src_code == REG)
 	if (FP_REG_P (REGNO (src)))
-	  return dbl_p ? "movfr2gr.d\t%0,%1" : "movfr2gr.s\t%0,%1";
+	  {
+	    gcc_assert (!lsx_p);
+	    return dbl_p ? "movfr2gr.d\t%0,%1" : "movfr2gr.s\t%0,%1";
+	  }
 
       if (src_code == MEM)
 	{
@@ -3994,7 +4483,7 @@ loongarch_output_move (rtx dest, rtx src)
 	  enum loongarch_symbol_type type = SYMBOL_PCREL;
 
 	  if (UNSPEC_ADDRESS_P (x))
-	     type = UNSPEC_ADDRESS_TYPE (x);
+	    type = UNSPEC_ADDRESS_TYPE (x);
 
 	  if (type == SYMBOL_TLS_LE)
 	    return "lu12i.w\t%0,%h1";
@@ -4029,7 +4518,20 @@ loongarch_output_move (rtx dest, rtx src)
   if (src_code == REG && FP_REG_P (REGNO (src)))
     {
       if (dest_code == REG && FP_REG_P (REGNO (dest)))
-	return dbl_p ? "fmov.d\t%0,%1" : "fmov.s\t%0,%1";
+	{
+	  if (lsx_p)
+	    {
+	      switch (GET_MODE_SIZE (mode))
+		{
+		case 16:
+		  return "vori.b\t%w0,%w1,0";
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+
+	  return dbl_p ? "fmov.d\t%0,%1" : "fmov.s\t%0,%1";
+	}
 
       if (dest_code == MEM)
 	{
@@ -4040,6 +4542,17 @@ loongarch_output_move (rtx dest, rtx src)
 	  if (insn)
 	    return insn;
 
+	  if (lsx_p)
+	    {
+	      switch (GET_MODE_SIZE (mode))
+		{
+		case 16:
+		  return "vst\t%w1,%0";
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+
 	  return dbl_p ? "fst.d\t%1,%0" : "fst.s\t%1,%0";
 	}
     }
@@ -4055,6 +4568,16 @@ loongarch_output_move (rtx dest, rtx src)
 	  if (insn)
 	    return insn;
 
+	  if (lsx_p)
+	    {
+	      switch (GET_MODE_SIZE (mode))
+		{
+		case 16:
+		  return "vld\t%w0,%1";
+		default:
+		  gcc_unreachable ();
+		}
+	    }
 	  return dbl_p ? "fld.d\t%0,%1" : "fld.s\t%0,%1";
 	}
     }
@@ -4244,6 +4767,7 @@ loongarch_extend_comparands (rtx_code code, rtx *op0, rtx *op1)
     }
 }
 
+
 /* Convert a comparison into something that can be used in a branch.  On
    entry, *OP0 and *OP1 are the values being compared and *CODE is the code
    used to compare them.  Update them to describe the final comparison.  */
@@ -5048,9 +5572,12 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part,
 
    'A'	Print a _DB suffix if the memory model requires a release.
    'b'	Print the address of a memory operand, without offset.
+   'B'	Print CONST_INT OP element 0 of a replicated CONST_VECTOR
+	  as an unsigned byte [0..255].
    'c'  Print an integer.
    'C'	Print the integer branch condition for comparison OP.
    'd'	Print CONST_INT OP in decimal.
+   'E'	Print CONST_INT OP element 0 of a replicated CONST_VECTOR in decimal.
    'F'	Print the FPU branch condition for comparison OP.
    'G'	Print a DBAR insn if the memory model requires a release.
    'H'  Print address 52-61bit relocation associated with OP.
@@ -5066,13 +5593,16 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part,
    't'	Like 'T', but with the EQ/NE cases reversed
    'V'	Print exact log2 of CONST_INT OP element 0 of a replicated
 	  CONST_VECTOR in decimal.
+   'v'	Print the insn size suffix b, h, w or d for vector modes V16QI, V8HI,
+	  V4SI, V2SI, and w, d for vector modes V4SF, V2DF respectively.
    'W'	Print the inverse of the FPU branch condition for comparison OP.
+   'w'	Print a LSX register.
    'X'	Print CONST_INT OP in hexadecimal format.
    'x'	Print the low 16 bits of CONST_INT OP in hexadecimal format.
    'Y'	Print loongarch_fp_conditions[INTVAL (OP)]
    'y'	Print exact log2 of CONST_INT OP in decimal.
    'Z'	Print OP and a comma for 8CC, otherwise print nothing.
-   'z'	Print $0 if OP is zero, otherwise print OP normally.  */
+   'z'	Print $r0 if OP is zero, otherwise print OP normally.  */
 
 static void
 loongarch_print_operand (FILE *file, rtx op, int letter)
@@ -5094,6 +5624,18 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
       if (loongarch_memmodel_needs_rel_acq_fence ((enum memmodel) INTVAL (op)))
        fputs ("_db", file);
       break;
+    case 'E':
+      if (GET_CODE (op) == CONST_VECTOR)
+	{
+	  gcc_assert (loongarch_const_vector_same_val_p (op, GET_MODE (op)));
+	  op = CONST_VECTOR_ELT (op, 0);
+	  gcc_assert (CONST_INT_P (op));
+	  fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
+	}
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
 
     case 'c':
       if (CONST_INT_P (op))
@@ -5144,6 +5686,18 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
       loongarch_print_operand_reloc (file, op, false /* hi64_part*/,
 				     false /* lo_reloc */);
       break;
+    case 'B':
+      if (GET_CODE (op) == CONST_VECTOR)
+	{
+	  gcc_assert (loongarch_const_vector_same_val_p (op, GET_MODE (op)));
+	  op = CONST_VECTOR_ELT (op, 0);
+	  gcc_assert (CONST_INT_P (op));
+	  unsigned HOST_WIDE_INT val8 = UINTVAL (op) & GET_MODE_MASK (QImode);
+	  fprintf (file, HOST_WIDE_INT_PRINT_UNSIGNED, val8);
+	}
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
 
     case 'm':
       if (CONST_INT_P (op))
@@ -5190,10 +5744,45 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
 	output_operand_lossage ("invalid use of '%%%c'", letter);
       break;
 
-    case 'W':
-      loongarch_print_float_branch_condition (file, reverse_condition (code),
-					      letter);
-      break;
+    case 'v':
+      switch (GET_MODE (op))
+	{
+	case E_V16QImode:
+	case E_V32QImode:
+	  fprintf (file, "b");
+	  break;
+	case E_V8HImode:
+	case E_V16HImode:
+	  fprintf (file, "h");
+	  break;
+	case E_V4SImode:
+	case E_V4SFmode:
+	case E_V8SImode:
+	case E_V8SFmode:
+	  fprintf (file, "w");
+	  break;
+	case E_V2DImode:
+	case E_V2DFmode:
+	case E_V4DImode:
+	case E_V4DFmode:
+	  fprintf (file, "d");
+	  break;
+	default:
+	  output_operand_lossage ("invalid use of '%%%c'", letter);
+	}
+      break;
+
+    case 'W':
+      loongarch_print_float_branch_condition (file, reverse_condition (code),
+					      letter);
+      break;
+
+    case 'w':
+      if (code == REG && LSX_REG_P (REGNO (op)))
+	fprintf (file, "$vr%s", &reg_names[REGNO (op)][2]);
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
 
     case 'x':
       if (CONST_INT_P (op))
@@ -5566,9 +6155,13 @@ loongarch_hard_regno_mode_ok_uncached (unsigned int regno, machine_mode mode)
   size = GET_MODE_SIZE (mode);
   mclass = GET_MODE_CLASS (mode);
 
-  if (GP_REG_P (regno))
+  if (GP_REG_P (regno) && !LSX_SUPPORTED_MODE_P (mode))
     return ((regno - GP_REG_FIRST) & 1) == 0 || size <= UNITS_PER_WORD;
 
+  /* For LSX, allow TImode and 128-bit vector modes in all FPR.  */
+  if (FP_REG_P (regno) && LSX_SUPPORTED_MODE_P (mode))
+    return true;
+
   if (FP_REG_P (regno))
     {
       if (mclass == MODE_FLOAT
@@ -5595,6 +6188,17 @@ loongarch_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
   return loongarch_hard_regno_mode_ok_p[mode][regno];
 }
 
+
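+/* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  With LSX, a value wider
+   than 64 bits held in an FP register is treated as partially clobbered by
+   calls, since only the low 64 bits of callee-saved FP registers are
+   preserved.  */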
+static bool
+loongarch_hard_regno_call_part_clobbered (unsigned int,
+					  unsigned int regno, machine_mode mode)
+{
+  if (ISA_HAS_LSX && FP_REG_P (regno) && GET_MODE_SIZE (mode) > 8)
+    return true;
+
+  return false;
+}
+
 /* Implement TARGET_HARD_REGNO_NREGS.  */
 
 static unsigned int
@@ -5606,7 +6210,12 @@ loongarch_hard_regno_nregs (unsigned int regno, machine_mode mode)
     return (GET_MODE_SIZE (mode) + 3) / 4;
 
   if (FP_REG_P (regno))
-    return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
+    {
+      if (LSX_SUPPORTED_MODE_P (mode))
+	return 1;
+
+      return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
+    }
 
   /* All other registers are word-sized.  */
   return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
@@ -5633,8 +6242,12 @@ loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode)
   if (hard_reg_set_intersect_p (left, reg_class_contents[(int) FP_REGS]))
     {
       if (loongarch_hard_regno_mode_ok (FP_REG_FIRST, mode))
-	size = MIN (size, UNITS_PER_FPREG);
-
+	{
+	  if (LSX_SUPPORTED_MODE_P (mode))
+	    size = MIN (size, UNITS_PER_LSX_REG);
+	  else
+	    size = MIN (size, UNITS_PER_FPREG);
+	}
       left &= ~reg_class_contents[FP_REGS];
     }
   if (!hard_reg_set_empty_p (left))
@@ -5645,9 +6258,13 @@ loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode)
 /* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
 
 static bool
-loongarch_can_change_mode_class (machine_mode, machine_mode,
+loongarch_can_change_mode_class (machine_mode from, machine_mode to,
 				 reg_class_t rclass)
 {
+  /* Allow conversions between different LSX vector modes.  */
+  if (LSX_SUPPORTED_MODE_P (from) && LSX_SUPPORTED_MODE_P (to))
+    return true;
+
   return !reg_classes_intersect_p (FP_REGS, rclass);
 }
 
@@ -5667,7 +6284,7 @@ loongarch_mode_ok_for_mov_fmt_p (machine_mode mode)
       return TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT;
 
     default:
-      return 0;
+      return LSX_SUPPORTED_MODE_P (mode);
     }
 }
 
@@ -5824,7 +6441,12 @@ loongarch_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
       if (regno < 0
 	  || (MEM_P (x)
 	      && (GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)))
-	/* In this case we can use fld.s, fst.s, fld.d or fst.d.  */
+	/* In this case we can use fld.s, fst.s, fld.d or fst.d.  We'll use
+	   pairs of fld.s and fst.s if fld.d and fst.d are not supported.  */
+	return NO_REGS;
+
+      if (MEM_P (x) && LSX_SUPPORTED_MODE_P (mode))
+	/* In this case we can use the LSX vld/vst instructions.  */
 	return NO_REGS;
 
       if (GP_REG_P (regno) || x == CONST0_RTX (mode))
@@ -5859,6 +6481,14 @@ loongarch_valid_pointer_mode (scalar_int_mode mode)
   return mode == SImode || (TARGET_64BIT && mode == DImode);
 }
 
+/* Implement TARGET_VECTOR_MODE_SUPPORTED_P.  */
+
+static bool
+loongarch_vector_mode_supported_p (machine_mode mode)
+{
+  return LSX_SUPPORTED_MODE_P (mode);
+}
+
 /* Implement TARGET_SCALAR_MODE_SUPPORTED_P.  */
 
 static bool
@@ -5871,6 +6501,48 @@ loongarch_scalar_mode_supported_p (scalar_mode mode)
   return default_scalar_mode_supported_p (mode);
 }
 
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
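+/* For example, with LSX enabled a simple SImode loop such as
+
+     for (i = 0; i < n; i++)
+       a[i] += b[i];
+
+   is expected to be vectorized with V4SImode, i.e. four elements per
+   vector operation.  */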
+
+static machine_mode
+loongarch_preferred_simd_mode (scalar_mode mode)
+{
+  if (!ISA_HAS_LSX)
+    return word_mode;
+
+  switch (mode)
+    {
+    case E_QImode:
+      return E_V16QImode;
+    case E_HImode:
+      return E_V8HImode;
+    case E_SImode:
+      return E_V4SImode;
+    case E_DImode:
+      return E_V2DImode;
+
+    case E_SFmode:
+      return E_V4SFmode;
+
+    case E_DFmode:
+      return E_V2DFmode;
+
+    default:
+      break;
+    }
+  return word_mode;
+}
+
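+/* Implement TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES.  */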
+static unsigned int
+loongarch_autovectorize_vector_modes (vector_modes *modes, bool)
+{
+  if (ISA_HAS_LSX)
+    {
+      modes->safe_push (V16QImode);
+    }
+
+  return 0;
+}
+
 /* Return the assembly code for INSN, which has the operands given by
    OPERANDS, and which branches to OPERANDS[0] if some condition is true.
    BRANCH_IF_TRUE is the asm template that should be used if OPERANDS[0]
@@ -6035,6 +6707,29 @@ loongarch_output_division (const char *division, rtx *operands)
   return s;
 }
 
+/* Return the assembly code for an LSX vdiv.{b/h/w/d}[u] or vmod.{b/h/w/d}[u]
+   instruction, which has the operands given by OPERANDS.  Add in a
+   divide-by-zero check if needed.  */
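+/* For example, with -mcheck-zero-division a V4SImode signed division is
+   expected to assemble to a sequence along these lines (register numbers
+   are illustrative):
+
+	vsetallnez.w	$fcc7,$vr2
+	vdiv.w		$vr0,$vr1,$vr2
+	bcnez		$fcc7,1f
+	break		7
+     1:  */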
+
+const char *
+loongarch_lsx_output_division (const char *division, rtx *operands)
+{
+  const char *s;
+
+  s = division;
+  if (TARGET_CHECK_ZERO_DIV)
+    {
+      if (ISA_HAS_LSX)
+	{
+	  output_asm_insn ("vsetallnez.%v0\t$fcc7,%w2", operands);
+	  output_asm_insn (s, operands);
+	  output_asm_insn ("bcnez\t$fcc7,1f", operands);
+	}
+      s = "break\t7\n1:";
+    }
+  return s;
+}
+
 /* Implement TARGET_SCHED_ADJUST_COST.  We assume that anti and output
    dependencies have no cost.  */
 
@@ -6315,6 +7010,9 @@ loongarch_option_override_internal (struct gcc_options *opts,
   if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
     error ("%qs cannot be used for compiling a shared library",
 	   "-mdirect-extern-access");
+  if (loongarch_vector_access_cost == 0)
+    loongarch_vector_access_cost = 5;
+
 
   switch (la_target.cmodel)
     {
@@ -6533,64 +7231,60 @@ loongarch_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   emit_insn (gen_clear_cache (addr, end_addr));
 }
 
-/* Implement HARD_REGNO_CALLER_SAVE_MODE.  */
-
-machine_mode
-loongarch_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs,
-				       machine_mode mode)
-{
-  /* For performance, avoid saving/restoring upper parts of a register
-     by returning MODE as save mode when the mode is known.  */
-  if (mode == VOIDmode)
-    return choose_hard_reg_mode (regno, nregs, NULL);
-  else
-    return mode;
-}
+/* Generate or test for an insn that supports a constant permutation.  */
 
-/* Implement TARGET_SPILL_CLASS.  */
+#define MAX_VECT_LEN 32
 
-static reg_class_t
-loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
-		       machine_mode mode ATTRIBUTE_UNUSED)
+struct expand_vec_perm_d
 {
-  return NO_REGS;
-}
-
-/* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
+  rtx target, op0, op1;
+  unsigned char perm[MAX_VECT_LEN];
+  machine_mode vmode;
+  unsigned char nelt;
+  bool one_vector_p;
+  bool testing_p;
+};
 
-/* This function is equivalent to default_promote_function_mode_always_promote
-   except that it returns a promoted mode even if type is NULL_TREE.  This is
-   needed by libcalls which have no type (only a mode) such as fixed conversion
-   routines that take a signed or unsigned char/short argument and convert it
-   to a fixed type.  */
+/* Construct (set target (vec_select op0 (parallel perm))) and
+   return true if that's a valid instruction in the active ISA.  */
 
-static machine_mode
-loongarch_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
-				 machine_mode mode,
-				 int *punsignedp ATTRIBUTE_UNUSED,
-				 const_tree fntype ATTRIBUTE_UNUSED,
-				 int for_return ATTRIBUTE_UNUSED)
+static bool
+loongarch_expand_vselect (rtx target, rtx op0,
+			  const unsigned char *perm, unsigned nelt)
 {
-  int unsignedp;
+  rtx rperm[MAX_VECT_LEN], x;
+  rtx_insn *insn;
+  unsigned i;
 
-  if (type != NULL_TREE)
-    return promote_mode (type, mode, punsignedp);
+  for (i = 0; i < nelt; ++i)
+    rperm[i] = GEN_INT (perm[i]);
 
-  unsignedp = *punsignedp;
-  PROMOTE_MODE (mode, unsignedp, type);
-  *punsignedp = unsignedp;
-  return mode;
+  x = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (nelt, rperm));
+  x = gen_rtx_VEC_SELECT (GET_MODE (target), op0, x);
+  x = gen_rtx_SET (target, x);
+
+  insn = emit_insn (x);
+  if (recog_memoized (insn) < 0)
+    {
+      remove_insn (insn);
+      return false;
+    }
+  return true;
 }
 
-/* Implement TARGET_STARTING_FRAME_OFFSET.  See loongarch_compute_frame_info
-   for details about the frame layout.  */
+/* Similar, but generate a vec_concat from op0 and op1 as well.  */
 
-static HOST_WIDE_INT
-loongarch_starting_frame_offset (void)
+static bool
+loongarch_expand_vselect_vconcat (rtx target, rtx op0, rtx op1,
+				  const unsigned char *perm, unsigned nelt)
 {
-  if (FRAME_GROWS_DOWNWARD)
-    return 0;
-  return crtl->outgoing_args_size;
+  machine_mode v2mode;
+  rtx x;
+
+  if (!GET_MODE_2XWIDER_MODE (GET_MODE (op0)).exists (&v2mode))
+    return false;
+  x = gen_rtx_VEC_CONCAT (v2mode, op0, op1);
+  return loongarch_expand_vselect (target, x, perm, nelt);
 }
 
 static tree
@@ -6853,105 +7547,1291 @@ loongarch_set_handled_components (sbitmap components)
 #define TARGET_ASM_ALIGNED_SI_OP "\t.word\t"
 #undef TARGET_ASM_ALIGNED_DI_OP
 #define TARGET_ASM_ALIGNED_DI_OP "\t.dword\t"
+/* Construct (set target (vec_select op0 (parallel selector))) and
+   return true if that's a valid instruction in the active ISA.  */
 
-#undef TARGET_OPTION_OVERRIDE
-#define TARGET_OPTION_OVERRIDE loongarch_option_override
-
-#undef TARGET_LEGITIMIZE_ADDRESS
-#define TARGET_LEGITIMIZE_ADDRESS loongarch_legitimize_address
-
-#undef TARGET_ASM_SELECT_RTX_SECTION
-#define TARGET_ASM_SELECT_RTX_SECTION loongarch_select_rtx_section
-#undef TARGET_ASM_FUNCTION_RODATA_SECTION
-#define TARGET_ASM_FUNCTION_RODATA_SECTION loongarch_function_rodata_section
+static bool
+loongarch_expand_lsx_shuffle (struct expand_vec_perm_d *d)
+{
+  rtx x, elts[MAX_VECT_LEN];
+  rtvec v;
+  rtx_insn *insn;
+  unsigned i;
 
-#undef TARGET_SCHED_INIT
-#define TARGET_SCHED_INIT loongarch_sched_init
-#undef TARGET_SCHED_REORDER
-#define TARGET_SCHED_REORDER loongarch_sched_reorder
-#undef TARGET_SCHED_REORDER2
-#define TARGET_SCHED_REORDER2 loongarch_sched_reorder2
-#undef TARGET_SCHED_VARIABLE_ISSUE
-#define TARGET_SCHED_VARIABLE_ISSUE loongarch_variable_issue
-#undef TARGET_SCHED_ADJUST_COST
-#define TARGET_SCHED_ADJUST_COST loongarch_adjust_cost
-#undef TARGET_SCHED_ISSUE_RATE
-#define TARGET_SCHED_ISSUE_RATE loongarch_issue_rate
-#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
-#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
-  loongarch_multipass_dfa_lookahead
+  if (!ISA_HAS_LSX)
+    return false;
 
-#undef TARGET_FUNCTION_OK_FOR_SIBCALL
-#define TARGET_FUNCTION_OK_FOR_SIBCALL loongarch_function_ok_for_sibcall
+  for (i = 0; i < d->nelt; i++)
+    elts[i] = GEN_INT (d->perm[i]);
 
-#undef TARGET_VALID_POINTER_MODE
-#define TARGET_VALID_POINTER_MODE loongarch_valid_pointer_mode
-#undef TARGET_REGISTER_MOVE_COST
-#define TARGET_REGISTER_MOVE_COST loongarch_register_move_cost
-#undef TARGET_MEMORY_MOVE_COST
-#define TARGET_MEMORY_MOVE_COST loongarch_memory_move_cost
-#undef TARGET_RTX_COSTS
-#define TARGET_RTX_COSTS loongarch_rtx_costs
-#undef TARGET_ADDRESS_COST
-#define TARGET_ADDRESS_COST loongarch_address_cost
+  v = gen_rtvec_v (d->nelt, elts);
+  x = gen_rtx_PARALLEL (VOIDmode, v);
 
-#undef TARGET_IN_SMALL_DATA_P
-#define TARGET_IN_SMALL_DATA_P loongarch_in_small_data_p
+  if (!loongarch_const_vector_shuffle_set_p (x, d->vmode))
+    return false;
 
-#undef TARGET_PREFERRED_RELOAD_CLASS
-#define TARGET_PREFERRED_RELOAD_CLASS loongarch_preferred_reload_class
+  x = gen_rtx_VEC_SELECT (d->vmode, d->op0, x);
+  x = gen_rtx_SET (d->target, x);
 
-#undef TARGET_ASM_FILE_START_FILE_DIRECTIVE
-#define TARGET_ASM_FILE_START_FILE_DIRECTIVE true
+  insn = emit_insn (x);
+  if (recog_memoized (insn) < 0)
+    {
+      remove_insn (insn);
+      return false;
+    }
+  return true;
+}
 
-#undef TARGET_EXPAND_BUILTIN_VA_START
-#define TARGET_EXPAND_BUILTIN_VA_START loongarch_va_start
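+/* Expand a variable vector permutation: select elements of OP0/OP1
+   according to SEL and store the result in TARGET, using the LSX
+   vshuf.{b/h/w/d} instructions.  */
+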
+void
+loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
+{
+  machine_mode vmode = GET_MODE (target);
 
-#undef TARGET_PROMOTE_FUNCTION_MODE
-#define TARGET_PROMOTE_FUNCTION_MODE loongarch_promote_function_mode
-#undef TARGET_RETURN_IN_MEMORY
-#define TARGET_RETURN_IN_MEMORY loongarch_return_in_memory
+  switch (vmode)
+    {
+    case E_V16QImode:
+      emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel));
+      break;
+    case E_V2DFmode:
+      emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0));
+      break;
+    case E_V2DImode:
+      emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0));
+      break;
+    case E_V4SFmode:
+      emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0));
+      break;
+    case E_V4SImode:
+      emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0));
+      break;
+    case E_V8HImode:
+      emit_insn (gen_lsx_vshuf_h (target, sel, op1, op0));
+      break;
+    default:
+      break;
+    }
+}
 
-#undef TARGET_FUNCTION_VALUE
-#define TARGET_FUNCTION_VALUE loongarch_function_value
-#undef TARGET_LIBCALL_VALUE
-#define TARGET_LIBCALL_VALUE loongarch_libcall_value
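+/* Try to expand the constant permutation D with a single LSX vshuf.*
+   instruction, by first loading the constant selector into the target
+   register.  Return true on success.  */
+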
+static bool
+loongarch_try_expand_lsx_vshuf_const (struct expand_vec_perm_d *d)
+{
+  int i;
+  rtx target, op0, op1, sel, tmp;
+  rtx rperm[MAX_VECT_LEN];
 
-#undef TARGET_ASM_OUTPUT_MI_THUNK
-#define TARGET_ASM_OUTPUT_MI_THUNK loongarch_output_mi_thunk
-#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
-#define TARGET_ASM_CAN_OUTPUT_MI_THUNK \
-  hook_bool_const_tree_hwi_hwi_const_tree_true
+  if (d->vmode == E_V2DImode || d->vmode == E_V2DFmode
+	|| d->vmode == E_V4SImode || d->vmode == E_V4SFmode
+	|| d->vmode == E_V8HImode || d->vmode == E_V16QImode)
+    {
+      target = d->target;
+      op0 = d->op0;
+      op1 = d->one_vector_p ? d->op0 : d->op1;
 
-#undef TARGET_PRINT_OPERAND
-#define TARGET_PRINT_OPERAND loongarch_print_operand
-#undef TARGET_PRINT_OPERAND_ADDRESS
-#define TARGET_PRINT_OPERAND_ADDRESS loongarch_print_operand_address
-#undef TARGET_PRINT_OPERAND_PUNCT_VALID_P
-#define TARGET_PRINT_OPERAND_PUNCT_VALID_P \
-  loongarch_print_operand_punct_valid_p
+      if (GET_MODE (op0) != GET_MODE (op1)
+	  || GET_MODE (op0) != GET_MODE (target))
+	return false;
 
-#undef TARGET_SETUP_INCOMING_VARARGS
-#define TARGET_SETUP_INCOMING_VARARGS loongarch_setup_incoming_varargs
-#undef TARGET_STRICT_ARGUMENT_NAMING
-#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true
-#undef TARGET_MUST_PASS_IN_STACK
-#define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size
-#undef TARGET_PASS_BY_REFERENCE
-#define TARGET_PASS_BY_REFERENCE loongarch_pass_by_reference
-#undef TARGET_ARG_PARTIAL_BYTES
-#define TARGET_ARG_PARTIAL_BYTES loongarch_arg_partial_bytes
-#undef TARGET_FUNCTION_ARG
-#define TARGET_FUNCTION_ARG loongarch_function_arg
-#undef TARGET_FUNCTION_ARG_ADVANCE
-#define TARGET_FUNCTION_ARG_ADVANCE loongarch_function_arg_advance
-#undef TARGET_FUNCTION_ARG_BOUNDARY
-#define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
+      if (d->testing_p)
+	return true;
 
-#undef TARGET_SCALAR_MODE_SUPPORTED_P
-#define TARGET_SCALAR_MODE_SUPPORTED_P loongarch_scalar_mode_supported_p
+      for (i = 0; i < d->nelt; i += 1)
+	{
+	  rperm[i] = GEN_INT (d->perm[i]);
+	}
 
-#undef TARGET_INIT_BUILTINS
+      if (d->vmode == E_V2DFmode)
+	{
+	  sel = gen_rtx_CONST_VECTOR (E_V2DImode, gen_rtvec_v (d->nelt, rperm));
+	  tmp = gen_rtx_SUBREG (E_V2DImode, d->target, 0);
+	  emit_move_insn (tmp, sel);
+	}
+      else if (d->vmode == E_V4SFmode)
+	{
+	  sel = gen_rtx_CONST_VECTOR (E_V4SImode, gen_rtvec_v (d->nelt, rperm));
+	  tmp = gen_rtx_SUBREG (E_V4SImode, d->target, 0);
+	  emit_move_insn (tmp, sel);
+	}
+      else
+	{
+	  sel = gen_rtx_CONST_VECTOR (d->vmode, gen_rtvec_v (d->nelt, rperm));
+	  emit_move_insn (d->target, sel);
+	}
+
+      switch (d->vmode)
+	{
+	case E_V2DFmode:
+	  emit_insn (gen_lsx_vshuf_d_f (target, target, op1, op0));
+	  break;
+	case E_V2DImode:
+	  emit_insn (gen_lsx_vshuf_d (target, target, op1, op0));
+	  break;
+	case E_V4SFmode:
+	  emit_insn (gen_lsx_vshuf_w_f (target, target, op1, op0));
+	  break;
+	case E_V4SImode:
+	  emit_insn (gen_lsx_vshuf_w (target, target, op1, op0));
+	  break;
+	case E_V8HImode:
+	  emit_insn (gen_lsx_vshuf_h (target, target, op1, op0));
+	  break;
+	case E_V16QImode:
+	  emit_insn (gen_lsx_vshuf_b (target, op1, op0, target));
+	  break;
+	default:
+	  break;
+	}
+
+      return true;
+    }
+  return false;
+}
+
+static bool
+loongarch_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
+{
+  unsigned int i, nelt = d->nelt;
+  unsigned char perm2[MAX_VECT_LEN];
+
+  if (d->one_vector_p)
+    {
+      /* Try interleave with alternating operands.  */
+      memcpy (perm2, d->perm, sizeof (perm2));
+      for (i = 1; i < nelt; i += 2)
+	perm2[i] += nelt;
+      if (loongarch_expand_vselect_vconcat (d->target, d->op0, d->op1, perm2,
+					    nelt))
+	return true;
+    }
+  else
+    {
+      if (loongarch_expand_vselect_vconcat (d->target, d->op0, d->op1,
+					    d->perm, nelt))
+	return true;
+
+      /* Try again with swapped operands.  */
+      for (i = 0; i < nelt; ++i)
+	perm2[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
+      if (loongarch_expand_vselect_vconcat (d->target, d->op1, d->op0, perm2,
+					    nelt))
+	return true;
+    }
+
+  if (loongarch_expand_lsx_shuffle (d))
+    return true;
+  return false;
+}
+
+/* Implementation of constant vector permutation.  This function identifies
+   recognized patterns of the permutation selector argument, and uses one or
+   more instructions to finish the permutation job correctly.  For
+   unsupported patterns, it returns false.  */
+
+static bool
+loongarch_expand_vec_perm_const_2 (struct expand_vec_perm_d *d)
+{
+  /* Although we have the LSX vec_perm<mode> template, there are still some
+     128-bit vector permutation operations sent to vectorize_vec_perm_const.
+     In this case, we just simply wrap them with a single vshuf.* instruction,
+     because the LSX vshuf.* instructions have exactly the behavior that GCC
+     expects.  */
+  return loongarch_try_expand_lsx_vshuf_const (d);
+}
+
+/* Implement TARGET_VECTORIZE_VEC_PERM_CONST.  */
+
+static bool
+loongarch_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
+				    rtx target, rtx op0, rtx op1,
+				    const vec_perm_indices &sel)
+{
+  if (vmode != op_mode)
+    return false;
+
+  struct expand_vec_perm_d d;
+  int i, nelt, which;
+  unsigned char orig_perm[MAX_VECT_LEN];
+  bool ok;
+
+  d.target = target;
+  if (op0)
+    {
+      rtx nop0 = force_reg (vmode, op0);
+      if (op0 == op1)
+	op1 = nop0;
+      op0 = nop0;
+    }
+  if (op1)
+    op1 = force_reg (vmode, op1);
+  d.op0 = op0;
+  d.op1 = op1;
+
+  d.vmode = vmode;
+  gcc_assert (VECTOR_MODE_P (vmode));
+  d.nelt = nelt = GET_MODE_NUNITS (vmode);
+  d.testing_p = !target;
+
+  /* This is overly conservative, but ensures we don't get an
+     uninitialized warning on ORIG_PERM.  */
+  memset (orig_perm, 0, MAX_VECT_LEN);
+  for (i = which = 0; i < nelt; ++i)
+    {
+      int ei = sel[i] & (2 * nelt - 1);
+      which |= (ei < nelt ? 1 : 2);
+      orig_perm[i] = ei;
+    }
+  memcpy (d.perm, orig_perm, MAX_VECT_LEN);
+
+  switch (which)
+    {
+    default:
+      gcc_unreachable ();
+
+    case 3:
+      d.one_vector_p = false;
+      if (d.testing_p || !rtx_equal_p (d.op0, d.op1))
+	break;
+      /* FALLTHRU */
+
+    case 2:
+      for (i = 0; i < nelt; ++i)
+	d.perm[i] &= nelt - 1;
+      d.op0 = d.op1;
+      d.one_vector_p = true;
+      break;
+
+    case 1:
+      d.op1 = d.op0;
+      d.one_vector_p = true;
+      break;
+    }
+
+  if (d.testing_p)
+    {
+      d.target = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 1);
+      d.op1 = d.op0 = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 2);
+      if (!d.one_vector_p)
+	d.op1 = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 3);
+
+      ok = loongarch_expand_vec_perm_const_2 (&d);
+      if (ok)
+	return ok;
+
+      start_sequence ();
+      ok = loongarch_expand_vec_perm_const_1 (&d);
+      end_sequence ();
+      return ok;
+    }
+
+  ok = loongarch_expand_vec_perm_const_2 (&d);
+  if (!ok)
+    ok = loongarch_expand_vec_perm_const_1 (&d);
+
+  /* If we were given a two-vector permutation which just happened to
+     have both input vectors equal, we folded this into a one-vector
+     permutation.  There are several loongson patterns that are matched
+     via direct vec_select+vec_concat expansion, but we do not have
+     support in loongarch_expand_vec_perm_const_1 to guess the adjustment
+     that should be made for a single operand.  Just try again with
+     the original permutation.  */
+  if (!ok && which == 3)
+    {
+      d.op0 = op0;
+      d.op1 = op1;
+      d.one_vector_p = false;
+      memcpy (d.perm, orig_perm, MAX_VECT_LEN);
+      ok = loongarch_expand_vec_perm_const_1 (&d);
+    }
+
+  return ok;
+}
+
+static int
+loongarch_cpu_sched_reassociation_width (struct loongarch_target *target,
+					 unsigned int opc, machine_mode mode)
+{
+
+  switch (target->cpu_tune)
+    {
+    case CPU_LOONGARCH64:
+    case CPU_LA464:
+      /* Vector part.  */
+      if (LSX_SUPPORTED_MODE_P (mode))
+	{
+	  /* Integer vector instructions execute in the FP unit.  The width
+	     of integer/floating-point vector instructions is 3.  */
+	  return 3;
+	}
+
+      /* Scalar part.  */
+      else if (INTEGRAL_MODE_P (mode))
+	return 1;
+      else if (FLOAT_MODE_P (mode))
+	{
+	  if (opc == PLUS_EXPR)
+	    {
+	      return 2;
+	    }
+	  return 4;
+	}
+      break;
+    default:
+      break;
+    }
+
+  /* Default is 1.  */
+  return 1;
+}
+
+/* Implement TARGET_SCHED_REASSOCIATION_WIDTH.  */
+
+static int
+loongarch_sched_reassociation_width (unsigned int opc, machine_mode mode)
+{
+  return loongarch_cpu_sched_reassociation_width (&la_target, opc, mode);
+}
+
+/* Extract scalar element ELT of vector register VEC into TARGET.  */
+
+void
+loongarch_expand_vector_extract (rtx target, rtx vec, int elt)
+{
+  machine_mode mode = GET_MODE (vec);
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+  rtx tmp;
+
+  switch (mode)
+    {
+    case E_V8HImode:
+    case E_V16QImode:
+      break;
+
+    default:
+      break;
+    }
+
+  tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, GEN_INT (elt)));
+  tmp = gen_rtx_VEC_SELECT (inner_mode, vec, tmp);
+
+  /* Let the rtl optimizers know about the zero extension performed.  */
+  if (inner_mode == QImode || inner_mode == HImode)
+    {
+      tmp = gen_rtx_ZERO_EXTEND (SImode, tmp);
+      target = gen_lowpart (SImode, target);
+    }
+  if (inner_mode == SImode || inner_mode == DImode)
+    {
+      tmp = gen_rtx_SIGN_EXTEND (inner_mode, tmp);
+    }
+
+  emit_insn (gen_rtx_SET (target, tmp));
+}
+
+/* Generate code to copy vector bits i / 2 ... i - 1 from vector SRC
+   to bits 0 ... i / 2 - 1 of vector DEST, which has the same mode.
+   The upper bits of DEST are undefined, though they shouldn't cause
+   exceptions (some bits from src or all zeros are ok).  */
+
+static void
+emit_reduc_half (rtx dest, rtx src, int i)
+{
+  rtx tem, d = dest;
+  switch (GET_MODE (src))
+    {
+    case E_V4SFmode:
+      tem = gen_lsx_vbsrl_w_f (dest, src, GEN_INT (i == 128 ? 8 : 4));
+      break;
+    case E_V2DFmode:
+      tem = gen_lsx_vbsrl_d_f (dest, src, GEN_INT (8));
+      break;
+    case E_V16QImode:
+    case E_V8HImode:
+    case E_V4SImode:
+    case E_V2DImode:
+      d = gen_reg_rtx (V2DImode);
+      tem = gen_lsx_vbsrl_d (d, gen_lowpart (V2DImode, src), GEN_INT (i/16));
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  emit_insn (tem);
+  if (d != dest)
+    emit_move_insn (dest, gen_lowpart (GET_MODE (dest), d));
+}
+
+/* Expand a vector reduction.  FN is the binary pattern to reduce;
+   DEST is the destination; IN is the input vector.  */
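+/* For V4SImode, for example, this folds 128 -> 64 -> 32 bits: the vector is
+   shifted right by 8 bytes and combined with itself, then by 4 bytes and
+   combined again, leaving the scalar result in element 0 of DEST.  */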
+
+void
+loongarch_expand_vector_reduc (rtx (*fn) (rtx, rtx, rtx), rtx dest, rtx in)
+{
+  rtx half, dst, vec = in;
+  machine_mode mode = GET_MODE (in);
+  int i;
+
+  for (i = GET_MODE_BITSIZE (mode);
+       i > GET_MODE_UNIT_BITSIZE (mode);
+       i >>= 1)
+    {
+      half = gen_reg_rtx (mode);
+      emit_reduc_half (half, vec, i);
+      if (i == GET_MODE_UNIT_BITSIZE (mode) * 2)
+	dst = dest;
+      else
+	dst = gen_reg_rtx (mode);
+      emit_insn (fn (dst, half, vec));
+      vec = dst;
+    }
+}
+
+/* Expand an integral vector unpack operation.  */
+
+void
+loongarch_expand_vec_unpack (rtx operands[2], bool unsigned_p, bool high_p)
+{
+  machine_mode imode = GET_MODE (operands[1]);
+  rtx (*unpack) (rtx, rtx, rtx);
+  rtx (*cmpFunc) (rtx, rtx, rtx);
+  rtx tmp, dest;
+
+  if (ISA_HAS_LSX)
+    {
+      switch (imode)
+	{
+	case E_V4SImode:
+	  if (high_p != 0)
+	    unpack = gen_lsx_vilvh_w;
+	  else
+	    unpack = gen_lsx_vilvl_w;
+
+	  cmpFunc = gen_lsx_vslt_w;
+	  break;
+
+	case E_V8HImode:
+	  if (high_p != 0)
+	    unpack = gen_lsx_vilvh_h;
+	  else
+	    unpack = gen_lsx_vilvl_h;
+
+	  cmpFunc = gen_lsx_vslt_h;
+	  break;
+
+	case E_V16QImode:
+	  if (high_p != 0)
+	    unpack = gen_lsx_vilvh_b;
+	  else
+	    unpack = gen_lsx_vilvl_b;
+
+	  cmpFunc = gen_lsx_vslt_b;
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	  break;
+	}
+
+      if (!unsigned_p)
+	{
+	  /* Compute the sign extension of each element by comparing it
+	     with immediate zero.  */
+	  tmp = gen_reg_rtx (imode);
+	  emit_insn (cmpFunc (tmp, operands[1], CONST0_RTX (imode)));
+	}
+      else
+	tmp = force_reg (imode, CONST0_RTX (imode));
+
+      dest = gen_reg_rtx (imode);
+
+      emit_insn (unpack (dest, operands[1], tmp));
+      emit_move_insn (operands[0], gen_lowpart (GET_MODE (operands[0]), dest));
+      return;
+    }
+  gcc_unreachable ();
+}
+
+/* Construct and return a PARALLEL RTX with CONST_INTs for the HIGH
+   (high_p == TRUE) or LOW (high_p == FALSE) half of a vector of mode MODE.  */
+
+rtx
+loongarch_lsx_vec_parallel_const_half (machine_mode mode, bool high_p)
+{
+  int nunits = GET_MODE_NUNITS (mode);
+  rtvec v = rtvec_alloc (nunits / 2);
+  int base;
+  int i;
+
+  base = high_p ? nunits / 2 : 0;
+
+  for (i = 0; i < nunits / 2; i++)
+    RTVEC_ELT (v, i) = GEN_INT (base + i);
+
+  return gen_rtx_PARALLEL (VOIDmode, v);
+}
+
+/* A subroutine of loongarch_expand_vec_init, match constant vector
+   elements.  */
+
+static inline bool
+loongarch_constant_elt_p (rtx x)
+{
+  return CONST_INT_P (x) || GET_CODE (x) == CONST_DOUBLE;
+}
+
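+/* Return a PARALLEL of CONST_INTs that, for every group of four elements
+   of MODE, selects within the group according to the 2-bit fields of the
+   immediate VAL.  */
+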
+rtx
+loongarch_gen_const_int_vector_shuffle (machine_mode mode, int val)
+{
+  int nunits = GET_MODE_NUNITS (mode);
+  int nsets = nunits / 4;
+  rtx elts[MAX_VECT_LEN];
+  int set = 0;
+  int i, j;
+
+  /* Generate a const_int vector replicating the same 4-element set
+     from an immediate.  */
+  for (j = 0; j < nsets; j++, set = 4 * j)
+    for (i = 0; i < 4; i++)
+      elts[set + i] = GEN_INT (set + ((val >> (2 * i)) & 0x3));
+
+  return gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (nunits, elts));
+}
+
+/* Expand a vector initialization.  */
+
+void
+loongarch_expand_vector_init (rtx target, rtx vals)
+{
+  machine_mode vmode = GET_MODE (target);
+  machine_mode imode = GET_MODE_INNER (vmode);
+  unsigned i, nelt = GET_MODE_NUNITS (vmode);
+  unsigned nvar = 0;
+  bool all_same = true;
+  rtx x;
+
+  for (i = 0; i < nelt; ++i)
+    {
+      x = XVECEXP (vals, 0, i);
+      if (!loongarch_constant_elt_p (x))
+	nvar++;
+      if (i > 0 && !rtx_equal_p (x, XVECEXP (vals, 0, 0)))
+	all_same = false;
+    }
+
+  if (ISA_HAS_LSX)
+    {
+      if (all_same)
+	{
+	  rtx same = XVECEXP (vals, 0, 0);
+	  rtx temp, temp2;
+
+	  if (CONST_INT_P (same) && nvar == 0
+	      && loongarch_signed_immediate_p (INTVAL (same), 10, 0))
+	    {
+	      switch (vmode)
+		{
+		case E_V16QImode:
+		case E_V8HImode:
+		case E_V4SImode:
+		case E_V2DImode:
+		  temp = gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0));
+		  emit_move_insn (target, temp);
+		  return;
+
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+	  temp = gen_reg_rtx (imode);
+	  if (imode == GET_MODE (same))
+	    temp2 = same;
+	  else if (GET_MODE_SIZE (imode) >= UNITS_PER_WORD)
+	    {
+	      if (GET_CODE (same) == MEM)
+		{
+		  rtx reg_tmp = gen_reg_rtx (GET_MODE (same));
+		  loongarch_emit_move (reg_tmp, same);
+		  temp2 = simplify_gen_subreg (imode, reg_tmp,
+					       GET_MODE (reg_tmp), 0);
+		}
+	      else
+		temp2 = simplify_gen_subreg (imode, same, GET_MODE (same), 0);
+	    }
+	  else
+	    {
+	      if (GET_CODE (same) == MEM)
+		{
+		  rtx reg_tmp = gen_reg_rtx (GET_MODE (same));
+		  loongarch_emit_move (reg_tmp, same);
+		  temp2 = lowpart_subreg (imode, reg_tmp, GET_MODE (reg_tmp));
+		}
+	      else
+		temp2 = lowpart_subreg (imode, same, GET_MODE (same));
+	    }
+	  emit_move_insn (temp, temp2);
+
+	  switch (vmode)
+	    {
+	    case E_V16QImode:
+	    case E_V8HImode:
+	    case E_V4SImode:
+	    case E_V2DImode:
+	      loongarch_emit_move (target, gen_rtx_VEC_DUPLICATE (vmode, temp));
+	      break;
+
+	    case E_V4SFmode:
+	      emit_insn (gen_lsx_vreplvei_w_f_scalar (target, temp));
+	      break;
+
+	    case E_V2DFmode:
+	      emit_insn (gen_lsx_vreplvei_d_f_scalar (target, temp));
+	      break;
+
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+      else
+	{
+	  emit_move_insn (target, CONST0_RTX (vmode));
+
+	  for (i = 0; i < nelt; ++i)
+	    {
+	      rtx temp = gen_reg_rtx (imode);
+	      emit_move_insn (temp, XVECEXP (vals, 0, i));
+	      switch (vmode)
+		{
+		case E_V16QImode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_b_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv16qi (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V8HImode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_h_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv8hi (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V4SImode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_w_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv4si (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V2DImode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_d_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv2di (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V4SFmode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_w_f_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv4sf (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V2DFmode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_d_f_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv2df (target, temp, GEN_INT (i)));
+		  break;
+
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+	}
+      return;
+    }
+
+  /* Load constants from the pool, or whatever's handy.  */
+  if (nvar == 0)
+    {
+      emit_move_insn (target, gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0)));
+      return;
+    }
+
+  /* For two-part initialization, always use CONCAT.  */
+  if (nelt == 2)
+    {
+      rtx op0 = force_reg (imode, XVECEXP (vals, 0, 0));
+      rtx op1 = force_reg (imode, XVECEXP (vals, 0, 1));
+      x = gen_rtx_VEC_CONCAT (vmode, op0, op1);
+      emit_insn (gen_rtx_SET (target, x));
+      return;
+    }
+
+  /* Vectors with more than two elements are supported only with LSX,
+     which is handled above.  */
+  gcc_assert (0);
+}
+
+/* Implement HARD_REGNO_CALLER_SAVE_MODE.  */
+
+machine_mode
+loongarch_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs,
+				       machine_mode mode)
+{
+  /* For performance, avoid saving/restoring upper parts of a register
+     by returning MODE as save mode when the mode is known.  */
+  if (mode == VOIDmode)
+    return choose_hard_reg_mode (regno, nregs, NULL);
+  else
+    return mode;
+}
+
+/* Generate RTL for comparing OP0 and OP1 using condition COND, storing the
+   elementwise result (-1 for true, 0 for false) in DEST.  */
+
+static void
+loongarch_expand_lsx_cmp (rtx dest, enum rtx_code cond, rtx op0, rtx op1)
+{
+  machine_mode cmp_mode = GET_MODE (op0);
+  int unspec = -1;
+  bool negate = false;
+
+  switch (cmp_mode)
+    {
+    case E_V16QImode:
+    case E_V32QImode:
+    case E_V8HImode:
+    case E_V16HImode:
+    case E_V4SImode:
+    case E_V8SImode:
+    case E_V2DImode:
+    case E_V4DImode:
+      switch (cond)
+	{
+	case NE:
+	  cond = reverse_condition (cond);
+	  negate = true;
+	  break;
+	case EQ:
+	case LT:
+	case LE:
+	case LTU:
+	case LEU:
+	  break;
+	case GE:
+	case GT:
+	case GEU:
+	case GTU:
+	  std::swap (op0, op1);
+	  cond = swap_condition (cond);
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+      loongarch_emit_binary (cond, dest, op0, op1);
+      if (negate)
+	emit_move_insn (dest, gen_rtx_NOT (GET_MODE (dest), dest));
+      break;
+
+    case E_V4SFmode:
+    case E_V2DFmode:
+      switch (cond)
+	{
+	case UNORDERED:
+	case ORDERED:
+	case EQ:
+	case NE:
+	case UNEQ:
+	case UNLE:
+	case UNLT:
+	  break;
+	case LTGT: cond = NE; break;
+	case UNGE: cond = UNLE; std::swap (op0, op1); break;
+	case UNGT: cond = UNLT; std::swap (op0, op1); break;
+	case LE: unspec = UNSPEC_LSX_VFCMP_SLE; break;
+	case LT: unspec = UNSPEC_LSX_VFCMP_SLT; break;
+	case GE: unspec = UNSPEC_LSX_VFCMP_SLE; std::swap (op0, op1); break;
+	case GT: unspec = UNSPEC_LSX_VFCMP_SLT; std::swap (op0, op1); break;
+	default:
+		 gcc_unreachable ();
+	}
+      if (unspec < 0)
+	loongarch_emit_binary (cond, dest, op0, op1);
+      else
+	{
+	  rtx x = gen_rtx_UNSPEC (GET_MODE (dest),
+				  gen_rtvec (2, op0, op1), unspec);
+	  emit_insn (gen_rtx_SET (dest, x));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+      break;
+    }
+}
+
+/* Expand a VEC_COND_EXPR, where:
+   MODE is the mode of the result,
+   VIMODE is the equivalent integer mode,
+   OPERANDS are the operands of the VEC_COND_EXPR.  */
+
+void
+loongarch_expand_vec_cond_expr (machine_mode mode, machine_mode vimode,
+				rtx *operands)
+{
+  rtx cond = operands[3];
+  rtx cmp_op0 = operands[4];
+  rtx cmp_op1 = operands[5];
+  rtx cmp_res = gen_reg_rtx (vimode);
+
+  loongarch_expand_lsx_cmp (cmp_res, GET_CODE (cond), cmp_op0, cmp_op1);
+
+  /* We handle the following cases:
+     1) r = a CMP b ? -1 : 0
+     2) r = a CMP b ? -1 : v
+     3) r = a CMP b ?  v : 0
+     4) r = a CMP b ? v1 : v2  */
+
+  /* Case (1) above.  We only move the results.  */
+  if (operands[1] == CONSTM1_RTX (vimode)
+      && operands[2] == CONST0_RTX (vimode))
+    emit_move_insn (operands[0], cmp_res);
+  else
+    {
+      rtx src1 = gen_reg_rtx (vimode);
+      rtx src2 = gen_reg_rtx (vimode);
+      rtx mask = gen_reg_rtx (vimode);
+      rtx bsel;
+
+      /* Move the vector result to use it as a mask.  */
+      emit_move_insn (mask, cmp_res);
+
+      if (register_operand (operands[1], mode))
+	{
+	  rtx xop1 = operands[1];
+	  if (mode != vimode)
+	    {
+	      xop1 = gen_reg_rtx (vimode);
+	      emit_move_insn (xop1, gen_rtx_SUBREG (vimode, operands[1], 0));
+	    }
+	  emit_move_insn (src1, xop1);
+	}
+      else
+	{
+	  gcc_assert (operands[1] == CONSTM1_RTX (vimode));
+	  /* Case (2) if the below doesn't move the mask to src2.  */
+	  emit_move_insn (src1, mask);
+	}
+
+      if (register_operand (operands[2], mode))
+	{
+	  rtx xop2 = operands[2];
+	  if (mode != vimode)
+	    {
+	      xop2 = gen_reg_rtx (vimode);
+	      emit_move_insn (xop2, gen_rtx_SUBREG (vimode, operands[2], 0));
+	    }
+	  emit_move_insn (src2, xop2);
+	}
+      else
+	{
+	  gcc_assert (operands[2] == CONST0_RTX (mode));
+	  /* Case (3) if the above didn't move the mask to src1.  */
+	  emit_move_insn (src2, mask);
+	}
+
+      /* We deal with case (4) if the mask wasn't moved to either src1 or src2.
+	 In any case, we eventually do vector mask-based copy.  */
+      bsel = gen_rtx_IOR (vimode,
+			  gen_rtx_AND (vimode,
+				       gen_rtx_NOT (vimode, mask), src2),
+			  gen_rtx_AND (vimode, mask, src1));
+      /* The result is placed back to a register with the mask.  */
+      emit_insn (gen_rtx_SET (mask, bsel));
+      emit_move_insn (operands[0], gen_rtx_SUBREG (mode, mask, 0));
+    }
+}
+
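+/* Expand a VEC_COND_EXPR whose condition OPERANDS[3] is already a vector
+   mask; MODE is the mode of the result and VIMODE the equivalent integer
+   mode.  */
+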
+void
+loongarch_expand_vec_cond_mask_expr (machine_mode mode, machine_mode vimode,
+				    rtx *operands)
+{
+  rtx cmp_res = operands[3];
+
+  /* We handle the following cases:
+     1) r = a CMP b ? -1 : 0
+     2) r = a CMP b ? -1 : v
+     3) r = a CMP b ?  v : 0
+     4) r = a CMP b ? v1 : v2  */
+
+  /* Case (1) above.  We only move the results.  */
+  if (operands[1] == CONSTM1_RTX (vimode)
+      && operands[2] == CONST0_RTX (vimode))
+    emit_move_insn (operands[0], cmp_res);
+  else
+    {
+      rtx src1 = gen_reg_rtx (vimode);
+      rtx src2 = gen_reg_rtx (vimode);
+      rtx mask = gen_reg_rtx (vimode);
+      rtx bsel;
+
+      /* Move the vector result to use it as a mask.  */
+      emit_move_insn (mask, cmp_res);
+
+      if (register_operand (operands[1], mode))
+	{
+	  rtx xop1 = operands[1];
+	  if (mode != vimode)
+	    {
+	      xop1 = gen_reg_rtx (vimode);
+	      emit_move_insn (xop1, gen_rtx_SUBREG (vimode, operands[1], 0));
+	    }
+	  emit_move_insn (src1, xop1);
+	}
+      else
+	{
+	  gcc_assert (operands[1] == CONSTM1_RTX (vimode));
+	  /* Case (2) if the below doesn't move the mask to src2.  */
+	  emit_move_insn (src1, mask);
+	}
+
+      if (register_operand (operands[2], mode))
+	{
+	  rtx xop2 = operands[2];
+	  if (mode != vimode)
+	    {
+	      xop2 = gen_reg_rtx (vimode);
+	      emit_move_insn (xop2, gen_rtx_SUBREG (vimode, operands[2], 0));
+	    }
+	  emit_move_insn (src2, xop2);
+	}
+      else
+	{
+	  gcc_assert (operands[2] == CONST0_RTX (mode));
+	  /* Case (3) if the above didn't move the mask to src1.  */
+	  emit_move_insn (src2, mask);
+	}
+
+      /* We deal with case (4) if the mask wasn't moved to either src1 or src2.
+	 In any case, we eventually do vector mask-based copy.  */
+      bsel = gen_rtx_IOR (vimode,
+			  gen_rtx_AND (vimode,
+				       gen_rtx_NOT (vimode, mask), src2),
+			  gen_rtx_AND (vimode, mask, src1));
+      /* The result is placed back to a register with the mask.  */
+      emit_insn (gen_rtx_SET (mask, bsel));
+      emit_move_insn (operands[0], gen_rtx_SUBREG (mode, mask, 0));
+    }
+}
+
+/* Expand a vector comparison: compare OPERANDS[2] and OPERANDS[3] using the
+   comparison code in OPERANDS[1] and store the result in OPERANDS[0].  */
+
+bool
+loongarch_expand_vec_cmp (rtx operands[])
+{
+  rtx_code code = GET_CODE (operands[1]);
+  loongarch_expand_lsx_cmp (operands[0], code, operands[2], operands[3]);
+  return true;
+}
+
+/* Implement TARGET_CASE_VALUES_THRESHOLD.  */
+
+unsigned int
+loongarch_case_values_threshold (void)
+{
+  return default_case_values_threshold ();
+}
+
+/* Implement TARGET_SPILL_CLASS.  */
+
+static reg_class_t
+loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
+		       machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return NO_REGS;
+}
+
+/* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
+
+/* This function is equivalent to default_promote_function_mode_always_promote
+   except that it returns a promoted mode even if type is NULL_TREE.  This is
+   needed by libcalls which have no type (only a mode) such as fixed conversion
+   routines that take a signed or unsigned char/short argument and convert it
+   to a fixed type.  */
+
+static machine_mode
+loongarch_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
+				 machine_mode mode,
+				 int *punsignedp ATTRIBUTE_UNUSED,
+				 const_tree fntype ATTRIBUTE_UNUSED,
+				 int for_return ATTRIBUTE_UNUSED)
+{
+  int unsignedp;
+
+  if (type != NULL_TREE)
+    return promote_mode (type, mode, punsignedp);
+
+  unsignedp = *punsignedp;
+  PROMOTE_MODE (mode, unsignedp, type);
+  *punsignedp = unsignedp;
+  return mode;
+}
+
+/* Implement TARGET_STARTING_FRAME_OFFSET.  See loongarch_compute_frame_info
+   for details about the frame layout.  */
+
+static HOST_WIDE_INT
+loongarch_starting_frame_offset (void)
+{
+  if (FRAME_GROWS_DOWNWARD)
+    return 0;
+  return crtl->outgoing_args_size;
+}
+
+/* A subroutine of loongarch_build_signbit_mask.  If VECT is true,
+   then replicate the value for all elements of the vector
+   register.  */
+
+rtx
+loongarch_build_const_vector (machine_mode mode, bool vect, rtx value)
+{
+  int i, n_elt;
+  rtvec v;
+  machine_mode scalar_mode;
+
+  switch (mode)
+    {
+    case E_V32QImode:
+    case E_V16QImode:
+    case E_V32HImode:
+    case E_V16HImode:
+    case E_V8HImode:
+    case E_V8SImode:
+    case E_V4SImode:
+    case E_V8DImode:
+    case E_V4DImode:
+    case E_V2DImode:
+      gcc_assert (vect);
+      /* FALLTHRU */
+    case E_V8SFmode:
+    case E_V4SFmode:
+    case E_V8DFmode:
+    case E_V4DFmode:
+    case E_V2DFmode:
+      n_elt = GET_MODE_NUNITS (mode);
+      v = rtvec_alloc (n_elt);
+      scalar_mode = GET_MODE_INNER (mode);
+
+      RTVEC_ELT (v, 0) = value;
+
+      for (i = 1; i < n_elt; ++i)
+	RTVEC_ELT (v, i) = vect ? value : CONST0_RTX (scalar_mode);
+
+      return gen_rtx_CONST_VECTOR (mode, v);
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create a mask for the sign bit in MODE for a register.  If VECT is true,
+   replicate the mask for all elements of the vector register.  If INVERT
+   is true, create a mask excluding the sign bit.  */
+
+rtx
+loongarch_build_signbit_mask (machine_mode mode, bool vect, bool invert)
+{
+  machine_mode vec_mode, imode;
+  wide_int w;
+  rtx mask, v;
+
+  switch (mode)
+    {
+    case E_V16SImode:
+    case E_V16SFmode:
+    case E_V8SImode:
+    case E_V4SImode:
+    case E_V8SFmode:
+    case E_V4SFmode:
+      vec_mode = mode;
+      imode = SImode;
+      break;
+
+    case E_V8DImode:
+    case E_V4DImode:
+    case E_V2DImode:
+    case E_V8DFmode:
+    case E_V4DFmode:
+    case E_V2DFmode:
+      vec_mode = mode;
+      imode = DImode;
+      break;
+
+    case E_TImode:
+    case E_TFmode:
+      vec_mode = VOIDmode;
+      imode = TImode;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+  w = wi::set_bit_in_zero (GET_MODE_BITSIZE (inner_mode) - 1,
+			   GET_MODE_BITSIZE (inner_mode));
+  if (invert)
+    w = wi::bit_not (w);
+
+  /* Force this value into the low part of a fp vector constant.  */
+  mask = immed_wide_int_const (w, imode);
+  mask = gen_lowpart (inner_mode, mask);
+
+  if (vec_mode == VOIDmode)
+    return force_reg (inner_mode, mask);
+
+  v = loongarch_build_const_vector (vec_mode, vect, mask);
+  return force_reg (vec_mode, v);
+}
+
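+/* Implement TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT.  */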
+static bool
+loongarch_builtin_support_vector_misalignment (machine_mode mode,
+					       const_tree type,
+					       int misalignment,
+					       bool is_packed)
+{
+  if (ISA_HAS_LSX && STRICT_ALIGNMENT)
+    {
+      if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing)
+	return false;
+      if (misalignment == -1)
+	return false;
+    }
+  return default_builtin_support_vector_misalignment (mode, type, misalignment,
+						      is_packed);
+}
+
+/* Initialize the GCC target structure.  */
+#undef TARGET_ASM_ALIGNED_HI_OP
+#define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
+#undef TARGET_ASM_ALIGNED_SI_OP
+#define TARGET_ASM_ALIGNED_SI_OP "\t.word\t"
+#undef TARGET_ASM_ALIGNED_DI_OP
+#define TARGET_ASM_ALIGNED_DI_OP "\t.dword\t"
+
+#undef TARGET_OPTION_OVERRIDE
+#define TARGET_OPTION_OVERRIDE loongarch_option_override
+
+#undef TARGET_LEGITIMIZE_ADDRESS
+#define TARGET_LEGITIMIZE_ADDRESS loongarch_legitimize_address
+
+#undef TARGET_ASM_SELECT_RTX_SECTION
+#define TARGET_ASM_SELECT_RTX_SECTION loongarch_select_rtx_section
+#undef TARGET_ASM_FUNCTION_RODATA_SECTION
+#define TARGET_ASM_FUNCTION_RODATA_SECTION loongarch_function_rodata_section
+
+#undef TARGET_SCHED_INIT
+#define TARGET_SCHED_INIT loongarch_sched_init
+#undef TARGET_SCHED_REORDER
+#define TARGET_SCHED_REORDER loongarch_sched_reorder
+#undef TARGET_SCHED_REORDER2
+#define TARGET_SCHED_REORDER2 loongarch_sched_reorder2
+#undef TARGET_SCHED_VARIABLE_ISSUE
+#define TARGET_SCHED_VARIABLE_ISSUE loongarch_variable_issue
+#undef TARGET_SCHED_ADJUST_COST
+#define TARGET_SCHED_ADJUST_COST loongarch_adjust_cost
+#undef TARGET_SCHED_ISSUE_RATE
+#define TARGET_SCHED_ISSUE_RATE loongarch_issue_rate
+#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
+#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
+  loongarch_multipass_dfa_lookahead
+
+#undef TARGET_FUNCTION_OK_FOR_SIBCALL
+#define TARGET_FUNCTION_OK_FOR_SIBCALL loongarch_function_ok_for_sibcall
+
+#undef TARGET_VALID_POINTER_MODE
+#define TARGET_VALID_POINTER_MODE loongarch_valid_pointer_mode
+#undef TARGET_REGISTER_MOVE_COST
+#define TARGET_REGISTER_MOVE_COST loongarch_register_move_cost
+#undef TARGET_MEMORY_MOVE_COST
+#define TARGET_MEMORY_MOVE_COST loongarch_memory_move_cost
+#undef TARGET_RTX_COSTS
+#define TARGET_RTX_COSTS loongarch_rtx_costs
+#undef TARGET_ADDRESS_COST
+#define TARGET_ADDRESS_COST loongarch_address_cost
+#undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
+#define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
+  loongarch_builtin_vectorization_cost
+
+
+#undef TARGET_IN_SMALL_DATA_P
+#define TARGET_IN_SMALL_DATA_P loongarch_in_small_data_p
+
+#undef TARGET_PREFERRED_RELOAD_CLASS
+#define TARGET_PREFERRED_RELOAD_CLASS loongarch_preferred_reload_class
+
+#undef TARGET_ASM_FILE_START_FILE_DIRECTIVE
+#define TARGET_ASM_FILE_START_FILE_DIRECTIVE true
+
+#undef TARGET_EXPAND_BUILTIN_VA_START
+#define TARGET_EXPAND_BUILTIN_VA_START loongarch_va_start
+
+#undef TARGET_PROMOTE_FUNCTION_MODE
+#define TARGET_PROMOTE_FUNCTION_MODE loongarch_promote_function_mode
+#undef TARGET_RETURN_IN_MEMORY
+#define TARGET_RETURN_IN_MEMORY loongarch_return_in_memory
+
+#undef TARGET_FUNCTION_VALUE
+#define TARGET_FUNCTION_VALUE loongarch_function_value
+#undef TARGET_LIBCALL_VALUE
+#define TARGET_LIBCALL_VALUE loongarch_libcall_value
+
+#undef TARGET_ASM_OUTPUT_MI_THUNK
+#define TARGET_ASM_OUTPUT_MI_THUNK loongarch_output_mi_thunk
+#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
+#define TARGET_ASM_CAN_OUTPUT_MI_THUNK \
+  hook_bool_const_tree_hwi_hwi_const_tree_true
+
+#undef TARGET_PRINT_OPERAND
+#define TARGET_PRINT_OPERAND loongarch_print_operand
+#undef TARGET_PRINT_OPERAND_ADDRESS
+#define TARGET_PRINT_OPERAND_ADDRESS loongarch_print_operand_address
+#undef TARGET_PRINT_OPERAND_PUNCT_VALID_P
+#define TARGET_PRINT_OPERAND_PUNCT_VALID_P \
+  loongarch_print_operand_punct_valid_p
+
+#undef TARGET_SETUP_INCOMING_VARARGS
+#define TARGET_SETUP_INCOMING_VARARGS loongarch_setup_incoming_varargs
+#undef TARGET_STRICT_ARGUMENT_NAMING
+#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true
+#undef TARGET_MUST_PASS_IN_STACK
+#define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size
+#undef TARGET_PASS_BY_REFERENCE
+#define TARGET_PASS_BY_REFERENCE loongarch_pass_by_reference
+#undef TARGET_ARG_PARTIAL_BYTES
+#define TARGET_ARG_PARTIAL_BYTES loongarch_arg_partial_bytes
+#undef TARGET_FUNCTION_ARG
+#define TARGET_FUNCTION_ARG loongarch_function_arg
+#undef TARGET_FUNCTION_ARG_ADVANCE
+#define TARGET_FUNCTION_ARG_ADVANCE loongarch_function_arg_advance
+#undef TARGET_FUNCTION_ARG_BOUNDARY
+#define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
+
+#undef TARGET_VECTOR_MODE_SUPPORTED_P
+#define TARGET_VECTOR_MODE_SUPPORTED_P loongarch_vector_mode_supported_p
+
+#undef TARGET_SCALAR_MODE_SUPPORTED_P
+#define TARGET_SCALAR_MODE_SUPPORTED_P loongarch_scalar_mode_supported_p
+
+#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
+#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE loongarch_preferred_simd_mode
+
+#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES
+#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES \
+  loongarch_autovectorize_vector_modes
+
+#undef TARGET_INIT_BUILTINS
 #define TARGET_INIT_BUILTINS loongarch_init_builtins
 #undef TARGET_BUILTIN_DECL
 #define TARGET_BUILTIN_DECL loongarch_builtin_decl
@@ -6998,6 +8878,14 @@ loongarch_set_handled_components (sbitmap components)
 
 #undef TARGET_MAX_ANCHOR_OFFSET
 #define TARGET_MAX_ANCHOR_OFFSET (IMM_REACH/2-1)
+#undef TARGET_VECTORIZE_VEC_PERM_CONST
+#define TARGET_VECTORIZE_VEC_PERM_CONST loongarch_vectorize_vec_perm_const
+
+#undef TARGET_SCHED_REASSOCIATION_WIDTH
+#define TARGET_SCHED_REASSOCIATION_WIDTH loongarch_sched_reassociation_width
+
+#undef TARGET_CASE_VALUES_THRESHOLD
+#define TARGET_CASE_VALUES_THRESHOLD loongarch_case_values_threshold
 
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV loongarch_atomic_assign_expand_fenv
@@ -7016,6 +8904,10 @@ loongarch_set_handled_components (sbitmap components)
 #undef TARGET_MODES_TIEABLE_P
 #define TARGET_MODES_TIEABLE_P loongarch_modes_tieable_p
 
+#undef TARGET_HARD_REGNO_CALL_PART_CLOBBERED
+#define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
+  loongarch_hard_regno_call_part_clobbered
+
 #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
 #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 2
 
@@ -7066,6 +8958,10 @@ loongarch_set_handled_components (sbitmap components)
 #define TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS \
   loongarch_set_handled_components
 
+#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
+#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
+  loongarch_builtin_support_vector_misalignment
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-loongarch.h"
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index eca723293a1..e939dd826d1 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -23,6 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "config/loongarch/loongarch-opts.h"
 
+#define TARGET_SUPPORTS_WIDE_INT 1
+
 /* Macros to silence warnings about numbers being signed in traditional
    C and unsigned in ISO C when compiled on 32-bit hosts.  */
 
@@ -179,6 +181,11 @@ along with GCC; see the file COPYING3.  If not see
 #define MIN_UNITS_PER_WORD 4
 #endif
 
+/* Width of a LSX vector register in bytes.  */
+#define UNITS_PER_LSX_REG 16
+/* Width of a LSX vector register in bits.  */
+#define BITS_PER_LSX_REG (UNITS_PER_LSX_REG * BITS_PER_UNIT)
+
 /* For LARCH, width of a floating point register.  */
 #define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4)
 
@@ -241,8 +248,10 @@ along with GCC; see the file COPYING3.  If not see
 #define STRUCTURE_SIZE_BOUNDARY 8
 
 /* There is no point aligning anything to a rounder boundary than
-   LONG_DOUBLE_TYPE_SIZE.  */
-#define BIGGEST_ALIGNMENT (LONG_DOUBLE_TYPE_SIZE)
+   LONG_DOUBLE_TYPE_SIZE, except that with LSX the biggest alignment is
+   BITS_PER_LSX_REG.  */
+#define BIGGEST_ALIGNMENT \
+  (ISA_HAS_LSX ? BITS_PER_LSX_REG : LONG_DOUBLE_TYPE_SIZE)
 
 /* All accesses must be aligned.  */
 #define STRICT_ALIGNMENT (TARGET_STRICT_ALIGN)
@@ -378,6 +387,9 @@ along with GCC; see the file COPYING3.  If not see
 #define FP_REG_FIRST 32
 #define FP_REG_LAST 63
 #define FP_REG_NUM (FP_REG_LAST - FP_REG_FIRST + 1)
+#define LSX_REG_FIRST FP_REG_FIRST
+#define LSX_REG_LAST  FP_REG_LAST
+#define LSX_REG_NUM   FP_REG_NUM
 
 /* The DWARF 2 CFA column which tracks the return address from a
    signal handler context.  This means that to maintain backwards
@@ -395,8 +407,11 @@ along with GCC; see the file COPYING3.  If not see
   ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM)
 #define FCC_REG_P(REGNO) \
   ((unsigned int) ((int) (REGNO) - FCC_REG_FIRST) < FCC_REG_NUM)
+#define LSX_REG_P(REGNO) \
+  ((unsigned int) ((int) (REGNO) - LSX_REG_FIRST) < LSX_REG_NUM)
 
 #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
+#define LSX_REG_RTX_P(X) (REG_P (X) && LSX_REG_P (REGNO (X)))
 
 /* Select a register mode required for caller save of hard regno REGNO.  */
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
@@ -577,6 +592,11 @@ enum reg_class
 #define IMM12_OPERAND(VALUE) \
   ((unsigned HOST_WIDE_INT) (VALUE) + IMM_REACH / 2 < IMM_REACH)
 
+/* True if VALUE is a signed 13-bit number.  */
+
+#define IMM13_OPERAND(VALUE) \
+  ((unsigned HOST_WIDE_INT) (VALUE) + 0x1000 < 0x2000)
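+/* For example, -4096 and 4095 pass the check above, while -4097 and 4096
+   do not (the biased value falls outside [0, 0x2000)).  */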
+
 /* True if VALUE is a signed 16-bit number.  */
 
 #define IMM16_OPERAND(VALUE) \
@@ -706,6 +726,13 @@ enum reg_class
 #define FP_ARG_FIRST (FP_REG_FIRST + 0)
 #define FP_ARG_LAST (FP_ARG_FIRST + MAX_ARGS_IN_REGISTERS - 1)
 
+/* True if MODE is a vector mode supported in an LSX vector register.  */
+#define LSX_SUPPORTED_MODE_P(MODE)			\
+  (ISA_HAS_LSX						\
+   && GET_MODE_SIZE (MODE) == UNITS_PER_LSX_REG		\
+   && (GET_MODE_CLASS (MODE) == MODE_VECTOR_INT		\
+       || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT))
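+/* With ISA_HAS_LSX the modes accepted above are V16QI, V8HI, V4SI, V2DI,
+   V4SF and V2DF.  */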
+
 /* 1 if N is a possible register number for function argument passing.
    We have no FP argument registers when soft-float.  */
 
@@ -926,7 +953,39 @@ typedef struct {
   { "s7",	30 + GP_REG_FIRST },					\
   { "s8",	31 + GP_REG_FIRST },					\
   { "v0",	 4 + GP_REG_FIRST },					\
-  { "v1",	 5 + GP_REG_FIRST }					\
+  { "v1",	 5 + GP_REG_FIRST },					\
+  { "vr0",	 0 + FP_REG_FIRST },					\
+  { "vr1",	 1 + FP_REG_FIRST },					\
+  { "vr2",	 2 + FP_REG_FIRST },					\
+  { "vr3",	 3 + FP_REG_FIRST },					\
+  { "vr4",	 4 + FP_REG_FIRST },					\
+  { "vr5",	 5 + FP_REG_FIRST },					\
+  { "vr6",	 6 + FP_REG_FIRST },					\
+  { "vr7",	 7 + FP_REG_FIRST },					\
+  { "vr8",	 8 + FP_REG_FIRST },					\
+  { "vr9",	 9 + FP_REG_FIRST },					\
+  { "vr10",	10 + FP_REG_FIRST },					\
+  { "vr11",	11 + FP_REG_FIRST },					\
+  { "vr12",	12 + FP_REG_FIRST },					\
+  { "vr13",	13 + FP_REG_FIRST },					\
+  { "vr14",	14 + FP_REG_FIRST },					\
+  { "vr15",	15 + FP_REG_FIRST },					\
+  { "vr16",	16 + FP_REG_FIRST },					\
+  { "vr17",	17 + FP_REG_FIRST },					\
+  { "vr18",	18 + FP_REG_FIRST },					\
+  { "vr19",	19 + FP_REG_FIRST },					\
+  { "vr20",	20 + FP_REG_FIRST },					\
+  { "vr21",	21 + FP_REG_FIRST },					\
+  { "vr22",	22 + FP_REG_FIRST },					\
+  { "vr23",	23 + FP_REG_FIRST },					\
+  { "vr24",	24 + FP_REG_FIRST },					\
+  { "vr25",	25 + FP_REG_FIRST },					\
+  { "vr26",	26 + FP_REG_FIRST },					\
+  { "vr27",	27 + FP_REG_FIRST },					\
+  { "vr28",	28 + FP_REG_FIRST },					\
+  { "vr29",	29 + FP_REG_FIRST },					\
+  { "vr30",	30 + FP_REG_FIRST },					\
+  { "vr31",	31 + FP_REG_FIRST }					\
 }
 
 /* Globalizing directive for a label.  */
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index b37e070660f..7b8978e2533 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -158,11 +158,12 @@ (define_attr "move_type"
    const,signext,pick_ins,logical,arith,sll0,andi,shift_shift"
   (const_string "unknown"))
 
-(define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor"
+(define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor,simd_add"
   (const_string "unknown"))
 
 ;; Main data type used by the insn
-(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC"
+(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC,
+  V2DI,V4SI,V8HI,V16QI,V2DF,V4SF"
   (const_string "unknown"))
 
 ;; True if the main data type is twice the size of a word.
@@ -234,7 +235,12 @@ (define_attr "type"
    prefetch,prefetchx,condmove,mgtf,mftg,const,arith,logical,
    shift,slt,signext,clz,trap,imul,idiv,move,
    fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,flogb,fneg,fcmp,fcopysign,fcvt,
-   fscaleb,fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost"
+   fscaleb,fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost,
+   simd_div,simd_fclass,simd_flog2,simd_fadd,simd_fcvt,simd_fmul,simd_fmadd,
+   simd_fdiv,simd_bitins,simd_bitmov,simd_insert,simd_sld,simd_mul,simd_fcmp,
+   simd_fexp2,simd_int_arith,simd_bit,simd_shift,simd_splat,simd_fill,
+   simd_permute,simd_shf,simd_sat,simd_pcnt,simd_copy,simd_branch,simd_clsx,
+   simd_fminmax,simd_logic,simd_move,simd_load,simd_store"
   (cond [(eq_attr "jirl" "!unset") (const_string "call")
 	 (eq_attr "got" "load") (const_string "load")
 
@@ -414,11 +420,20 @@ (define_mode_attr ifmt [(SI "w") (DI "l")])
 
 ;; This attribute gives the upper-case mode name for one unit of a
 ;; floating-point mode or vector mode.
-(define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
+(define_mode_attr UNITMODE [(SF "SF") (DF "DF") (V2SF "SF") (V4SF "SF")
+			    (V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI")
+			    (V2DF "DF")])
+
+;; As above, but in lower case.
+(define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf")
+			    (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi")
+			    (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")])
 
 ;; This attribute gives the integer mode that has half the size of
 ;; the controlling mode.
-(define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
+(define_mode_attr HALFMODE [(DF "SI") (DI "SI") (V2SF "SI")
+			    (V2SI "SI") (V4HI "SI") (V8QI "SI")
+			    (TF "DI")])
 
 ;; This attribute gives the integer mode that has the same size of a
 ;; floating-point mode.
@@ -445,6 +460,18 @@ (define_code_iterator neg_bitwise [and ior])
 ;; from the same template.
 (define_code_iterator any_div [div udiv mod umod])
 
+;; This code iterator allows addition and subtraction to be generated
+;; from the same template.
+(define_code_iterator addsub [plus minus])
+
+;; This code iterator allows addition and multiplication to be generated
+;; from the same template.
+(define_code_iterator addmul [plus mult])
+
+;; This code iterator allows addition, subtraction and multiplication to be
+;; generated from the same template.
+(define_code_iterator addsubmul [plus minus mult])
+
 ;; This code iterator allows all native floating-point comparisons to be
 ;; generated from the same template.
 (define_code_iterator fcond [unordered uneq unlt unle eq lt le
@@ -684,7 +711,6 @@ (define_insn "sub<mode>3"
   [(set_attr "alu_type" "sub")
    (set_attr "mode" "<MODE>")])
 
-
 (define_insn "*subsi3_extended"
   [(set (match_operand:DI 0 "register_operand" "= r")
 	(sign_extend:DI
@@ -1228,7 +1254,7 @@ (define_insn "smina<mode>3"
   "fmina.<fmt>\t%0,%1,%2"
   [(set_attr "type" "fmove")
    (set_attr "mode" "<MODE>")])
-\f
+
 ;;
 ;;  ....................
 ;;
@@ -2541,7 +2567,6 @@ (define_insn "rotr<mode>3"
   [(set_attr "type" "shift,shift")
    (set_attr "mode" "<MODE>")])
 
-\f
 ;; The following templates were added to generate "bstrpick.d + alsl.d"
 ;; instruction pairs.
 ;; It is required that the values of const_immalsl_operand and
@@ -3606,6 +3631,9 @@ (define_insn "loongarch_crcc_w_<size>_w"
 (include "generic.md")
 (include "la464.md")
 
+;; The LoongArch SX Instructions.
+(include "lsx.md")
+
 (define_c_enum "unspec" [
   UNSPEC_ADDRESS_FIRST
 ])
diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt
index 8be2b0bcb4f..b06b58b5ba2 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -153,6 +153,10 @@ mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST	Set the cost of branches to roughly COST instructions.
 
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) IntegerRange(1, 5)
+-mmemvec-cost=COST	Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
new file mode 100644
index 00000000000..fb4d228ba84
--- /dev/null
+++ b/gcc/config/loongarch/lsx.md
@@ -0,0 +1,4467 @@
+;; Machine Description for LoongArch Loongson SX ASE
+;;
+;; Copyright (C) 2023 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+;;
+
+(define_c_enum "unspec" [
+  UNSPEC_LSX_ABSD_S
+  UNSPEC_LSX_VABSD_U
+  UNSPEC_LSX_VAVG_S
+  UNSPEC_LSX_VAVG_U
+  UNSPEC_LSX_VAVGR_S
+  UNSPEC_LSX_VAVGR_U
+  UNSPEC_LSX_VBITCLR
+  UNSPEC_LSX_VBITCLRI
+  UNSPEC_LSX_VBITREV
+  UNSPEC_LSX_VBITREVI
+  UNSPEC_LSX_VBITSET
+  UNSPEC_LSX_VBITSETI
+  UNSPEC_LSX_BRANCH_V
+  UNSPEC_LSX_BRANCH
+  UNSPEC_LSX_VFCMP_CAF
+  UNSPEC_LSX_VFCLASS
+  UNSPEC_LSX_VFCMP_CUNE
+  UNSPEC_LSX_VFCVT
+  UNSPEC_LSX_VFCVTH
+  UNSPEC_LSX_VFCVTL
+  UNSPEC_LSX_VFLOGB
+  UNSPEC_LSX_VFRECIP
+  UNSPEC_LSX_VFRINT
+  UNSPEC_LSX_VFRSQRT
+  UNSPEC_LSX_VFCMP_SAF
+  UNSPEC_LSX_VFCMP_SEQ
+  UNSPEC_LSX_VFCMP_SLE
+  UNSPEC_LSX_VFCMP_SLT
+  UNSPEC_LSX_VFCMP_SNE
+  UNSPEC_LSX_VFCMP_SOR
+  UNSPEC_LSX_VFCMP_SUEQ
+  UNSPEC_LSX_VFCMP_SULE
+  UNSPEC_LSX_VFCMP_SULT
+  UNSPEC_LSX_VFCMP_SUN
+  UNSPEC_LSX_VFCMP_SUNE
+  UNSPEC_LSX_VFTINT_S
+  UNSPEC_LSX_VFTINT_U
+  UNSPEC_LSX_VSAT_S
+  UNSPEC_LSX_VSAT_U
+  UNSPEC_LSX_VREPLVEI
+  UNSPEC_LSX_VSRAR
+  UNSPEC_LSX_VSRARI
+  UNSPEC_LSX_VSRLR
+  UNSPEC_LSX_VSRLRI
+  UNSPEC_LSX_VSHUF
+  UNSPEC_LSX_VMUH_S
+  UNSPEC_LSX_VMUH_U
+  UNSPEC_LSX_VEXTW_S
+  UNSPEC_LSX_VEXTW_U
+  UNSPEC_LSX_VSLLWIL_S
+  UNSPEC_LSX_VSLLWIL_U
+  UNSPEC_LSX_VSRAN
+  UNSPEC_LSX_VSSRAN_S
+  UNSPEC_LSX_VSSRAN_U
+  UNSPEC_LSX_VSRAIN
+  UNSPEC_LSX_VSRAINS_S
+  UNSPEC_LSX_VSRAINS_U
+  UNSPEC_LSX_VSRARN
+  UNSPEC_LSX_VSRLN
+  UNSPEC_LSX_VSRLRN
+  UNSPEC_LSX_VSSRLRN_U
+  UNSPEC_LSX_VFRSTPI
+  UNSPEC_LSX_VFRSTP
+  UNSPEC_LSX_VSHUF4I
+  UNSPEC_LSX_VBSRL_V
+  UNSPEC_LSX_VBSLL_V
+  UNSPEC_LSX_VEXTRINS
+  UNSPEC_LSX_VMSKLTZ
+  UNSPEC_LSX_VSIGNCOV
+  UNSPEC_LSX_VFTINTRNE
+  UNSPEC_LSX_VFTINTRP
+  UNSPEC_LSX_VFTINTRM
+  UNSPEC_LSX_VFTINT_W_D
+  UNSPEC_LSX_VFFINT_S_L
+  UNSPEC_LSX_VFTINTRZ_W_D
+  UNSPEC_LSX_VFTINTRP_W_D
+  UNSPEC_LSX_VFTINTRM_W_D
+  UNSPEC_LSX_VFTINTRNE_W_D
+  UNSPEC_LSX_VFTINTL_L_S
+  UNSPEC_LSX_VFFINTH_D_W
+  UNSPEC_LSX_VFFINTL_D_W
+  UNSPEC_LSX_VFTINTRZL_L_S
+  UNSPEC_LSX_VFTINTRZH_L_S
+  UNSPEC_LSX_VFTINTRPL_L_S
+  UNSPEC_LSX_VFTINTRPH_L_S
+  UNSPEC_LSX_VFTINTRMH_L_S
+  UNSPEC_LSX_VFTINTRML_L_S
+  UNSPEC_LSX_VFTINTRNEL_L_S
+  UNSPEC_LSX_VFTINTRNEH_L_S
+  UNSPEC_LSX_VFTINTH_L_H
+  UNSPEC_LSX_VFRINTRNE_S
+  UNSPEC_LSX_VFRINTRNE_D
+  UNSPEC_LSX_VFRINTRZ_S
+  UNSPEC_LSX_VFRINTRZ_D
+  UNSPEC_LSX_VFRINTRP_S
+  UNSPEC_LSX_VFRINTRP_D
+  UNSPEC_LSX_VFRINTRM_S
+  UNSPEC_LSX_VFRINTRM_D
+  UNSPEC_LSX_VSSRARN_S
+  UNSPEC_LSX_VSSRARN_U
+  UNSPEC_LSX_VSSRLN_U
+  UNSPEC_LSX_VSSRLN
+  UNSPEC_LSX_VSSRLRN
+  UNSPEC_LSX_VLDI
+  UNSPEC_LSX_VSHUF_B
+  UNSPEC_LSX_VLDX
+  UNSPEC_LSX_VSTX
+  UNSPEC_LSX_VEXTL_QU_DU
+  UNSPEC_LSX_VSETEQZ_V
+  UNSPEC_LSX_VADDWEV
+  UNSPEC_LSX_VADDWEV2
+  UNSPEC_LSX_VADDWEV3
+  UNSPEC_LSX_VADDWOD
+  UNSPEC_LSX_VADDWOD2
+  UNSPEC_LSX_VADDWOD3
+  UNSPEC_LSX_VSUBWEV
+  UNSPEC_LSX_VSUBWEV2
+  UNSPEC_LSX_VSUBWOD
+  UNSPEC_LSX_VSUBWOD2
+  UNSPEC_LSX_VMULWEV
+  UNSPEC_LSX_VMULWEV2
+  UNSPEC_LSX_VMULWEV3
+  UNSPEC_LSX_VMULWOD
+  UNSPEC_LSX_VMULWOD2
+  UNSPEC_LSX_VMULWOD3
+  UNSPEC_LSX_VHADDW_Q_D
+  UNSPEC_LSX_VHADDW_QU_DU
+  UNSPEC_LSX_VHSUBW_Q_D
+  UNSPEC_LSX_VHSUBW_QU_DU
+  UNSPEC_LSX_VMADDWEV
+  UNSPEC_LSX_VMADDWEV2
+  UNSPEC_LSX_VMADDWEV3
+  UNSPEC_LSX_VMADDWOD
+  UNSPEC_LSX_VMADDWOD2
+  UNSPEC_LSX_VMADDWOD3
+  UNSPEC_LSX_VROTR
+  UNSPEC_LSX_VADD_Q
+  UNSPEC_LSX_VSUB_Q
+  UNSPEC_LSX_VEXTH_Q_D
+  UNSPEC_LSX_VEXTH_QU_DU
+  UNSPEC_LSX_VMSKGEZ
+  UNSPEC_LSX_VMSKNZ
+  UNSPEC_LSX_VEXTL_Q_D
+  UNSPEC_LSX_VSRLNI
+  UNSPEC_LSX_VSRLRNI
+  UNSPEC_LSX_VSSRLNI
+  UNSPEC_LSX_VSSRLNI2
+  UNSPEC_LSX_VSSRLRNI
+  UNSPEC_LSX_VSSRLRNI2
+  UNSPEC_LSX_VSRANI
+  UNSPEC_LSX_VSRARNI
+  UNSPEC_LSX_VSSRANI
+  UNSPEC_LSX_VSSRANI2
+  UNSPEC_LSX_VSSRARNI
+  UNSPEC_LSX_VSSRARNI2
+  UNSPEC_LSX_VPERMI
+])
+
+;; This attribute gives suffix for integers of twice the element width.
+(define_mode_attr dlsxfmt
+  [(V2DI "q")
+   (V4SI "d")
+   (V8HI "w")
+   (V16QI "h")])
+
+(define_mode_attr dlsxfmt_u
+  [(V2DI "qu")
+   (V4SI "du")
+   (V8HI "wu")
+   (V16QI "hu")])
+
+(define_mode_attr d2lsxfmt
+  [(V4SI "q")
+   (V8HI "d")
+   (V16QI "w")])
+
+(define_mode_attr d2lsxfmt_u
+  [(V4SI "qu")
+   (V8HI "du")
+   (V16QI "wu")])
+
+;; This attribute gives the vector mode obtained by doubling the element
+;; width twice (capped at DImode elements).
+(define_mode_attr VD2MODE
+  [(V4SI "V2DI")
+   (V8HI "V2DI")
+   (V16QI "V4SI")])
+
+;; All vector modes with 128 bits.
+(define_mode_iterator LSX      [V2DF V4SF V2DI V4SI V8HI V16QI])
+
+;; Same as LSX.  Used by vcond to iterate two modes.
+(define_mode_iterator LSX_2    [V2DF V4SF V2DI V4SI V8HI V16QI])
+
+;; Only used for vilvh and splitting insert_d and copy_{u,s}.d.
+(define_mode_iterator LSX_D    [V2DI V2DF])
+
+;; Only used for copy_{u,s}.w and vilvh.
+(define_mode_iterator LSX_W    [V4SI V4SF])
+
+;; Only integer modes.
+(define_mode_iterator ILSX     [V2DI V4SI V8HI V16QI])
+
+;; As ILSX but excludes V16QI.
+(define_mode_iterator ILSX_DWH [V2DI V4SI V8HI])
+
+;; As LSX but excludes V16QI.
+(define_mode_iterator LSX_DWH  [V2DF V4SF V2DI V4SI V8HI])
+
+;; As ILSX but excludes V2DI.
+(define_mode_iterator ILSX_WHB [V4SI V8HI V16QI])
+
+;; Only integer modes equal or larger than a word.
+(define_mode_iterator ILSX_DW  [V2DI V4SI])
+
+;; Only integer modes smaller than a word.
+(define_mode_iterator ILSX_HB  [V8HI V16QI])
+
+;;;; Only integer modes for fixed-point madd_q/maddr_q.
+;;(define_mode_iterator ILSX_WH  [V4SI V8HI])
+
+;; Only floating-point modes.
+(define_mode_iterator FLSX     [V2DF V4SF])
+
+;; Only used for the immediate-form shuffle-elements instructions.
+(define_mode_iterator LSX_WHB_W [V4SI V8HI V16QI V4SF])
+
+;; The attribute gives the integer vector mode with same size.
+(define_mode_attr VIMODE
+  [(V2DF "V2DI")
+   (V4SF "V4SI")
+   (V2DI "V2DI")
+   (V4SI "V4SI")
+   (V8HI "V8HI")
+   (V16QI "V16QI")])
+
+;; The attribute gives half modes for vector modes.
+(define_mode_attr VHMODE
+  [(V8HI "V16QI")
+   (V4SI "V8HI")
+   (V2DI "V4SI")])
+
+;; The attribute gives double modes for vector modes.
+(define_mode_attr VDMODE
+  [(V2DI "V2DI")
+   (V4SI "V2DI")
+   (V8HI "V4SI")
+   (V16QI "V8HI")])
+
+;; The attribute gives half modes with same number of elements for vector modes.
+(define_mode_attr VTRUNCMODE
+  [(V8HI "V8QI")
+   (V4SI "V4HI")
+   (V2DI "V2SI")])
+
+;; Double-sized vector mode with the same element type ("Vector, Enlarged-MODE").
+(define_mode_attr VEMODE
+  [(V4SF "V8SF")
+   (V4SI "V8SI")
+   (V2DI "V4DI")
+   (V2DF "V4DF")])
+
+;; This attribute gives the mode of the result for "vpickve2gr_b, copy_u_b" etc.
+(define_mode_attr VRES
+  [(V2DF "DF")
+   (V4SF "SF")
+   (V2DI "DI")
+   (V4SI "SI")
+   (V8HI "SI")
+   (V16QI "SI")])
+
+;; Only used with LSX_D iterator.
+(define_mode_attr lsx_d
+  [(V2DI "reg_or_0")
+   (V2DF "register")])
+
+;; As VIMODE, but in lower case (for use in pattern names).
+(define_mode_attr mode_i
+  [(V2DF "v2di")
+   (V4SF "v4si")
+   (V2DI "v2di")
+   (V4SI "v4si")
+   (V8HI "v8hi")
+   (V16QI "v16qi")])
+
+;; This attribute gives suffix for LSX instructions.
+(define_mode_attr lsxfmt
+  [(V2DF "d")
+   (V4SF "w")
+   (V2DI "d")
+   (V4SI "w")
+   (V8HI "h")
+   (V16QI "b")])
+
+;; As above, but for the unsigned variants of the instructions.
+(define_mode_attr lsxfmt_u
+  [(V2DF "du")
+   (V4SF "wu")
+   (V2DI "du")
+   (V4SI "wu")
+   (V8HI "hu")
+   (V16QI "bu")])
+
+;; This attribute gives suffix for integers in VHMODE.
+(define_mode_attr hlsxfmt
+  [(V2DI "w")
+   (V4SI "h")
+   (V8HI "b")])
+
+;; As above, but for the unsigned variants.
+(define_mode_attr hlsxfmt_u
+  [(V2DI "wu")
+   (V4SI "hu")
+   (V8HI "bu")])
+
+;; This attribute gives define_insn suffix for LSX instructions that need
+;; distinction between integer and floating point.
+(define_mode_attr lsxfmt_f
+  [(V2DF "d_f")
+   (V4SF "w_f")
+   (V2DI "d")
+   (V4SI "w")
+   (V8HI "h")
+   (V16QI "b")])
+
+(define_mode_attr flsxfmt_f
+  [(V2DF "d_f")
+   (V4SF "s_f")
+   (V2DI "d")
+   (V4SI "w")
+   (V8HI "h")
+   (V16QI "b")])
+
+(define_mode_attr flsxfmt
+  [(V2DF "d")
+   (V4SF "s")
+   (V2DI "d")
+   (V4SI "s")])
+
+(define_mode_attr flsxfrint
+  [(V2DF "d")
+   (V4SF "s")])
+
+(define_mode_attr ilsxfmt
+  [(V2DF "l")
+   (V4SF "w")])
+
+(define_mode_attr ilsxfmt_u
+  [(V2DF "lu")
+   (V4SF "wu")])
+
+;; This is used to form an immediate operand predicate using
+;; "const_<indeximm>_operand".
+(define_mode_attr indeximm
+  [(V2DF "0_or_1")
+   (V4SF "0_to_3")
+   (V2DI "0_or_1")
+   (V4SI "0_to_3")
+   (V8HI "uimm3")
+   (V16QI "uimm4")])
+
+;; This attribute represents bitmask needed for vec_merge using
+;; "const_<bitmask>_operand".
+(define_mode_attr bitmask
+  [(V2DF "exp_2")
+   (V4SF "exp_4")
+   (V2DI "exp_2")
+   (V4SI "exp_4")
+   (V8HI "exp_8")
+   (V16QI "exp_16")])
+
+;; This attribute is used to form an immediate operand predicate using
+;; "const_<bitimm>_operand".
+(define_mode_attr bitimm
+  [(V16QI "uimm3")
+   (V8HI  "uimm4")
+   (V4SI  "uimm5")
+   (V2DI  "uimm6")])
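+;; Illustrative summary (no new constructs): for V4SI the three attributes
+;; above yield const_0_to_3_operand element indices, const_exp_4_operand
+;; vec_merge masks and uimm5 bit positions for the vbit*.w immediates.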
+
+
+(define_int_iterator FRINT_S [UNSPEC_LSX_VFRINTRP_S
+			    UNSPEC_LSX_VFRINTRZ_S
+			    UNSPEC_LSX_VFRINT
+			    UNSPEC_LSX_VFRINTRM_S])
+
+(define_int_iterator FRINT_D [UNSPEC_LSX_VFRINTRP_D
+			    UNSPEC_LSX_VFRINTRZ_D
+			    UNSPEC_LSX_VFRINT
+			    UNSPEC_LSX_VFRINTRM_D])
+
+(define_int_attr frint_pattern_s
+  [(UNSPEC_LSX_VFRINTRP_S  "ceil")
+   (UNSPEC_LSX_VFRINTRZ_S  "btrunc")
+   (UNSPEC_LSX_VFRINT	   "rint")
+   (UNSPEC_LSX_VFRINTRM_S  "floor")])
+
+(define_int_attr frint_pattern_d
+  [(UNSPEC_LSX_VFRINTRP_D  "ceil")
+   (UNSPEC_LSX_VFRINTRZ_D  "btrunc")
+   (UNSPEC_LSX_VFRINT	   "rint")
+   (UNSPEC_LSX_VFRINTRM_D  "floor")])
+
+(define_int_attr frint_suffix
+  [(UNSPEC_LSX_VFRINTRP_S  "rp")
+   (UNSPEC_LSX_VFRINTRP_D  "rp")
+   (UNSPEC_LSX_VFRINTRZ_S  "rz")
+   (UNSPEC_LSX_VFRINTRZ_D  "rz")
+   (UNSPEC_LSX_VFRINT	   "")
+   (UNSPEC_LSX_VFRINTRM_S  "rm")
+   (UNSPEC_LSX_VFRINTRM_D  "rm")])
+
+(define_expand "vec_init<mode><unitmode>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand:LSX 1 "")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vector_init (operands[0], operands[1]);
+  DONE;
+})
+
+;; vpickev pattern with implicit type conversion.
+(define_insn "vec_pack_trunc_<mode>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(vec_concat:<VHMODE>
+	  (truncate:<VTRUNCMODE>
+	    (match_operand:ILSX_DWH 1 "register_operand" "f"))
+	  (truncate:<VTRUNCMODE>
+	    (match_operand:ILSX_DWH 2 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vpickev.<hlsxfmt>\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "vec_unpacks_hi_v4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(float_extend:V2DF
+	  (vec_select:V2SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_dup 2))))]
+  "ISA_HAS_LSX"
+{
+  operands[2] = loongarch_lsx_vec_parallel_const_half (V4SFmode,
+      true/*high_p*/);
+})
+
+(define_expand "vec_unpacks_lo_v4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(float_extend:V2DF
+	  (vec_select:V2SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_dup 2))))]
+  "ISA_HAS_LSX"
+{
+  operands[2] = loongarch_lsx_vec_parallel_const_half (V4SFmode,
+      false/*high_p*/);
+})
+
+(define_expand "vec_unpacks_hi_<mode>"
+  [(match_operand:<VDMODE> 0 "register_operand")
+   (match_operand:ILSX_WHB 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, true/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacks_lo_<mode>"
+  [(match_operand:<VDMODE> 0 "register_operand")
+   (match_operand:ILSX_WHB 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, false/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacku_hi_<mode>"
+  [(match_operand:<VDMODE> 0 "register_operand")
+   (match_operand:ILSX_WHB 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, true/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacku_lo_<mode>"
+  [(match_operand:<VDMODE> 0 "register_operand")
+   (match_operand:ILSX_WHB 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, false/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_extract<mode><unitmode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILSX 1 "register_operand")
+   (match_operand 2 "const_<indeximm>_operand")]
+  "ISA_HAS_LSX"
+{
+  if (<UNITMODE>mode == QImode || <UNITMODE>mode == HImode)
+    {
+      rtx dest1 = gen_reg_rtx (SImode);
+      emit_insn (gen_lsx_vpickve2gr_<lsxfmt> (dest1, operands[1], operands[2]));
+      emit_move_insn (operands[0],
+		      gen_lowpart (<UNITMODE>mode, dest1));
+    }
+  else
+    emit_insn (gen_lsx_vpickve2gr_<lsxfmt> (operands[0], operands[1], operands[2]));
+  DONE;
+})
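+;; Note on the integer path above: for QImode and HImode elements the
+;; vpickve2gr instruction produces an SImode GPR result, so the expander
+;; extracts into a fresh SImode pseudo and then takes its low part.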
+
+(define_expand "vec_extract<mode><unitmode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:FLSX 1 "register_operand")
+   (match_operand 2 "const_<indeximm>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx temp;
+  HOST_WIDE_INT val = INTVAL (operands[2]);
+
+  if (val == 0)
+    temp = operands[1];
+  else
+    {
+      rtx n = GEN_INT (val * GET_MODE_SIZE (<UNITMODE>mode));
+      temp = gen_reg_rtx (<MODE>mode);
+      emit_insn (gen_lsx_vbsrl_<lsxfmt_f> (temp, operands[1], n));
+    }
+  emit_insn (gen_lsx_vec_extract_<lsxfmt_f> (operands[0], temp));
+  DONE;
+})
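+;; A sketch of the FP path above: extracting element 2 of a V4SF vector
+;; shifts the register right by 2 * 4 bytes via gen_lsx_vbsrl_w_f and then
+;; reads lane 0 as an ordinary SFmode FPR move (the insn_and_split below).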
+
+(define_insn_and_split "lsx_vec_extract_<lsxfmt_f>"
+  [(set (match_operand:<UNITMODE> 0 "register_operand" "=f")
+	(vec_select:<UNITMODE>
+	  (match_operand:FLSX 1 "register_operand" "f")
+	  (parallel [(const_int 0)])))]
+  "ISA_HAS_LSX"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (match_dup 1))]
+{
+  operands[1] = gen_rtx_REG (<UNITMODE>mode, REGNO (operands[1]));
+}
+  [(set_attr "move_type" "fmove")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_expand "vec_set<mode>"
+  [(match_operand:ILSX 0 "register_operand")
+   (match_operand:<UNITMODE> 1 "reg_or_0_operand")
+   (match_operand 2 "const_<indeximm>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lsx_vinsgr2vr_<lsxfmt> (operands[0], operands[1],
+					 operands[0], index));
+  DONE;
+})
+
+(define_expand "vec_set<mode>"
+  [(match_operand:FLSX 0 "register_operand")
+   (match_operand:<UNITMODE> 1 "register_operand")
+   (match_operand 2 "const_<indeximm>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lsx_vextrins_<lsxfmt_f>_scalar (operands[0], operands[1],
+						 operands[0], index));
+  DONE;
+})
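+;; Both vec_set expanders above turn the element index N into the
+;; vec_merge mask 1 << N; e.g. vec_setv4si with index 2 generates
+;; lsx_vinsgr2vr_w with a mask operand of (const_int 4).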
+
+(define_expand "vec_cmp<mode><mode_i>"
+  [(set (match_operand:<VIMODE> 0 "register_operand")
+	(match_operator 1 ""
+	  [(match_operand:LSX 2 "register_operand")
+	   (match_operand:LSX 3 "register_operand")]))]
+  "ISA_HAS_LSX"
+{
+  bool ok = loongarch_expand_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpu<ILSX:mode><mode_i>"
+  [(set (match_operand:<VIMODE> 0 "register_operand")
+	(match_operator 1 ""
+	  [(match_operand:ILSX 2 "register_operand")
+	   (match_operand:ILSX 3 "register_operand")]))]
+  "ISA_HAS_LSX"
+{
+  bool ok = loongarch_expand_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vcondu<LSX:mode><ILSX:mode>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand:LSX 1 "reg_or_m1_operand")
+   (match_operand:LSX 2 "reg_or_0_operand")
+   (match_operator 3 ""
+     [(match_operand:ILSX 4 "register_operand")
+      (match_operand:ILSX 5 "register_operand")])]
+  "ISA_HAS_LSX
+   && (GET_MODE_NUNITS (<LSX:MODE>mode) == GET_MODE_NUNITS (<ILSX:MODE>mode))"
+{
+  loongarch_expand_vec_cond_expr (<LSX:MODE>mode, <LSX:VIMODE>mode, operands);
+  DONE;
+})
+
+(define_expand "vcond<LSX:mode><LSX_2:mode>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand:LSX 1 "reg_or_m1_operand")
+   (match_operand:LSX 2 "reg_or_0_operand")
+   (match_operator 3 ""
+     [(match_operand:LSX_2 4 "register_operand")
+      (match_operand:LSX_2 5 "register_operand")])]
+  "ISA_HAS_LSX
+   && (GET_MODE_NUNITS (<LSX:MODE>mode) == GET_MODE_NUNITS (<LSX_2:MODE>mode))"
+{
+  loongarch_expand_vec_cond_expr (<LSX:MODE>mode, <LSX:VIMODE>mode, operands);
+  DONE;
+})
+
+(define_expand "vcond_mask_<ILSX:mode><ILSX:mode>"
+  [(match_operand:ILSX 0 "register_operand")
+   (match_operand:ILSX 1 "reg_or_m1_operand")
+   (match_operand:ILSX 2 "reg_or_0_operand")
+   (match_operand:ILSX 3 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_cond_mask_expr (<ILSX:MODE>mode,
+				      <ILSX:VIMODE>mode, operands);
+  DONE;
+})
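+;; The vcondu, vcond and vcond_mask expanders above are not open-coded in
+;; RTL; they defer to loongarch_expand_vec_cond_expr and
+;; loongarch_expand_vec_cond_mask_expr in loongarch.cc.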
+
+(define_insn "lsx_vinsgr2vr_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(vec_merge:ILSX
+	  (vec_duplicate:ILSX
+	    (match_operand:<UNITMODE> 1 "reg_or_0_operand" "rJ"))
+	  (match_operand:ILSX 2 "register_operand" "0")
+	  (match_operand 3 "const_<bitmask>_operand" "")))]
+  "ISA_HAS_LSX"
+{
+  if (!TARGET_64BIT && (<MODE>mode == V2DImode || <MODE>mode == V2DFmode))
+    return "#";
+  else
+    return "vinsgr2vr.<lsxfmt>\t%w0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_split
+  [(set (match_operand:LSX_D 0 "register_operand")
+	(vec_merge:LSX_D
+	  (vec_duplicate:LSX_D
+	    (match_operand:<UNITMODE> 1 "<LSX_D:lsx_d>_operand"))
+	  (match_operand:LSX_D 2 "register_operand")
+	  (match_operand 3 "const_<bitmask>_operand")))]
+  "reload_completed && ISA_HAS_LSX && !TARGET_64BIT"
+  [(const_int 0)]
+{
+  loongarch_split_lsx_insert_d (operands[0], operands[2], operands[3], operands[1]);
+  DONE;
+})
+
+(define_insn "lsx_vextrins_<lsxfmt_f>_internal"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(vec_merge:LSX
+	  (vec_duplicate:LSX
+	    (vec_select:<UNITMODE>
+	      (match_operand:LSX 1 "register_operand" "f")
+	      (parallel [(const_int 0)])))
+	  (match_operand:LSX 2 "register_operand" "0")
+	  (match_operand 3 "const_<bitmask>_operand" "")))]
+  "ISA_HAS_LSX"
+  "vextrins.<lsxfmt>\t%w0,%w1,%y3<<4"
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+;; As above, but operand 1 is a scalar rather than an element of a vector.
+(define_insn "lsx_vextrins_<lsxfmt_f>_scalar"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(vec_merge:FLSX
+	  (vec_duplicate:FLSX
+	    (match_operand:<UNITMODE> 1 "register_operand" "f"))
+	  (match_operand:FLSX 2 "register_operand" "0")
+	  (match_operand 3 "const_<bitmask>_operand" "")))]
+  "ISA_HAS_LSX"
+  "vextrins.<lsxfmt>\t%w0,%w1,%y3<<4"
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpickve2gr_<lsxfmt><u>"
+  [(set (match_operand:<VRES> 0 "register_operand" "=r")
+	(any_extend:<VRES>
+	  (vec_select:<UNITMODE>
+	    (match_operand:ILSX_HB 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_<indeximm>_operand" "")]))))]
+  "ISA_HAS_LSX"
+  "vpickve2gr.<lsxfmt><u>\t%0,%w1,%2"
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpickve2gr_<lsxfmt_f><u>"
+  [(set (match_operand:<UNITMODE> 0 "register_operand" "=r")
+	(any_extend:<UNITMODE>
+	  (vec_select:<UNITMODE>
+	    (match_operand:LSX_W 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_<indeximm>_operand" "")]))))]
+  "ISA_HAS_LSX"
+  "vpickve2gr.<lsxfmt><u>\t%0,%w1,%2"
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn_and_split "lsx_vpickve2gr_du"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(vec_select:DI
+	  (match_operand:V2DI 1 "register_operand" "f")
+	  (parallel [(match_operand 2 "const_0_or_1_operand" "")])))]
+  "ISA_HAS_LSX"
+{
+  if (TARGET_64BIT)
+    return "vpickve2gr.du\t%0,%w1,%2";
+  else
+    return "#";
+}
+  "reload_completed && ISA_HAS_LSX && !TARGET_64BIT"
+  [(const_int 0)]
+{
+  loongarch_split_lsx_copy_d (operands[0], operands[1], operands[2],
+			      gen_lsx_vpickve2gr_wu);
+  DONE;
+}
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "V2DI")])
+
+(define_insn_and_split "lsx_vpickve2gr_<lsxfmt_f>"
+  [(set (match_operand:<UNITMODE> 0 "register_operand" "=r")
+	(vec_select:<UNITMODE>
+	  (match_operand:LSX_D 1 "register_operand" "f")
+	  (parallel [(match_operand 2 "const_<indeximm>_operand" "")])))]
+  "ISA_HAS_LSX"
+{
+  if (TARGET_64BIT)
+    return "vpickve2gr.<lsxfmt>\t%0,%w1,%2";
+  else
+    return "#";
+}
+  "reload_completed && ISA_HAS_LSX && !TARGET_64BIT"
+  [(const_int 0)]
+{
+  loongarch_split_lsx_copy_d (operands[0], operands[1], operands[2],
+			      gen_lsx_vpickve2gr_w);
+  DONE;
+}
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_expand "abs<mode>2"
+  [(match_operand:ILSX 0 "register_operand" "=f")
+   (abs:ILSX (match_operand:ILSX 1 "register_operand" "f"))]
+  "ISA_HAS_LSX"
+{
+  /* The expander condition already requires ISA_HAS_LSX, so expand
+     directly to the vabs pattern.  */
+  emit_insn (gen_vabs<mode>2 (operands[0], operands[1]));
+  DONE;
+})
+
+(define_expand "neg<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand")
+	(neg:ILSX (match_operand:ILSX 1 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  emit_insn (gen_vneg<mode>2 (operands[0], operands[1]));
+  DONE;
+})
+
+(define_expand "neg<mode>2"
+  [(set (match_operand:FLSX 0 "register_operand")
+	(neg:FLSX (match_operand:FLSX 1 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  rtx reg = gen_reg_rtx (<MODE>mode);
+  emit_move_insn (reg, CONST0_RTX (<MODE>mode));
+  emit_insn (gen_sub<mode>3 (operands[0], reg, operands[1]));
+  DONE;
+})
+
+(define_expand "lsx_vrepli<mode>"
+  [(match_operand:ILSX 0 "register_operand")
+   (match_operand 1 "const_imm10_operand")]
+  "ISA_HAS_LSX"
+{
+  if (<MODE>mode == V16QImode)
+    operands[1] = GEN_INT (trunc_int_for_mode (INTVAL (operands[1]),
+					       <UNITMODE>mode));
+  emit_move_insn (operands[0],
+		  loongarch_gen_const_int_vector (<MODE>mode, INTVAL (operands[1])));
+  DONE;
+})
+
+(define_expand "vec_perm<mode>"
+ [(match_operand:LSX 0 "register_operand")
+  (match_operand:LSX 1 "register_operand")
+  (match_operand:LSX 2 "register_operand")
+  (match_operand:LSX 3 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_perm (operands[0], operands[1],
+			     operands[2], operands[3]);
+  DONE;
+})
+
+(define_insn "lsx_vshuf_<lsxfmt_f>"
+  [(set (match_operand:LSX_DWH 0 "register_operand" "=f")
+	(unspec:LSX_DWH [(match_operand:LSX_DWH 1 "register_operand" "0")
+			 (match_operand:LSX_DWH 2 "register_operand" "f")
+			 (match_operand:LSX_DWH 3 "register_operand" "f")]
+			UNSPEC_LSX_VSHUF))]
+  "ISA_HAS_LSX"
+  "vshuf.<lsxfmt>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "mov<mode>"
+  [(set (match_operand:LSX 0)
+	(match_operand:LSX 1))]
+  "ISA_HAS_LSX"
+{
+  if (loongarch_legitimize_move (<MODE>mode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:LSX 0)
+	(match_operand:LSX 1))]
+  "ISA_HAS_LSX"
+{
+  if (loongarch_legitimize_move (<MODE>mode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "mov<mode>_lsx"
+  [(set (match_operand:LSX 0 "nonimmediate_operand" "=f,f,R,*r,*f")
+	(match_operand:LSX 1 "move_operand" "fYGYI,R,f,*f,*r"))]
+  "ISA_HAS_LSX"
+{ return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "type" "simd_move,simd_load,simd_store,simd_copy,simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_split
+  [(set (match_operand:LSX 0 "nonimmediate_operand")
+	(match_operand:LSX 1 "move_operand"))]
+  "reload_completed && ISA_HAS_LSX
+   && loongarch_split_move_insn_p (operands[0], operands[1])"
+  [(const_int 0)]
+{
+  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  DONE;
+})
+
+;; Offset load
+(define_expand "lsx_ld_<lsxfmt_f>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq10<lsxfmt>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (operands[0], gen_rtx_MEM (<MODE>mode, addr));
+  DONE;
+})
+
+;; Offset store
+(define_expand "lsx_st_<lsxfmt_f>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq10<lsxfmt>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (gen_rtx_MEM (<MODE>mode, addr), operands[0]);
+  DONE;
+})
+
+;; Integer operations
+(define_insn "add<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f,f")
+	(plus:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_ximm5_operand" "f,Unv5,Uuv5")))]
+  "ISA_HAS_LSX"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "vadd.<lsxfmt>\t%w0,%w1,%w2";
+    case 1:
+      {
+	HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0));
+
+	operands[2] = GEN_INT (-val);
+	return "vsubi.<lsxfmt_u>\t%w0,%w1,%d2";
+      }
+    case 2:
+      return "vaddi.<lsxfmt_u>\t%w0,%w1,%E2";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "alu_type" "simd_add")
+   (set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
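+;; For example (illustrative): adding a splat of -3 to a V8HI operand is
+;; matched by the negative-immediate alternative and emitted as
+;; "vsubi.hu\t%w0,%w1,3", while a splat of 3 uses the last alternative and
+;; becomes "vaddi.hu\t%w0,%w1,3".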
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(minus:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vsub.<lsxfmt>\t%w0,%w1,%w2
+   vsubi.<lsxfmt_u>\t%w0,%w1,%E2"
+  [(set_attr "alu_type" "simd_add")
+   (set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(mult:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		   (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vmul.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmadd_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(plus:ILSX (mult:ILSX (match_operand:ILSX 2 "register_operand" "f")
+			      (match_operand:ILSX 3 "register_operand" "f"))
+		   (match_operand:ILSX 1 "register_operand" "0")))]
+  "ISA_HAS_LSX"
+  "vmadd.<lsxfmt>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmsub_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(minus:ILSX (match_operand:ILSX 1 "register_operand" "0")
+		    (mult:ILSX (match_operand:ILSX 2 "register_operand" "f")
+			       (match_operand:ILSX 3 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vmsub.<lsxfmt>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "div<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(div:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		  (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+{ return loongarch_lsx_output_division ("vdiv.<lsxfmt>\t%w0,%w1,%w2", operands); }
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "udiv<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(udiv:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		   (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+{ return loongarch_lsx_output_division ("vdiv.<lsxfmt_u>\t%w0,%w1,%w2", operands); }
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mod<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(mod:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		  (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+{ return loongarch_lsx_output_division ("vmod.<lsxfmt>\t%w0,%w1,%w2", operands); }
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umod<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(umod:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		   (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+{ return loongarch_lsx_output_division ("vmod.<lsxfmt_u>\t%w0,%w1,%w2", operands); }
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "xor<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f,f")
+	(xor:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  "ISA_HAS_LSX"
+  "@
+   vxor.v\t%w0,%w1,%w2
+   vbitrevi.%v0\t%w0,%w1,%V2
+   vxori.b\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "ior<mode>3"
+  [(set (match_operand:LSX 0 "register_operand" "=f,f,f")
+	(ior:LSX
+	  (match_operand:LSX 1 "register_operand" "f,f,f")
+	  (match_operand:LSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  "ISA_HAS_LSX"
+  "@
+   vor.v\t%w0,%w1,%w2
+   vbitseti.%v0\t%w0,%w1,%V2
+   vori.b\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "and<mode>3"
+  [(set (match_operand:LSX 0 "register_operand" "=f,f,f")
+	(and:LSX
+	  (match_operand:LSX 1 "register_operand" "f,f,f")
+	  (match_operand:LSX 2 "reg_or_vector_same_val_operand" "f,YZ,Urv8")))]
+  "ISA_HAS_LSX"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "vand.v\t%w0,%w1,%w2";
+    case 1:
+      {
+	rtx elt0 = CONST_VECTOR_ELT (operands[2], 0);
+	unsigned HOST_WIDE_INT val = ~UINTVAL (elt0);
+	operands[2] = loongarch_gen_const_int_vector (<MODE>mode, val & (-val));
+	return "vbitclri.%v0\t%w0,%w1,%V2";
+      }
+    case 2:
+      return "vandi.b\t%w0,%w1,%B2";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
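+;; The middle (YZ) alternative above presumably matches a constant with
+;; exactly one zero bit per element: the code complements the element,
+;; isolates its lowest set bit and emits vbitclri with that bit's index,
+;; e.g. V4SI AND a splat of 0xfffffff7 becomes "vbitclri.w" with immediate 3.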
+
+(define_insn "one_cmpl<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(not:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vnor.v\t%w0,%w1,%w1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "TI")])
+
+(define_insn "vlshr<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(lshiftrt:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LSX"
+  "@
+   vsrl.<lsxfmt>\t%w0,%w1,%w2
+   vsrli.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vashr<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(ashiftrt:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LSX"
+  "@
+   vsra.<lsxfmt>\t%w0,%w1,%w2
+   vsrai.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vashl<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(ashift:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LSX"
+  "@
+   vsll.<lsxfmt>\t%w0,%w1,%w2
+   vslli.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; Floating-point operations
+(define_insn "add<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(plus:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		   (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfadd.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(minus:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		    (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfsub.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(mult:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		   (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmul.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fmul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "div<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(div:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		  (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfdiv.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fma<mode>4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(fma:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		  (match_operand:FLSX 2 "register_operand" "f")
+		  (match_operand:FLSX 3 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmadd.<flsxfmt>\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fnma<mode>4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(fma:FLSX (neg:FLSX (match_operand:FLSX 1 "register_operand" "f"))
+		  (match_operand:FLSX 2 "register_operand" "f")
+		  (match_operand:FLSX 3 "register_operand" "0")))]
+  "ISA_HAS_LSX"
+  "vfnmsub.<flsxfmt>\t%w0,%w1,%w2,%w0"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(sqrt:FLSX (match_operand:FLSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfsqrt.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+;; Built-in functions
+(define_insn "lsx_vadda_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(plus:ILSX (abs:ILSX (match_operand:ILSX 1 "register_operand" "f"))
+		   (abs:ILSX (match_operand:ILSX 2 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vadda.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "ssadd<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(ss_plus:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vsadd.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "usadd<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(us_plus:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vsadd.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vabsd_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_ABSD_S))]
+  "ISA_HAS_LSX"
+  "vabsd.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vabsd_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VABSD_U))]
+  "ISA_HAS_LSX"
+  "vabsd.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vavg_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VAVG_S))]
+  "ISA_HAS_LSX"
+  "vavg.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vavg_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VAVG_U))]
+  "ISA_HAS_LSX"
+  "vavg.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vavgr_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VAVGR_S))]
+  "ISA_HAS_LSX"
+  "vavgr.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vavgr_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VAVGR_U))]
+  "ISA_HAS_LSX"
+  "vavgr.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitclr_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VBITCLR))]
+  "ISA_HAS_LSX"
+  "vbitclr.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitclri_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VBITCLRI))]
+  "ISA_HAS_LSX"
+  "vbitclri.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitrev_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VBITREV))]
+  "ISA_HAS_LSX"
+  "vbitrev.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitrevi_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		       (match_operand 2 "const_lsx_branch_operand" "")]
+		     UNSPEC_LSX_VBITREVI))]
+  "ISA_HAS_LSX"
+  "vbitrevi.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitsel_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(ior:ILSX (and:ILSX (not:ILSX
+			      (match_operand:ILSX 3 "register_operand" "f"))
+			    (match_operand:ILSX 1 "register_operand" "f"))
+		  (and:ILSX (match_dup 3)
+			    (match_operand:ILSX 2 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vbitsel.v\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_bitmov")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitseli_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(ior:V16QI (and:V16QI (not:V16QI
+				(match_operand:V16QI 1 "register_operand" "0"))
+			      (match_operand:V16QI 2 "register_operand" "f"))
+		   (and:V16QI (match_dup 1)
+			      (match_operand:V16QI 3 "const_vector_same_val_operand" "Urv8"))))]
+  "ISA_HAS_LSX"
+  "vbitseli.b\t%w0,%w2,%B3"
+  [(set_attr "type" "simd_bitmov")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vbitset_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VBITSET))]
+  "ISA_HAS_LSX"
+  "vbitset.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitseti_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VBITSETI))]
+  "ISA_HAS_LSX"
+  "vbitseti.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_code_iterator ICC [eq le leu lt ltu])
+
+(define_code_attr icc
+  [(eq  "eq")
+   (le  "le")
+   (leu "le")
+   (lt  "lt")
+   (ltu "lt")])
+
+(define_code_attr icci
+  [(eq  "eqi")
+   (le  "lei")
+   (leu "lei")
+   (lt  "lti")
+   (ltu "lti")])
+
+(define_code_attr cmpi
+  [(eq   "s")
+   (le   "s")
+   (leu  "u")
+   (lt   "s")
+   (ltu  "u")])
+
+(define_code_attr cmpi_1
+  [(eq   "")
+   (le   "")
+   (leu  "u")
+   (lt   "")
+   (ltu  "u")])
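+
+;; Reading guide for the pattern below: <icc>/<icci> select the register
+;; and immediate mnemonics (vseq/vseqi, vsle/vslei, vslt/vslti), <cmpi>
+;; picks the signed or unsigned immediate predicate, and <cmpi_1> appends
+;; the "u" suffix for the unsigned comparisons.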
+
+(define_insn "lsx_vs<ICC:icc>_<ILSX:lsxfmt><ICC:cmpi_1>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(ICC:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_<ICC:cmpi>imm5_operand" "f,U<ICC:cmpi>v5")))]
+  "ISA_HAS_LSX"
+  "@
+   vs<ICC:icc>.<ILSX:lsxfmt><ICC:cmpi_1>\t%w0,%w1,%w2
+   vs<ICC:icci>.<ILSX:lsxfmt><ICC:cmpi_1>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfclass_<flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")]
+			 UNSPEC_LSX_VFCLASS))]
+  "ISA_HAS_LSX"
+  "vfclass.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fclass")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfcmp_caf_<flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")
+			  (match_operand:FLSX 2 "register_operand" "f")]
+			 UNSPEC_LSX_VFCMP_CAF))]
+  "ISA_HAS_LSX"
+  "vfcmp.caf.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfcmp_cune_<FLSX:flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")
+			  (match_operand:FLSX 2 "register_operand" "f")]
+			 UNSPEC_LSX_VFCMP_CUNE))]
+  "ISA_HAS_LSX"
+  "vfcmp.cune.<FLSX:flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_code_iterator vfcond [unordered ordered eq ne le lt uneq unle unlt])
+
+(define_code_attr fcc
+  [(unordered "cun")
+   (ordered   "cor")
+   (eq	      "ceq")
+   (ne	      "cne")
+   (uneq      "cueq")
+   (unle      "cule")
+   (unlt      "cult")
+   (le	      "cle")
+   (lt	      "clt")])
+
+(define_int_iterator FSC_UNS [UNSPEC_LSX_VFCMP_SAF UNSPEC_LSX_VFCMP_SUN UNSPEC_LSX_VFCMP_SOR
+			      UNSPEC_LSX_VFCMP_SEQ UNSPEC_LSX_VFCMP_SNE UNSPEC_LSX_VFCMP_SUEQ
+			      UNSPEC_LSX_VFCMP_SUNE UNSPEC_LSX_VFCMP_SULE UNSPEC_LSX_VFCMP_SULT
+			      UNSPEC_LSX_VFCMP_SLE UNSPEC_LSX_VFCMP_SLT])
+
+(define_int_attr fsc
+  [(UNSPEC_LSX_VFCMP_SAF  "saf")
+   (UNSPEC_LSX_VFCMP_SUN  "sun")
+   (UNSPEC_LSX_VFCMP_SOR  "sor")
+   (UNSPEC_LSX_VFCMP_SEQ  "seq")
+   (UNSPEC_LSX_VFCMP_SNE  "sne")
+   (UNSPEC_LSX_VFCMP_SUEQ "sueq")
+   (UNSPEC_LSX_VFCMP_SUNE "sune")
+   (UNSPEC_LSX_VFCMP_SULE "sule")
+   (UNSPEC_LSX_VFCMP_SULT "sult")
+   (UNSPEC_LSX_VFCMP_SLE  "sle")
+   (UNSPEC_LSX_VFCMP_SLT  "slt")])
+
+(define_insn "lsx_vfcmp_<vfcond:fcc>_<FLSX:flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(vfcond:<VIMODE> (match_operand:FLSX 1 "register_operand" "f")
+			 (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfcmp.<vfcond:fcc>.<FLSX:flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfcmp_<fsc>_<FLSX:flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")
+			  (match_operand:FLSX 2 "register_operand" "f")]
+			 FSC_UNS))]
+  "ISA_HAS_LSX"
+  "vfcmp.<fsc>.<FLSX:flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
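+;; Summary of the two vfcmp patterns above: the quiet "c"-prefixed
+;; conditions map directly onto RTL comparison codes via <vfcond:fcc>,
+;; while caf, cune and the signaling "s"-prefixed conditions stay behind
+;; unspecs, presumably because RTL has no exact equivalent for them.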
+
+(define_mode_attr fint
+  [(V4SF "v4si")
+   (V2DF "v2di")])
+
+(define_mode_attr FINTCNV
+  [(V4SF "I2S")
+   (V2DF "I2D")])
+
+(define_mode_attr FINTCNV_2
+  [(V4SF "S2I")
+   (V2DF "D2I")])
+
+(define_insn "float<fint><FLSX:mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(float:FLSX (match_operand:<VIMODE> 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vffint.<flsxfmt>.<ilsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "floatuns<fint><FLSX:mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unsigned_float:FLSX
+	  (match_operand:<VIMODE> 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vffint.<flsxfmt>.<ilsxfmt_u>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV>")
+   (set_attr "mode" "<MODE>")])
+
+(define_mode_attr FFQ
+  [(V4SF "V8HI")
+   (V2DF "V4SI")])
+
+(define_insn "lsx_vreplgr2vr_<lsxfmt_f>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(vec_duplicate:ILSX
+	  (match_operand:<UNITMODE> 1 "reg_or_0_operand" "r,J")))]
+  "ISA_HAS_LSX"
+{
+  if (which_alternative == 1)
+    return "ldi.<lsxfmt>\t%w0,0";
+
+  if (!TARGET_64BIT && (<MODE>mode == V2DImode || <MODE>mode == V2DFmode))
+    return "#";
+  else
+    return "vreplgr2vr.<lsxfmt>\t%w0,%z1";
+}
+  [(set_attr "type" "simd_fill")
+   (set_attr "mode" "<MODE>")])
+
+(define_split
+  [(set (match_operand:LSX_D 0 "register_operand")
+	(vec_duplicate:LSX_D
+	  (match_operand:<UNITMODE> 1 "register_operand")))]
+  "reload_completed && ISA_HAS_LSX && !TARGET_64BIT"
+  [(const_int 0)]
+{
+  loongarch_split_lsx_fill_d (operands[0], operands[1]);
+  DONE;
+})
+
+(define_insn "logb<mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFLOGB))]
+  "ISA_HAS_LSX"
+  "vflogb.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_flog2")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(smax:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		   (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmax.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfmaxa_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(if_then_else:FLSX
+	   (gt (abs:FLSX (match_operand:FLSX 1 "register_operand" "f"))
+	       (abs:FLSX (match_operand:FLSX 2 "register_operand" "f")))
+	   (match_dup 1)
+	   (match_dup 2)))]
+  "ISA_HAS_LSX"
+  "vfmaxa.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(smin:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		   (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmin.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfmina_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(if_then_else:FLSX
+	   (lt (abs:FLSX (match_operand:FLSX 1 "register_operand" "f"))
+	       (abs:FLSX (match_operand:FLSX 2 "register_operand" "f")))
+	   (match_dup 1)
+	   (match_dup 2)))]
+  "ISA_HAS_LSX"
+  "vfmina.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrecip_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRECIP))]
+  "ISA_HAS_LSX"
+  "vfrecip.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrint_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINT))]
+  "ISA_HAS_LSX"
+  "vfrint.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrsqrt_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRSQRT))]
+  "ISA_HAS_LSX"
+  "vfrsqrt.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vftint_s_<ilsxfmt>_<flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")]
+			 UNSPEC_LSX_VFTINT_S))]
+  "ISA_HAS_LSX"
+  "vftint.<ilsxfmt>.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vftint_u_<ilsxfmt_u>_<flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")]
+			 UNSPEC_LSX_VFTINT_U))]
+  "ISA_HAS_LSX"
+  "vftint.<ilsxfmt_u>.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fix_trunc<FLSX:mode><mode_i>2"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(fix:<VIMODE> (match_operand:FLSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vftintrz.<ilsxfmt>.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fixuns_trunc<FLSX:mode><mode_i>2"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unsigned_fix:<VIMODE> (match_operand:FLSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vftintrz.<ilsxfmt_u>.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vh<optab>w_h<u>_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addsub:V8HI
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LSX"
+  "vh<optab>w.h<u>.b<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vh<optab>w_w<u>_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addsub:V4SI
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LSX"
+  "vh<optab>w.w<u>.h<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vh<optab>w_d<u>_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addsub:V2DI
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)])))
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)])))))]
+  "ISA_HAS_LSX"
+  "vh<optab>w.d<u>.w<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
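+;; The three vh<optab>w patterns above spell the lane selection out
+;; explicitly: operand 1 supplies the odd-indexed elements and operand 2
+;; the even-indexed ones, each widened by any_extend before the
+;; addition or subtraction.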
+
+(define_insn "lsx_vpackev_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(vec_select:V16QI
+	  (vec_concat:V32QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (match_operand:V16QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0)  (const_int 16)
+		     (const_int 2)  (const_int 18)
+		     (const_int 4)  (const_int 20)
+		     (const_int 6)  (const_int 22)
+		     (const_int 8)  (const_int 24)
+		     (const_int 10) (const_int 26)
+		     (const_int 12) (const_int 28)
+		     (const_int 14) (const_int 30)])))]
+  "ISA_HAS_LSX"
+  "vpackev.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vpackev_h"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(vec_select:V8HI
+	  (vec_concat:V16HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (match_operand:V8HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 2) (const_int 10)
+		     (const_int 4) (const_int 12)
+		     (const_int 6) (const_int 14)])))]
+  "ISA_HAS_LSX"
+  "vpackev.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vpackev_w"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(vec_select:V4SI
+	  (vec_concat:V8SI
+	    (match_operand:V4SI 1 "register_operand" "f")
+	    (match_operand:V4SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 2) (const_int 6)])))]
+  "ISA_HAS_LSX"
+  "vpackev.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vpackev_w_f"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(vec_select:V4SF
+	  (vec_concat:V8SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_operand:V4SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 2) (const_int 6)])))]
+  "ISA_HAS_LSX"
+  "vpackev.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vilvh_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(vec_select:V16QI
+	  (vec_concat:V32QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (match_operand:V16QI 2 "register_operand" "f"))
+	  (parallel [(const_int 8)  (const_int 24)
+		     (const_int 9)  (const_int 25)
+		     (const_int 10) (const_int 26)
+		     (const_int 11) (const_int 27)
+		     (const_int 12) (const_int 28)
+		     (const_int 13) (const_int 29)
+		     (const_int 14) (const_int 30)
+		     (const_int 15) (const_int 31)])))]
+  "ISA_HAS_LSX"
+  "vilvh.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vilvh_h"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(vec_select:V8HI
+	  (vec_concat:V16HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (match_operand:V8HI 2 "register_operand" "f"))
+	  (parallel [(const_int 4) (const_int 12)
+		     (const_int 5) (const_int 13)
+		     (const_int 6) (const_int 14)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LSX"
+  "vilvh.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_mode_attr vilvh_suffix
+  [(V4SI "") (V4SF "_f")
+   (V2DI "") (V2DF "_f")])
+
+(define_insn "lsx_vilvh_w<vilvh_suffix>"
+  [(set (match_operand:LSX_W 0 "register_operand" "=f")
+	(vec_select:LSX_W
+	  (vec_concat:<VEMODE>
+	    (match_operand:LSX_W 1 "register_operand" "f")
+	    (match_operand:LSX_W 2 "register_operand" "f"))
+	  (parallel [(const_int 2) (const_int 6)
+		     (const_int 3) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vilvh.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vilvh_d<vilvh_suffix>"
+  [(set (match_operand:LSX_D 0 "register_operand" "=f")
+	(vec_select:LSX_D
+	  (vec_concat:<VEMODE>
+	    (match_operand:LSX_D 1 "register_operand" "f")
+	    (match_operand:LSX_D 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)])))]
+  "ISA_HAS_LSX"
+  "vilvh.d\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpackod_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(vec_select:V16QI
+	  (vec_concat:V32QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (match_operand:V16QI 2 "register_operand" "f"))
+	  (parallel [(const_int 1)  (const_int 17)
+		     (const_int 3)  (const_int 19)
+		     (const_int 5)  (const_int 21)
+		     (const_int 7)  (const_int 23)
+		     (const_int 9)  (const_int 25)
+		     (const_int 11) (const_int 27)
+		     (const_int 13) (const_int 29)
+		     (const_int 15) (const_int 31)])))]
+  "ISA_HAS_LSX"
+  "vpackod.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vpackod_h"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(vec_select:V8HI
+	  (vec_concat:V16HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (match_operand:V8HI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 9)
+		     (const_int 3) (const_int 11)
+		     (const_int 5) (const_int 13)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LSX"
+  "vpackod.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vpackod_w"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(vec_select:V4SI
+	  (vec_concat:V8SI
+	    (match_operand:V4SI 1 "register_operand" "f")
+	    (match_operand:V4SI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 5)
+		     (const_int 3) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vpackod.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vpackod_w_f"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(vec_select:V4SF
+	  (vec_concat:V8SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_operand:V4SF 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 5)
+		     (const_int 3) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vpackod.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vilvl_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(vec_select:V16QI
+	  (vec_concat:V32QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (match_operand:V16QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 16)
+		     (const_int 1) (const_int 17)
+		     (const_int 2) (const_int 18)
+		     (const_int 3) (const_int 19)
+		     (const_int 4) (const_int 20)
+		     (const_int 5) (const_int 21)
+		     (const_int 6) (const_int 22)
+		     (const_int 7) (const_int 23)])))]
+  "ISA_HAS_LSX"
+  "vilvl.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vilvl_h"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(vec_select:V8HI
+	  (vec_concat:V16HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (match_operand:V8HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 1) (const_int 9)
+		     (const_int 2) (const_int 10)
+		     (const_int 3) (const_int 11)])))]
+  "ISA_HAS_LSX"
+  "vilvl.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vilvl_w"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(vec_select:V4SI
+	  (vec_concat:V8SI
+	    (match_operand:V4SI 1 "register_operand" "f")
+	    (match_operand:V4SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 1) (const_int 5)])))]
+  "ISA_HAS_LSX"
+  "vilvl.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vilvl_w_f"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(vec_select:V4SF
+	  (vec_concat:V8SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_operand:V4SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 1) (const_int 5)])))]
+  "ISA_HAS_LSX"
+  "vilvl.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vilvl_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(vec_select:V2DI
+	  (vec_concat:V4DI
+	    (match_operand:V2DI 1 "register_operand" "f")
+	    (match_operand:V2DI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)])))]
+  "ISA_HAS_LSX"
+  "vilvl.d\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vilvl_d_f"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(vec_select:V2DF
+	  (vec_concat:V4DF
+	    (match_operand:V2DF 1 "register_operand" "f")
+	    (match_operand:V2DF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)])))]
+  "ISA_HAS_LSX"
+  "vilvl.d\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V2DF")])
+
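+;; Integer min/max.  The second alternative accepts a vector of identical
+;; 5-bit immediates and uses the vmaxi/vmini immediate forms.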
+(define_insn "smax<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(smax:ILSX (match_operand:ILSX 1 "register_operand" "f,f")
+		   (match_operand:ILSX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vmax.<lsxfmt>\t%w0,%w1,%w2
+   vmaxi.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umax<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(umax:ILSX (match_operand:ILSX 1 "register_operand" "f,f")
+		   (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vmax.<lsxfmt_u>\t%w0,%w1,%w2
+   vmaxi.<lsxfmt_u>\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(smin:ILSX (match_operand:ILSX 1 "register_operand" "f,f")
+		   (match_operand:ILSX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vmin.<lsxfmt>\t%w0,%w1,%w2
+   vmini.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umin<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(umin:ILSX (match_operand:ILSX 1 "register_operand" "f,f")
+		   (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vmin.<lsxfmt_u>\t%w0,%w1,%w2
+   vmini.<lsxfmt_u>\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vclo_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(clz:ILSX (not:ILSX (match_operand:ILSX 1 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vclo.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "clz<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(clz:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vclz.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_nor_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(and:ILSX (not:ILSX (match_operand:ILSX 1 "register_operand" "f,f"))
+		  (not:ILSX (match_operand:ILSX 2 "reg_or_vector_same_val_operand" "f,Urv8"))))]
+  "ISA_HAS_LSX"
+  "@
+   vnor.v\t%w0,%w1,%w2
+   vnori.b\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpickev_b"
+[(set (match_operand:V16QI 0 "register_operand" "=f")
+      (vec_select:V16QI
+	(vec_concat:V32QI
+	  (match_operand:V16QI 1 "register_operand" "f")
+	  (match_operand:V16QI 2 "register_operand" "f"))
+	(parallel [(const_int 0) (const_int 2)
+		   (const_int 4) (const_int 6)
+		   (const_int 8) (const_int 10)
+		   (const_int 12) (const_int 14)
+		   (const_int 16) (const_int 18)
+		   (const_int 20) (const_int 22)
+		   (const_int 24) (const_int 26)
+		   (const_int 28) (const_int 30)])))]
+  "ISA_HAS_LSX"
+  "vpickev.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vpickev_h"
+[(set (match_operand:V8HI 0 "register_operand" "=f")
+      (vec_select:V8HI
+	(vec_concat:V16HI
+	  (match_operand:V8HI 1 "register_operand" "f")
+	  (match_operand:V8HI 2 "register_operand" "f"))
+	(parallel [(const_int 0) (const_int 2)
+		   (const_int 4) (const_int 6)
+		   (const_int 8) (const_int 10)
+		   (const_int 12) (const_int 14)])))]
+  "ISA_HAS_LSX"
+  "vpickev.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vpickev_w"
+[(set (match_operand:V4SI 0 "register_operand" "=f")
+      (vec_select:V4SI
+	(vec_concat:V8SI
+	  (match_operand:V4SI 1 "register_operand" "f")
+	  (match_operand:V4SI 2 "register_operand" "f"))
+	(parallel [(const_int 0) (const_int 2)
+		   (const_int 4) (const_int 6)])))]
+  "ISA_HAS_LSX"
+  "vpickev.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vpickev_w_f"
+[(set (match_operand:V4SF 0 "register_operand" "=f")
+      (vec_select:V4SF
+	(vec_concat:V8SF
+	  (match_operand:V4SF 1 "register_operand" "f")
+	  (match_operand:V4SF 2 "register_operand" "f"))
+	(parallel [(const_int 0) (const_int 2)
+		   (const_int 4) (const_int 6)])))]
+  "ISA_HAS_LSX"
+  "vpickev.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vpickod_b"
+[(set (match_operand:V16QI 0 "register_operand" "=f")
+      (vec_select:V16QI
+	(vec_concat:V32QI
+	  (match_operand:V16QI 1 "register_operand" "f")
+	  (match_operand:V16QI 2 "register_operand" "f"))
+	(parallel [(const_int 1) (const_int 3)
+		   (const_int 5) (const_int 7)
+		   (const_int 9) (const_int 11)
+		   (const_int 13) (const_int 15)
+		   (const_int 17) (const_int 19)
+		   (const_int 21) (const_int 23)
+		   (const_int 25) (const_int 27)
+		   (const_int 29) (const_int 31)])))]
+  "ISA_HAS_LSX"
+  "vpickod.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vpickod_h"
+[(set (match_operand:V8HI 0 "register_operand" "=f")
+      (vec_select:V8HI
+	(vec_concat:V16HI
+	  (match_operand:V8HI 1 "register_operand" "f")
+	  (match_operand:V8HI 2 "register_operand" "f"))
+	(parallel [(const_int 1) (const_int 3)
+		   (const_int 5) (const_int 7)
+		   (const_int 9) (const_int 11)
+		   (const_int 13) (const_int 15)])))]
+  "ISA_HAS_LSX"
+  "vpickod.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vpickod_w"
+[(set (match_operand:V4SI 0 "register_operand" "=f")
+      (vec_select:V4SI
+	(vec_concat:V8SI
+	  (match_operand:V4SI 1 "register_operand" "f")
+	  (match_operand:V4SI 2 "register_operand" "f"))
+	(parallel [(const_int 1) (const_int 3)
+		   (const_int 5) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vpickod.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vpickod_w_f"
+[(set (match_operand:V4SF 0 "register_operand" "=f")
+      (vec_select:V4SF
+	(vec_concat:V8SF
+	  (match_operand:V4SF 1 "register_operand" "f")
+	  (match_operand:V4SF 2 "register_operand" "f"))
+	(parallel [(const_int 1) (const_int 3)
+		   (const_int 5) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vpickod.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "popcount<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(popcount:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vpcnt.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_pcnt")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsat_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VSAT_S))]
+  "ISA_HAS_LSX"
+  "vsat.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_sat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsat_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VSAT_U))]
+  "ISA_HAS_LSX"
+  "vsat.<lsxfmt_u>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_sat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vshuf4i_<lsxfmt_f>"
+  [(set (match_operand:LSX_WHB_W 0 "register_operand" "=f")
+	(vec_select:LSX_WHB_W
+	  (match_operand:LSX_WHB_W 1 "register_operand" "f")
+	  (match_operand 2 "par_const_vector_shf_set_operand" "")))]
+  "ISA_HAS_LSX"
+{
+  HOST_WIDE_INT val = 0;
+  unsigned int i;
+
+  /* We convert the selection to an immediate.  */
+  for (i = 0; i < 4; i++)
+    val |= INTVAL (XVECEXP (operands[2], 0, i)) << (2 * i);
+
+  operands[2] = GEN_INT (val);
+  return "vshuf4i.<lsxfmt>\t%w0,%w1,%X2";
+}
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrar_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSRAR))]
+  "ISA_HAS_LSX"
+  "vsrar.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrari_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VSRARI))]
+  "ISA_HAS_LSX"
+  "vsrari.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrlr_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSRLR))]
+  "ISA_HAS_LSX"
+  "vsrlr.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrlri_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VSRLRI))]
+  "ISA_HAS_LSX"
+  "vsrlri.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssub_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(ss_minus:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vssub.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssub_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(us_minus:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vssub.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vreplve_<lsxfmt_f>"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(vec_duplicate:LSX
+	  (vec_select:<UNITMODE>
+	    (match_operand:LSX 1 "register_operand" "f")
+	    (parallel [(match_operand:SI 2 "register_operand" "r")]))))]
+  "ISA_HAS_LSX"
+  "vreplve.<lsxfmt>\t%w0,%w1,%z2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vreplvei_<lsxfmt_f>"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(vec_duplicate:LSX
+	  (vec_select:<UNITMODE>
+	    (match_operand:LSX 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_<indeximm>_operand" "")]))))]
+  "ISA_HAS_LSX"
+  "vreplvei.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vreplvei_<lsxfmt_f>_scalar"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+      (vec_duplicate:LSX
+	(match_operand:<UNITMODE> 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vreplvei.<lsxfmt>\t%w0,%w1,0"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfcvt_h_s"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(unspec:V8HI [(match_operand:V4SF 1 "register_operand" "f")
+		      (match_operand:V4SF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFCVT))]
+  "ISA_HAS_LSX"
+  "vfcvt.h.s\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vfcvt_s_d"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFCVT))]
+  "ISA_HAS_LSX"
+  "vfcvt.s.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "vec_pack_trunc_v2df"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(vec_concat:V4SF
+	  (float_truncate:V2SF (match_operand:V2DF 1 "register_operand" "f"))
+	  (float_truncate:V2SF (match_operand:V2DF 2 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vfcvt.s.d\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfcvth_s_h"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V8HI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFCVTH))]
+  "ISA_HAS_LSX"
+  "vfcvth.s.h\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfcvth_d_s"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(float_extend:V2DF
+	(vec_select:V2SF
+	  (match_operand:V4SF 1 "register_operand" "f")
+	  (parallel [(const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LSX"
+  "vfcvth.d.s\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vfcvtl_s_h"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V8HI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFCVTL))]
+  "ISA_HAS_LSX"
+  "vfcvtl.s.h\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfcvtl_d_s"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(float_extend:V2DF
+	(vec_select:V2SF
+	  (match_operand:V4SF 1 "register_operand" "f")
+	  (parallel [(const_int 0) (const_int 1)]))))]
+  "ISA_HAS_LSX"
+  "vfcvtl.d.s\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DF")])
+
+(define_code_attr lsxbr
+  [(eq "bz")
+   (ne "bnz")])
+
+(define_code_attr lsxeq_v
+  [(eq "eqz")
+   (ne "nez")])
+
+(define_code_attr lsxne_v
+  [(eq "nez")
+   (ne "eqz")])
+
+(define_code_attr lsxeq
+  [(eq "anyeqz")
+   (ne "allnez")])
+
+(define_code_attr lsxne
+  [(eq "allnez")
+   (ne "anyeqz")])
+
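+;; Branch on a vector test: a vset* compare writes the clobbered FCC
+;; scratch register and a bcnez branches on it; the two asm strings give
+;; the normal and inverted forms of the branch.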
+(define_insn "lsx_<lsxbr>_<lsxfmt_f>"
+ [(set (pc) (if_then_else
+	      (equality_op
+		(unspec:SI [(match_operand:LSX 1 "register_operand" "f")]
+			    UNSPEC_LSX_BRANCH)
+		  (match_operand:SI 2 "const_0_operand"))
+		  (label_ref (match_operand 0))
+		  (pc)))
+      (clobber (match_scratch:FCC 3 "=z"))]
+ "ISA_HAS_LSX"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					 "vset<lsxeq>.<lsxfmt>\t%Z3%w1\n\tbcnez\t%Z3%0",
+					 "vset<lsxne>.<lsxfmt>\t%Z3%w1\n\tbcnez\t%Z3%0");
+}
+ [(set_attr "type" "simd_branch")
+  (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_<lsxbr>_v_<lsxfmt_f>"
+ [(set (pc) (if_then_else
+	      (equality_op
+		(unspec:SI [(match_operand:LSX 1 "register_operand" "f")]
+			    UNSPEC_LSX_BRANCH_V)
+		  (match_operand:SI 2 "const_0_operand"))
+		  (label_ref (match_operand 0))
+		  (pc)))
+      (clobber (match_scratch:FCC 3 "=z"))]
+ "ISA_HAS_LSX"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					 "vset<lsxeq_v>.v\t%Z3%w1\n\tbcnez\t%Z3%0",
+					 "vset<lsxne_v>.v\t%Z3%w1\n\tbcnez\t%Z3%0");
+}
+ [(set_attr "type" "simd_branch")
+  (set_attr "mode" "TI")])
+
+;; vec_concat
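+;; Concatenate two DI values into a V2DI by inserting them into lanes 0
+;; and 1 with vinsgr2vr.d.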
+(define_expand "vec_concatv2di"
+  [(set (match_operand:V2DI 0 "register_operand")
+	(vec_concat:V2DI
+	  (match_operand:DI 1 "register_operand")
+	  (match_operand:DI 2 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  emit_insn (gen_lsx_vinsgr2vr_d (operands[0], operands[1],
+				  operands[0], GEN_INT (0)));
+  emit_insn (gen_lsx_vinsgr2vr_d (operands[0], operands[2],
+				  operands[0], GEN_INT (1)));
+  DONE;
+})
+
+
+(define_insn "vandn<mode>3"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(and:LSX (not:LSX (match_operand:LSX 1 "register_operand" "f"))
+		 (match_operand:LSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vandn.v\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
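+;; Integer abs is open-coded as vsigncov with the input used as both
+;; source operands.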
+(define_insn "vabs<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(abs:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vsigncov.<lsxfmt>\t%w0,%w1,%w1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vneg<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(neg:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vneg.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmuh_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMUH_S))]
+  "ISA_HAS_LSX"
+  "vmuh.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmuh_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMUH_U))]
+  "ISA_HAS_LSX"
+  "vmuh.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vextw_s_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTW_S))]
+  "ISA_HAS_LSX"
+  "vextw_s.d\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vextw_u_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTW_U))]
+  "ISA_HAS_LSX"
+  "vextw_u.d\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vsllwil_s_<dlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VDMODE> 0 "register_operand" "=f")
+	(unspec:<VDMODE> [(match_operand:ILSX_WHB 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSLLWIL_S))]
+  "ISA_HAS_LSX"
+  "vsllwil.<dlsxfmt>.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsllwil_u_<dlsxfmt_u>_<lsxfmt_u>"
+  [(set (match_operand:<VDMODE> 0 "register_operand" "=f")
+	(unspec:<VDMODE> [(match_operand:ILSX_WHB 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSLLWIL_U))]
+  "ISA_HAS_LSX"
+  "vsllwil.<dlsxfmt_u>.<lsxfmt_u>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsran_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSRAN))]
+  "ISA_HAS_LSX"
+  "vsran.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssran_s_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRAN_S))]
+  "ISA_HAS_LSX"
+  "vssran.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssran_u_<hlsxfmt_u>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRAN_U))]
+  "ISA_HAS_LSX"
+  "vssran.<hlsxfmt_u>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrain_<hlsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSRAIN))]
+  "ISA_HAS_LSX"
+  "vsrain.<hlsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; FIXME: bitimm
+(define_insn "lsx_vsrains_s_<hlsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSRAINS_S))]
+  "ISA_HAS_LSX"
+  "vsrains_s.<hlsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; FIXME: bitimm
+(define_insn "lsx_vsrains_u_<hlsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSRAINS_U))]
+  "ISA_HAS_LSX"
+  "vsrains_u.<hlsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrarn_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSRARN))]
+  "ISA_HAS_LSX"
+  "vsrarn.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrarn_s_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRARN_S))]
+  "ISA_HAS_LSX"
+  "vssrarn.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrarn_u_<hlsxfmt_u>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRARN_U))]
+  "ISA_HAS_LSX"
+  "vssrarn.<hlsxfmt_u>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrln_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSRLN))]
+  "ISA_HAS_LSX"
+  "vsrln.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrln_u_<hlsxfmt_u>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRLN_U))]
+  "ISA_HAS_LSX"
+  "vssrln.<hlsxfmt_u>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrlrn_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSRLRN))]
+  "ISA_HAS_LSX"
+  "vsrlrn.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlrn_u_<hlsxfmt_u>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRLRN_U))]
+  "ISA_HAS_LSX"
+  "vssrlrn.<hlsxfmt_u>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrstpi_<lsxfmt>"
+  [(set (match_operand:ILSX_HB 0 "register_operand" "=f")
+	(unspec:ILSX_HB [(match_operand:ILSX_HB 1 "register_operand" "0")
+			 (match_operand:ILSX_HB 2 "register_operand" "f")
+			 (match_operand 3 "const_uimm5_operand" "")]
+			UNSPEC_LSX_VFRSTPI))]
+  "ISA_HAS_LSX"
+  "vfrstpi.<lsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrstp_<lsxfmt>"
+  [(set (match_operand:ILSX_HB 0 "register_operand" "=f")
+	(unspec:ILSX_HB [(match_operand:ILSX_HB 1 "register_operand" "0")
+			 (match_operand:ILSX_HB 2 "register_operand" "f")
+			 (match_operand:ILSX_HB 3 "register_operand" "f")]
+			UNSPEC_LSX_VFRSTP))]
+  "ISA_HAS_LSX"
+  "vfrstp.<lsxfmt>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vshuf4i_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand")]
+		     UNSPEC_LSX_VSHUF4I))]
+  "ISA_HAS_LSX"
+  "vshuf4i.d\t%w0,%w2,%3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vbsrl_<lsxfmt_f>"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(unspec:LSX [(match_operand:LSX 1 "register_operand" "f")
+		     (match_operand 2 "const_uimm5_operand" "")]
+		    UNSPEC_LSX_VBSRL_V))]
+  "ISA_HAS_LSX"
+  "vbsrl.v\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbsll_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_uimm5_operand" "")]
+		     UNSPEC_LSX_VBSLL_V))]
+  "ISA_HAS_LSX"
+  "vbsll.v\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vextrins_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VEXTRINS))]
+  "ISA_HAS_LSX"
+  "vextrins.<lsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmskltz_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VMSKLTZ))]
+  "ISA_HAS_LSX"
+  "vmskltz.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsigncov_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSIGNCOV))]
+  "ISA_HAS_LSX"
+  "vsigncov.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
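+;; copysign is open-coded with a sign-bit mask: clear the sign bit of
+;; operand 1, select the sign bit of operand 2, and OR the two results.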
+(define_expand "copysign<mode>3"
+  [(set (match_dup 4)
+	(and:FLSX
+	  (not:FLSX (match_dup 3))
+	  (match_operand:FLSX 1 "register_operand")))
+   (set (match_dup 5)
+	(and:FLSX (match_dup 3)
+		  (match_operand:FLSX 2 "register_operand")))
+   (set (match_operand:FLSX 0 "register_operand")
+	(ior:FLSX (match_dup 4) (match_dup 5)))]
+  "ISA_HAS_LSX"
+{
+  operands[3] = loongarch_build_signbit_mask (<MODE>mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (<MODE>mode);
+  operands[5] = gen_reg_rtx (<MODE>mode);
+})
+
+(define_insn "absv2df2"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(abs:V2DF (match_operand:V2DF 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vbitclri.d\t%w0,%w1,63"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "absv4sf2"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(abs:V4SF (match_operand:V4SF 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vbitclri.w\t%w0,%w1,31"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V4SF")])
+
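+;; Fused multiply-add family.  The variants below express vfmsub, vfnmsub
+;; and vfnmadd by negating the addend and/or the whole fma result.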
+(define_insn "vfmadd<mode>4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(fma:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		  (match_operand:FLSX 2 "register_operand" "f")
+		  (match_operand:FLSX 3 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmadd.<flsxfmt>\t%w0,%w1,$w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fms<mode>4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(fma:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		  (match_operand:FLSX 2 "register_operand" "f")
+		  (neg:FLSX (match_operand:FLSX 3 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vfmsub.<flsxfmt>\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vfnmsub<mode>4_nmsub4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(neg:FLSX
+	  (fma:FLSX
+	    (match_operand:FLSX 1 "register_operand" "f")
+	    (match_operand:FLSX 2 "register_operand" "f")
+	    (neg:FLSX (match_operand:FLSX 3 "register_operand" "f")))))]
+  "ISA_HAS_LSX"
+  "vfnmsub.<flsxfmt>\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "vfnmadd<mode>4_nmadd4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(neg:FLSX
+	  (fma:FLSX
+	    (match_operand:FLSX 1 "register_operand" "f")
+	    (match_operand:FLSX 2 "register_operand" "f")
+	    (match_operand:FLSX 3 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vfnmadd.<flsxfmt>\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vftintrne_w_s"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNE))]
+  "ISA_HAS_LSX"
+  "vftintrne.w.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrne_l_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNE))]
+  "ISA_HAS_LSX"
+  "vftintrne.l.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrp_w_s"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRP))]
+  "ISA_HAS_LSX"
+  "vftintrp.w.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrp_l_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRP))]
+  "ISA_HAS_LSX"
+  "vftintrp.l.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrm_w_s"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRM))]
+  "ISA_HAS_LSX"
+  "vftintrm.w.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrm_l_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRM))]
+  "ISA_HAS_LSX"
+  "vftintrm.l.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftint_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINT_W_D))]
+  "ISA_HAS_LSX"
+  "vftint.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vffint_s_l"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFFINT_S_L))]
+  "ISA_HAS_LSX"
+  "vffint.s.l\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vftintrz_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRZ_W_D))]
+  "ISA_HAS_LSX"
+  "vftintrz.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrp_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRP_W_D))]
+  "ISA_HAS_LSX"
+  "vftintrp.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrm_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRM_W_D))]
+  "ISA_HAS_LSX"
+  "vftintrm.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrne_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNE_W_D))]
+  "ISA_HAS_LSX"
+  "vftintrne.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftinth_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTH_L_H))]
+  "ISA_HAS_LSX"
+  "vftinth.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintl_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTL_L_S))]
+  "ISA_HAS_LSX"
+  "vftintl.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vffinth_d_w"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V4SI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFFINTH_D_W))]
+  "ISA_HAS_LSX"
+  "vffinth.d.w\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vffintl_d_w"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V4SI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFFINTL_D_W))]
+  "ISA_HAS_LSX"
+  "vffintl.d.w\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vftintrzh_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRZH_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrzh.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrzl_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRZL_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrzl.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrph_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRPH_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrph.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrpl_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRPL_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrpl.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrmh_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRMH_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrmh.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrml_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRML_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrml.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrneh_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNEH_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrneh.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrnel_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNEL_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrnel.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrne_s"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRNE_S))]
+  "ISA_HAS_LSX"
+  "vfrintrne.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrne_d"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRNE_D))]
+  "ISA_HAS_LSX"
+  "vfrintrne.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vfrintrz_s"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRZ_S))]
+  "ISA_HAS_LSX"
+  "vfrintrz.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrz_d"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRZ_D))]
+  "ISA_HAS_LSX"
+  "vfrintrz.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vfrintrp_s"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRP_S))]
+  "ISA_HAS_LSX"
+  "vfrintrp.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrp_d"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRP_D))]
+  "ISA_HAS_LSX"
+  "vfrintrp.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vfrintrm_s"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRM_S))]
+  "ISA_HAS_LSX"
+  "vfrintrm.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrm_d"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRM_D))]
+  "ISA_HAS_LSX"
+  "vfrintrm.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+;; Vector versions of the floating-point frint patterns.
+;; Expands to btrunc, ceil, floor, rint.
+(define_insn "<FRINT_S:frint_pattern_s>v4sf2"
+ [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+			 FRINT_S))]
+  "ISA_HAS_LSX"
+  "vfrint<FRINT_S:frint_suffix>.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "<FRINT_D:frint_pattern_d>v2df2"
+ [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+			 FRINT_D))]
+  "ISA_HAS_LSX"
+  "vfrint<FRINT_D:frint_suffix>.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+;; Expands to round.
+(define_insn "round<mode>2"
+ [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+			 UNSPEC_LSX_VFRINT))]
+  "ISA_HAS_LSX"
+  "vfrint.<flsxfrint>\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; Offset load and broadcast
+(define_expand "lsx_vldrepl_<lsxfmt_f>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12<lsxfmt>_operand")]
+  "ISA_HAS_LSX"
+{
+  emit_insn (gen_lsx_vldrepl_<lsxfmt_f>_insn
+	     (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
+(define_insn "lsx_vldrepl_<lsxfmt_f>_insn"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(vec_duplicate:LSX
+	  (mem:<UNITMODE> (plus:DI (match_operand:DI 1 "register_operand" "r")
+				   (match_operand 2 "aq12<lsxfmt>_operand")))))]
+  "ISA_HAS_LSX"
+{
+    return "vldrepl.<lsxfmt>\t%w0,%1,%2";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+(define_insn "lsx_vldrepl_<lsxfmt_f>_insn_0"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+    (vec_duplicate:LSX
+      (mem:<UNITMODE> (match_operand:DI 1 "register_operand" "r"))))]
+  "ISA_HAS_LSX"
+{
+    return "vldrepl.<lsxfmt>\t%w0,%1,0";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;; Offset store of one element selected by an immediate index
+(define_expand "lsx_vstelm_<lsxfmt_f>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand 3 "const_<indeximm>_operand")
+   (match_operand 2 "aq8<lsxfmt>_operand")
+   (match_operand 1 "pmode_register_operand")]
+  "ISA_HAS_LSX"
+{
+  emit_insn (gen_lsx_vstelm_<lsxfmt_f>_insn
+	     (operands[1], operands[2], operands[0], operands[3]));
+  DONE;
+})
+
+(define_insn "lsx_vstelm_<lsxfmt_f>_insn"
+  [(set (mem:<UNITMODE> (plus:DI (match_operand:DI 0 "register_operand" "r")
+				 (match_operand 1 "aq8<lsxfmt>_operand")))
+	(vec_select:<UNITMODE>
+	  (match_operand:LSX 2 "register_operand" "f")
+	  (parallel [(match_operand 3 "const_<indeximm>_operand" "")])))]
+
+  "ISA_HAS_LSX"
+{
+  return "vstelm.<lsxfmt>\t%w2,%0,%1,%3";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;; Offset is "0"
+(define_insn "lsx_vstelm_<lsxfmt_f>_insn_0"
+  [(set (mem:<UNITMODE> (match_operand:DI 0 "register_operand" "r"))
+    (vec_select:<UNITMODE>
+      (match_operand:LSX 1 "register_operand" "f")
+      (parallel [(match_operand:SI 2 "const_<indeximm>_operand")])))]
+  "ISA_HAS_LSX"
+{
+    return "vstelm.<lsxfmt>\t%w1,%0,0,%2";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
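+;; Whole-vector load/store with an immediate offset: fold the offset into
+;; the address and emit an ordinary vector move.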
+(define_expand "lsx_vld"
+  [(match_operand:V16QI 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12b_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (operands[0], gen_rtx_MEM (V16QImode, addr));
+  DONE;
+})
+
+(define_expand "lsx_vst"
+  [(match_operand:V16QI 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12b_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (gen_rtx_MEM (V16QImode, addr), operands[0]);
+  DONE;
+})
+
+(define_insn "lsx_vssrln_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRLN))]
+  "ISA_HAS_LSX"
+  "vssrln.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "lsx_vssrlrn_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRLRN))]
+  "ISA_HAS_LSX"
+  "vssrlrn.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vorn<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(ior:ILSX (not:ILSX (match_operand:ILSX 2 "register_operand" "f"))
+		  (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vorn.v\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vldi"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand 1 "const_imm13_operand")]
+		    UNSPEC_LSX_VLDI))]
+  "ISA_HAS_LSX"
+{
+  HOST_WIDE_INT val = INTVAL (operands[1]);
+  if (val < 0)
+    {
+      /* For a negative immediate, the mode field extracted below must be
+	 one of the values 0 ~ 12 (0000 ~ 1100).  */
+      HOST_WIDE_INT modeVal = (val & 0xf00) >> 8;
+      if (modeVal < 13)
+	return "vldi\t%w0,%1";
+      else
+	sorry ("imm13 only supports 0000 ~ 1100 in bits 9 ~ 12 when bit 13 is 1");
+      return "#";
+    }
+  else
+    return "vldi\t%w0,%1";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vshuf_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f")
+		       (match_operand:V16QI 2 "register_operand" "f")
+		       (match_operand:V16QI 3 "register_operand" "f")]
+		      UNSPEC_LSX_VSHUF_B))]
+  "ISA_HAS_LSX"
+  "vshuf.b\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vldx"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(unspec:V16QI [(match_operand:DI 1 "register_operand" "r")
+		       (match_operand:DI 2 "reg_or_0_operand" "rJ")]
+		      UNSPEC_LSX_VLDX))]
+  "ISA_HAS_LSX"
+{
+  return "vldx\t%w0,%1,%z2";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vstx"
+  [(set (mem:V16QI (plus:DI (match_operand:DI 1 "register_operand" "r")
+			    (match_operand:DI 2 "reg_or_0_operand" "rJ")))
+	(unspec:V16QI [(match_operand:V16QI 0 "register_operand" "f")]
+		      UNSPEC_LSX_VSTX))]
+
+  "ISA_HAS_LSX"
+{
+  return "vstx\t%w0,%1,%z2";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "DI")])
+
+(define_insn "lsx_vextl_qu_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTL_QU_DU))]
+  "ISA_HAS_LSX"
+  "vextl.qu.du\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vseteqz_v"
+  [(set (match_operand:FCC 0 "register_operand" "=z")
+	(eq:FCC
+	  (unspec:SI [(match_operand:V16QI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VSETEQZ_V)
+	  (match_operand:SI 2 "const_0_operand")))]
+  "ISA_HAS_LSX"
+{
+  return "vseteqz.v\t%0,%1";
+}
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "FCC")])
+
+;; Vector reduction operations
+(define_expand "reduc_plus_scal_v2di"
+  [(match_operand:DI 0 "register_operand")
+   (match_operand:V2DI 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (V2DImode);
+  emit_insn (gen_lsx_vhaddw_q_d (tmp, operands[1], operands[1]));
+  emit_insn (gen_vec_extractv2didi (operands[0], tmp, const0_rtx));
+  DONE;
+})
+
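+;; For V4SI, two widening horizontal adds (vhaddw.d.w followed by
+;; vhaddw.q.d) leave the sum in element 0 of the V2DI temporary.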
+(define_expand "reduc_plus_scal_v4si"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:V4SI 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (V2DImode);
+  rtx tmp1 = gen_reg_rtx (V2DImode);
+  emit_insn (gen_lsx_vhaddw_d_w (tmp, operands[1], operands[1]));
+  emit_insn (gen_lsx_vhaddw_q_d (tmp1, tmp, tmp));
+  emit_insn (gen_vec_extractv4sisi (operands[0], gen_lowpart (V4SImode, tmp1),
+				    const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_plus_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:FLSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_add<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_<optab>_scal_<mode>"
+  [(any_bitwise:<UNITMODE>
+      (match_operand:<UNITMODE> 0 "register_operand")
+      (match_operand:ILSX 1 "register_operand"))]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_<optab><mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smax_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_smax<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smin_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_smin<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umax_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_umax<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umin_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_umin<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
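+;; Widening even/odd operations: the *wev patterns operate on the
+;; even-numbered elements of the sources and the *wod patterns on the
+;; odd-numbered ones, extending each element to double width.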
+(define_insn "lsx_v<optab>wev_d_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addsubmul:V2DI
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)])))
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.d.w<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_v<optab>wev_w_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addsubmul:V4SI
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.w.h<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_v<optab>wev_h_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addsubmul:V8HI
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.h.b<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_v<optab>wod_d_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addsubmul:V2DI
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)])))
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.d.w<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_v<optab>wod_w_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addsubmul:V4SI
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.w.h<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_v<optab>wod_h_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addsubmul:V8HI
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.h.b<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
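The lsx_v<optab>wev_* and lsx_v<optab>wod_* patterns above express the even/odd widening forms of add, subtract and multiply (the addsubmul iterator) directly in RTL: a vec_select of the even or odd lanes, widened by any_extend.  A rough scalar sketch of vaddwev.d.w and vaddwod.d.w (my own illustration, not part of the patch):

  #include <stdint.h>

  /* Even lanes (vaddwev.d.w) and odd lanes (vaddwod.d.w): each selected
     32-bit lane is sign-extended and added into a 64-bit lane.  */
  void
  addw_even_odd (int64_t even[2], int64_t odd[2],
                 const int32_t a[4], const int32_t b[4])
  {
    for (int i = 0; i < 2; i++)
      {
        even[i] = (int64_t) a[2 * i] + (int64_t) b[2 * i];
        odd[i] = (int64_t) a[2 * i + 1] + (int64_t) b[2 * i + 1];
      }
  }
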
+(define_insn "lsx_v<optab>wev_d_wu_w"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addmul:V2DI
+	  (zero_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)])))
+	  (sign_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.d.wu.w\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_v<optab>wev_w_hu_h"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addmul:V4SI
+	  (zero_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))
+	  (sign_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.w.hu.h\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_v<optab>wev_h_bu_b"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addmul:V8HI
+	  (zero_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))
+	  (sign_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.h.bu.b\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_v<optab>wod_d_wu_w"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addmul:V2DI
+	  (zero_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)])))
+	  (sign_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.d.wu.w\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_v<optab>wod_w_hu_h"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addmul:V4SI
+	  (zero_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (sign_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.w.hu.h\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_v<optab>wod_h_bu_b"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addmul:V8HI
+	  (zero_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (sign_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.h.bu.b\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
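The _wu_w/_hu_h/_bu_b variants above mix a zero-extended first operand with a sign-extended second operand, and only exist for add and multiply (the addmul iterator).  A scalar sketch of vmulwev.d.wu.w, again only an illustration:

  #include <stdint.h>

  /* Even 32-bit lanes, first operand unsigned, second signed, 64-bit
     result per lane; the product cannot overflow int64_t.  */
  void
  mulwev_d_wu_w (int64_t out[2], const uint32_t a[4], const int32_t b[4])
  {
    for (int i = 0; i < 2; i++)
      out[i] = (int64_t) a[2 * i] * (int64_t) b[2 * i];
  }
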
+(define_insn "lsx_vaddwev_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWEV))]
+  "ISA_HAS_LSX"
+  "vaddwev.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwev_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWEV2))]
+  "ISA_HAS_LSX"
+  "vaddwev.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwod_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWOD))]
+  "ISA_HAS_LSX"
+  "vaddwod.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwod_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWOD2))]
+  "ISA_HAS_LSX"
+  "vaddwod.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsubwev_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUBWEV))]
+  "ISA_HAS_LSX"
+  "vsubwev.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsubwev_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUBWEV2))]
+  "ISA_HAS_LSX"
+  "vsubwev.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsubwod_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUBWOD))]
+  "ISA_HAS_LSX"
+  "vsubwod.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsubwod_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUBWOD2))]
+  "ISA_HAS_LSX"
+  "vsubwod.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwev_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWEV3))]
+  "ISA_HAS_LSX"
+  "vaddwev.q.du.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwod_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWOD3))]
+  "ISA_HAS_LSX"
+  "vaddwod.q.du.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwev_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWEV3))]
+  "ISA_HAS_LSX"
+  "vmulwev.q.du.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwod_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWOD3))]
+  "ISA_HAS_LSX"
+  "vmulwod.q.du.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwev_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWEV))]
+  "ISA_HAS_LSX"
+  "vmulwev.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwev_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWEV2))]
+  "ISA_HAS_LSX"
+  "vmulwev.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwod_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWOD))]
+  "ISA_HAS_LSX"
+  "vmulwod.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwod_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWOD2))]
+  "ISA_HAS_LSX"
+  "vmulwod.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vhaddw_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VHADDW_Q_D))]
+  "ISA_HAS_LSX"
+  "vhaddw.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vhaddw_qu_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VHADDW_QU_DU))]
+  "ISA_HAS_LSX"
+  "vhaddw.qu.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vhsubw_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VHSUBW_Q_D))]
+  "ISA_HAS_LSX"
+  "vhsubw.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vhsubw_qu_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VHSUBW_QU_DU))]
+  "ISA_HAS_LSX"
+  "vhsubw.qu.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
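The .q.d/.q.du forms above operate on 128-bit elements, for which there is no convenient vector element mode, so they stay as unspecs.  Assuming the usual definitions (the even form takes lane 0; vhaddw adds the odd lane of the first source to the even lane of the second), a scalar sketch with GCC's __int128:

  /* vaddwev.q.d: sign-extend the low (even) doubleword of each source.  */
  __int128
  addwev_q_d (long long a0, long long b0)
  {
    return (__int128) a0 + (__int128) b0;
  }

  /* vhaddw.q.d, as used by the reduc_plus expander earlier in the file.  */
  __int128
  haddw_q_d (const long long a[2], const long long b[2])
  {
    return (__int128) a[1] + (__int128) b[0];
  }
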
+(define_insn "lsx_vmaddwev_d_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(plus:V2DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (mult:V2DI
+	    (any_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)])))
+	    (any_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.d.w<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_w_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(plus:V4SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (mult:V4SI
+	    (any_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)])))
+	    (any_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.w.h<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vmaddwev_h_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(plus:V8HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (mult:V8HI
+	    (any_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)])))
+	    (any_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.h.b<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vmaddwod_d_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(plus:V2DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (mult:V2DI
+	    (any_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)])))
+	    (any_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.d.w<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_w_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(plus:V4SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (mult:V4SI
+	    (any_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)])))
+	    (any_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.w.h<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vmaddwod_h_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(plus:V8HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (mult:V8HI
+	    (any_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)])))
+	    (any_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.h.b<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vmaddwev_d_wu_w"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(plus:V2DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (mult:V2DI
+	    (zero_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)])))
+	    (sign_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.d.wu.w\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_w_hu_h"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(plus:V4SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (mult:V4SI
+	    (zero_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)])))
+	    (sign_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.w.hu.h\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vmaddwev_h_bu_b"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(plus:V8HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (mult:V8HI
+	    (zero_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)])))
+	    (sign_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.h.bu.b\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vmaddwod_d_wu_w"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(plus:V2DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (mult:V2DI
+	    (zero_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)])))
+	    (sign_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.d.wu.w\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_w_hu_h"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(plus:V4SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (mult:V4SI
+	    (zero_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)])))
+	    (sign_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.w.hu.h\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vmaddwod_h_bu_b"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(plus:V8HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (mult:V8HI
+	    (zero_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)])))
+	    (sign_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.h.bu.b\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8HI")])
+
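In the vmaddwev/vmaddwod patterns above, operand 1 is tied to the output (the "0" constraint), so the widened even/odd product accumulates into the destination.  Scalar sketch of vmaddwev.d.w:

  #include <stdint.h>

  /* acc[i] += sext (a[2*i]) * sext (b[2*i]) -- the even-lane accumulate.  */
  void
  maddwev_d_w (int64_t acc[2], const int32_t a[4], const int32_t b[4])
  {
    for (int i = 0; i < 2; i++)
      acc[i] += (int64_t) a[2 * i] * (int64_t) b[2 * i];
  }
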
+(define_insn "lsx_vmaddwev_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWEV))]
+  "ISA_HAS_LSX"
+  "vmaddwev.q.d\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWOD))]
+  "ISA_HAS_LSX"
+  "vmaddwod.q.d\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWEV2))]
+  "ISA_HAS_LSX"
+  "vmaddwev.q.du\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWOD2))]
+  "ISA_HAS_LSX"
+  "vmaddwod.q.du\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWEV3))]
+  "ISA_HAS_LSX"
+  "vmaddwev.q.du.d\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWOD3))]
+  "ISA_HAS_LSX"
+  "vmaddwod.q.du.d\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vrotr_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VROTR))]
+  "ISA_HAS_LSX"
+  "vrotr.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vadd_q"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADD_Q))]
+  "ISA_HAS_LSX"
+  "vadd.q\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsub_q"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUB_Q))]
+  "ISA_HAS_LSX"
+  "vsub.q\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmskgez_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f")]
+		      UNSPEC_LSX_VMSKGEZ))]
+  "ISA_HAS_LSX"
+  "vmskgez.b\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vmsknz_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f")]
+		      UNSPEC_LSX_VMSKNZ))]
+  "ISA_HAS_LSX"
+  "vmsknz.b\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vexth_h<u>_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(any_extend:V8HI
+	  (vec_select:V8QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (parallel [(const_int 8) (const_int 9)
+		       (const_int 10) (const_int 11)
+		       (const_int 12) (const_int 13)
+		       (const_int 14) (const_int 15)]))))]
+  "ISA_HAS_LSX"
+  "vexth.h<u>.b<u>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vexth_w<u>_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(any_extend:V4SI
+	  (vec_select:V4HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (parallel [(const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LSX"
+  "vexth.w<u>.h<u>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vexth_d<u>_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(any_extend:V2DI
+	  (vec_select:V2SI
+	    (match_operand:V4SI 1 "register_operand" "f")
+	    (parallel [(const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LSX"
+  "vexth.d<u>.w<u>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vexth_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTH_Q_D))]
+  "ISA_HAS_LSX"
+  "vexth.q.d\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vexth_qu_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTH_QU_DU))]
+  "ISA_HAS_LSX"
+  "vexth.qu.du\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DI")])
+
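The vexth patterns above extend the high half of the source element-wise; the RTL spells this out as a vec_select of the upper lane indices plus any_extend.  Scalar sketch of vexth.w.h (the zero-extending vexth.wu.hu only changes the conversion):

  #include <stdint.h>

  void
  exth_w_h (int32_t out[4], const int16_t in[8])
  {
    for (int i = 0; i < 4; i++)
      out[i] = in[i + 4];  /* high lanes 4..7, sign-extended to 32 bits */
  }
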
+(define_insn "lsx_vrotri_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(rotatert:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")))]
+  "ISA_HAS_LSX"
+  "vrotri.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vextl_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTL_Q_D))]
+  "ISA_HAS_LSX"
+  "vextl.q.d\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsrlni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSRLNI))]
+  "ISA_HAS_LSX"
+  "vsrlni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
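The vsrlni pattern above and the rounding/saturating variants that follow are the narrow-and-insert shifts; operand 1 is tied to the destination because the old destination supplies half of the result.  A rough sketch of vsrlni.b.h as I understand it (the rounding and saturating forms should differ only in how the shifted value is converted):

  #include <stdint.h>

  /* vj supplies the low 8 result bytes, the previous vd the high 8.  */
  void
  srlni_b_h (uint8_t res[16], const uint16_t vd_old[8],
             const uint16_t vj[8], unsigned imm)
  {
    for (int i = 0; i < 8; i++)
      {
        res[i] = (uint8_t) (vj[i] >> imm);
        res[i + 8] = (uint8_t) (vd_old[i] >> imm);
      }
  }
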
+(define_insn "lsx_vsrlrni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSRLRNI))]
+  "ISA_HAS_LSX"
+  "vsrlrni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRLNI))]
+  "ISA_HAS_LSX"
+  "vssrlni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlni_<lsxfmt_u>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRLNI2))]
+  "ISA_HAS_LSX"
+  "vssrlni.<lsxfmt_u>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlrni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRLRNI))]
+  "ISA_HAS_LSX"
+  "vssrlrni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlrni_<lsxfmt_u>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRLRNI2))]
+  "ISA_HAS_LSX"
+  "vssrlrni.<lsxfmt_u>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrani_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSRANI))]
+  "ISA_HAS_LSX"
+  "vsrani.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrarni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSRARNI))]
+  "ISA_HAS_LSX"
+  "vsrarni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrani_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		    UNSPEC_LSX_VSSRANI))]
+  "ISA_HAS_LSX"
+  "vssrani.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrani_<lsxfmt_u>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRANI2))]
+  "ISA_HAS_LSX"
+  "vssrani.<lsxfmt_u>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrarni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRARNI))]
+  "ISA_HAS_LSX"
+  "vssrarni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrarni_<lsxfmt_u>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRARNI2))]
+  "ISA_HAS_LSX"
+  "vssrarni.<lsxfmt_u>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpermi_w"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "0")
+		      (match_operand:V4SI 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VPERMI))]
+  "ISA_HAS_LSX"
+  "vpermi.w\t%w0,%w2,%3"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V4SI")])
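
vpermi.w also ties operand 1 to the destination: assuming the usual field layout, each 2-bit field of the immediate selects a source word, with the low two result words taken from vj and the high two from the previous contents of vd.  Rough sketch (illustration only):

  #include <stdint.h>

  void
  permi_w (uint32_t vd[4], const uint32_t vd_old[4],
           const uint32_t vj[4], unsigned imm)
  {
    vd[0] = vj[imm & 3];
    vd[1] = vj[(imm >> 2) & 3];
    vd[2] = vd_old[(imm >> 4) & 3];
    vd[3] = vd_old[(imm >> 6) & 3];
  }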
diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md
index 510973aa339..f430629825e 100644
--- a/gcc/config/loongarch/predicates.md
+++ b/gcc/config/loongarch/predicates.md
@@ -87,10 +87,42 @@ (define_predicate "const_immalsl_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 1, 4)")))
 
+(define_predicate "const_lsx_branch_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), -1024, 1023)")))
+
+(define_predicate "const_uimm3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_8_to_11_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 8, 11)")))
+
+(define_predicate "const_12_to_15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 12, 15)")))
+
+(define_predicate "const_uimm4_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 15)")))
+
 (define_predicate "const_uimm5_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 31)")))
 
+(define_predicate "const_uimm6_operand"
+  (and (match_code "const_int")
+       (match_test "UIMM6_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_uimm7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 127)")))
+
+(define_predicate "const_uimm8_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
 (define_predicate "const_uimm14_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 16383)")))
@@ -99,10 +131,74 @@ (define_predicate "const_uimm15_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 32767)")))
 
+(define_predicate "const_imm5_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), -16, 15)")))
+
+(define_predicate "const_imm10_operand"
+  (and (match_code "const_int")
+       (match_test "IMM10_OPERAND (INTVAL (op))")))
+
 (define_predicate "const_imm12_operand"
   (and (match_code "const_int")
        (match_test "IMM12_OPERAND (INTVAL (op))")))
 
+(define_predicate "const_imm13_operand"
+  (and (match_code "const_int")
+       (match_test "IMM13_OPERAND (INTVAL (op))")))
+
+(define_predicate "reg_imm10_operand"
+  (ior (match_operand 0 "const_imm10_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "aq8b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "aq8h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 1)")))
+
+(define_predicate "aq8w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 2)")))
+
+(define_predicate "aq8d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)")))
+
+(define_predicate "aq10b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 0)")))
+
+(define_predicate "aq10h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 1)")))
+
+(define_predicate "aq10w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)")))
+
+(define_predicate "aq10d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 3)")))
+
+(define_predicate "aq12b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 12, 0)")))
+
+(define_predicate "aq12h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 11, 1)")))
+
+(define_predicate "aq12w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)")))
+
+(define_predicate "aq12d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 9, 3)")))
+
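The aqNb/aqNh/aqNw/aqNd predicates above wrap loongarch_signed_immediate_p; as I read its arguments, they are the number of significant bits and the required alignment (low zero bits), so e.g. aq10w accepts word-aligned offsets whose scaled value fits a signed 10-bit field.  A plain-C paraphrase of that check (not the helper's actual code):

  /* Equivalent of loongarch_signed_immediate_p (offset, 10, 2) as used by
     aq10w_operand: a multiple of 4 with offset/4 in [-512, 511].  */
  static int
  fits_aq10w (long offset)
  {
    return (offset & 3) == 0 && offset / 4 >= -512 && offset / 4 <= 511;
  }
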
 (define_predicate "sle_operand"
   (and (match_code "const_int")
        (match_test "IMM12_OPERAND (INTVAL (op) + 1)")))
@@ -112,29 +208,206 @@ (define_predicate "sleu_operand"
        (match_test "INTVAL (op) + 1 != 0")))
 
 (define_predicate "const_0_operand"
-  (and (match_code "const_int,const_double,const_vector")
+  (and (match_code "const_int,const_wide_int,const_double,const_vector")
        (match_test "op == CONST0_RTX (GET_MODE (op))")))
 
+(define_predicate "const_m1_operand"
+  (and (match_code "const_int,const_wide_int,const_double,const_vector")
+       (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
+
+(define_predicate "reg_or_m1_operand"
+  (ior (match_operand 0 "const_m1_operand")
+       (match_operand 0 "register_operand")))
+
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
        (match_operand 0 "register_operand")))
 
 (define_predicate "const_1_operand"
-  (and (match_code "const_int,const_double,const_vector")
+  (and (match_code "const_int,const_wide_int,const_double,const_vector")
        (match_test "op == CONST1_RTX (GET_MODE (op))")))
 
 (define_predicate "reg_or_1_operand"
   (ior (match_operand 0 "const_1_operand")
        (match_operand 0 "register_operand")))
 
+;; These are used in vec_merge, hence accept bitmask as const_int.
+(define_predicate "const_exp_2_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 1)")))
+
+(define_predicate "const_exp_4_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 3)")))
+
+(define_predicate "const_exp_8_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 7)")))
+
+(define_predicate "const_exp_16_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 15)")))
+
+(define_predicate "const_exp_32_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 31)")))
+
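As the comment above notes, the const_exp_* predicates match vec_merge selector masks: writing lane i of an N-lane vector uses the single-bit mask 1 << i, so only powers of two below the lane count are accepted.

  /* The vec_merge mask for writing lane LANE is a single set bit; e.g.
     lane 2 of a V4SI vector gives mask 4, and exact_log2 (4) == 2.  */
  static unsigned
  vec_merge_mask (unsigned lane)
  {
    return 1u << lane;
  }
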
+;; This is used for indexing into vectors, and hence only accepts const_int.
+(define_predicate "const_0_or_1_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
+
+(define_predicate "const_0_to_3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 3)")))
+
+(define_predicate "const_0_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_2_or_3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 2, 3)")))
+
+(define_predicate "const_4_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 4, 7)")))
+
+(define_predicate "const_8_to_15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_16_to_31_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "qi_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xff")))
+
+(define_predicate "hi_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xffff")))
+
 (define_predicate "lu52i_mask_operand"
   (and (match_code "const_int")
        (match_test "UINTVAL (op) == 0xfffffffffffff")))
 
+(define_predicate "si_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xffffffff")))
+
 (define_predicate "low_bitmask_operand"
   (and (match_code "const_int")
        (match_test "low_bitmask_len (mode, INTVAL (op)) > 12")))
 
+(define_predicate "d_operand"
+  (and (match_code "reg")
+       (match_test "GP_REG_P (REGNO (op))")))
+
+(define_predicate "db4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 4, 0)")))
+
+(define_predicate "db7_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 7, 0)")))
+
+(define_predicate "db8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 8, 0)")))
+
+(define_predicate "ib3_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) - 1, 3, 0)")))
+
+(define_predicate "sb4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 4, 0)")))
+
+(define_predicate "sb5_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 5, 0)")))
+
+(define_predicate "sb8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "sd8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)")))
+
+(define_predicate "ub4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 0)")))
+
+(define_predicate "ub8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "uh4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 1)")))
+
+(define_predicate "uw4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 2)")))
+
+(define_predicate "uw5_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 5, 2)")))
+
+(define_predicate "uw6_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 6, 2)")))
+
+(define_predicate "uw8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 2)")))
+
+(define_predicate "addiur2_operand"
+  (and (match_code "const_int")
+	(ior (match_test "INTVAL (op) == -1")
+	     (match_test "INTVAL (op) == 1")
+	     (match_test "INTVAL (op) == 4")
+	     (match_test "INTVAL (op) == 8")
+	     (match_test "INTVAL (op) == 12")
+	     (match_test "INTVAL (op) == 16")
+	     (match_test "INTVAL (op) == 20")
+	     (match_test "INTVAL (op) == 24"))))
+
+(define_predicate "addiusp_operand"
+  (and (match_code "const_int")
+       (ior (match_test "(IN_RANGE (INTVAL (op), 2, 257))")
+	    (match_test "(IN_RANGE (INTVAL (op), -258, -3))"))))
+
+(define_predicate "andi16_operand"
+  (and (match_code "const_int")
+	(ior (match_test "IN_RANGE (INTVAL (op), 1, 4)")
+	     (match_test "IN_RANGE (INTVAL (op), 7, 8)")
+	     (match_test "IN_RANGE (INTVAL (op), 15, 16)")
+	     (match_test "IN_RANGE (INTVAL (op), 31, 32)")
+	     (match_test "IN_RANGE (INTVAL (op), 63, 64)")
+	     (match_test "INTVAL (op) == 255")
+	     (match_test "INTVAL (op) == 32768")
+	     (match_test "INTVAL (op) == 65535"))))
+
+(define_predicate "movep_src_register"
+  (and (match_code "reg")
+       (ior (match_test ("IN_RANGE (REGNO (op), 2, 3)"))
+	    (match_test ("IN_RANGE (REGNO (op), 16, 20)")))))
+
+(define_predicate "movep_src_operand"
+  (ior (match_operand 0 "const_0_operand")
+       (match_operand 0 "movep_src_register")))
+
+(define_predicate "fcc_reload_operand"
+  (and (match_code "reg,subreg")
+       (match_test "FCC_REG_P (true_regnum (op))")))
+
+(define_predicate "muldiv_target_operand"
+		(match_operand 0 "register_operand"))
+
 (define_predicate "const_call_insn_operand"
   (match_code "const,symbol_ref,label_ref")
 {
@@ -303,3 +576,59 @@ (define_predicate "small_data_pattern"
 (define_predicate "non_volatile_mem_operand"
   (and (match_operand 0 "memory_operand")
        (not (match_test "MEM_VOLATILE_P (op)"))))
+
+(define_predicate "const_vector_same_val_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_val_p (op, mode);
+})
+
+(define_predicate "const_vector_same_simm5_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_int_p (op, mode, -16, 15);
+})
+
+(define_predicate "const_vector_same_uimm5_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_int_p (op, mode, 0, 31);
+})
+
+(define_predicate "const_vector_same_ximm5_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_int_p (op, mode, -31, 31);
+})
+
+(define_predicate "const_vector_same_uimm6_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_int_p (op, mode, 0, 63);
+})
+
+(define_predicate "par_const_vector_shf_set_operand"
+  (match_code "parallel")
+{
+  return loongarch_const_vector_shuffle_set_p (op, mode);
+})
+
+(define_predicate "reg_or_vector_same_val_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_val_operand")))
+
+(define_predicate "reg_or_vector_same_simm5_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_simm5_operand")))
+
+(define_predicate "reg_or_vector_same_uimm5_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_uimm5_operand")))
+
+(define_predicate "reg_or_vector_same_ximm5_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_ximm5_operand")))
+
+(define_predicate "reg_or_vector_same_uimm6_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_uimm6_operand")))
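
The const_vector_same_* predicates at the end only accept uniform (splat) vector constants whose element fits the named range, so the reg_or_vector_same_* operands can keep such constants as immediates.  A small illustration with GCC's generic vector extension (whether a given splat really ends up as an immediate is of course up to the patterns above):

  /* {5,5,5,5} can satisfy reg_or_vector_same_simm5_operand (elements in
     [-16, 15]); a constant like {1,2,3,4} cannot and stays in a register.  */
  typedef int v4i32 __attribute__ ((vector_size (16)));

  v4i32
  add_five (v4i32 x)
  {
    return x + (v4i32) {5, 5, 5, 5};
  }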
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 24453693d89..daa318ee3da 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -2892,6 +2892,17 @@ as @code{st.w} and @code{ld.w}.
 A signed 12-bit constant (for arithmetic instructions).
 @item K
 An unsigned 12-bit constant (for logic instructions).
+@item M
+A constant that cannot be loaded using @code{lui}, @code{addiu}
+or @code{ori}.
+@item N
+A constant in the range -65535 to -1 (inclusive).
+@item O
+A signed 15-bit constant.
+@item P
+A constant in the range 1 to 65535 (inclusive).
+@item R
+An address that can be used in a non-macro load or store.
 @item ZB
 An address that is held in a general-purpose register.
 The offset is zero.
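
For completeness, these constraint letters can also be used from inline asm; a minimal sketch using the signed 12-bit constraint documented just above the new entries (assuming it keeps its existing letter I; illustration only):

  static inline int
  add_100 (int x)
  {
    int r;
    /* 100 fits a signed 12-bit immediate, so "I" accepts it.  */
    asm ("addi.w\t%0,%1,%2" : "=r" (r) : "r" (x), "I" (100));
    return r;
  }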
-- 
2.36.0



* [PATCH v6 2/4] LoongArch: Add Loongson SX directive builtin function support.
  2023-08-31  9:08 [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
  2023-08-31  9:08 ` [PATCH v6 1/4] LoongArch: Add Loongson SX base instruction support Chenghui Pan
@ 2023-08-31  9:08 ` Chenghui Pan
  2023-08-31  9:08 ` [PATCH v6 3/4] LoongArch: Add Loongson ASX base instruction support Chenghui Pan
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Chenghui Pan @ 2023-08-31  9:08 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config.gcc: Export the header file lsxintrin.h.
	* config/loongarch/loongarch-builtins.cc (LARCH_FTYPE_NAME4): Add builtin function support.
	(enum loongarch_builtin_type): Ditto.
	(AVAIL_ALL): Ditto.
	(LARCH_BUILTIN): Ditto.
	(LSX_BUILTIN): Ditto.
	(LSX_BUILTIN_TEST_BRANCH): Ditto.
	(LSX_NO_TARGET_BUILTIN): Ditto.
	(CODE_FOR_lsx_vsadd_b): Ditto.
	(CODE_FOR_lsx_vsadd_h): Ditto.
	(CODE_FOR_lsx_vsadd_w): Ditto.
	(CODE_FOR_lsx_vsadd_d): Ditto.
	(CODE_FOR_lsx_vsadd_bu): Ditto.
	(CODE_FOR_lsx_vsadd_hu): Ditto.
	(CODE_FOR_lsx_vsadd_wu): Ditto.
	(CODE_FOR_lsx_vsadd_du): Ditto.
	(CODE_FOR_lsx_vadd_b): Ditto.
	(CODE_FOR_lsx_vadd_h): Ditto.
	(CODE_FOR_lsx_vadd_w): Ditto.
	(CODE_FOR_lsx_vadd_d): Ditto.
	(CODE_FOR_lsx_vaddi_bu): Ditto.
	(CODE_FOR_lsx_vaddi_hu): Ditto.
	(CODE_FOR_lsx_vaddi_wu): Ditto.
	(CODE_FOR_lsx_vaddi_du): Ditto.
	(CODE_FOR_lsx_vand_v): Ditto.
	(CODE_FOR_lsx_vandi_b): Ditto.
	(CODE_FOR_lsx_bnz_v): Ditto.
	(CODE_FOR_lsx_bz_v): Ditto.
	(CODE_FOR_lsx_vbitsel_v): Ditto.
	(CODE_FOR_lsx_vseqi_b): Ditto.
	(CODE_FOR_lsx_vseqi_h): Ditto.
	(CODE_FOR_lsx_vseqi_w): Ditto.
	(CODE_FOR_lsx_vseqi_d): Ditto.
	(CODE_FOR_lsx_vslti_b): Ditto.
	(CODE_FOR_lsx_vslti_h): Ditto.
	(CODE_FOR_lsx_vslti_w): Ditto.
	(CODE_FOR_lsx_vslti_d): Ditto.
	(CODE_FOR_lsx_vslti_bu): Ditto.
	(CODE_FOR_lsx_vslti_hu): Ditto.
	(CODE_FOR_lsx_vslti_wu): Ditto.
	(CODE_FOR_lsx_vslti_du): Ditto.
	(CODE_FOR_lsx_vslei_b): Ditto.
	(CODE_FOR_lsx_vslei_h): Ditto.
	(CODE_FOR_lsx_vslei_w): Ditto.
	(CODE_FOR_lsx_vslei_d): Ditto.
	(CODE_FOR_lsx_vslei_bu): Ditto.
	(CODE_FOR_lsx_vslei_hu): Ditto.
	(CODE_FOR_lsx_vslei_wu): Ditto.
	(CODE_FOR_lsx_vslei_du): Ditto.
	(CODE_FOR_lsx_vdiv_b): Ditto.
	(CODE_FOR_lsx_vdiv_h): Ditto.
	(CODE_FOR_lsx_vdiv_w): Ditto.
	(CODE_FOR_lsx_vdiv_d): Ditto.
	(CODE_FOR_lsx_vdiv_bu): Ditto.
	(CODE_FOR_lsx_vdiv_hu): Ditto.
	(CODE_FOR_lsx_vdiv_wu): Ditto.
	(CODE_FOR_lsx_vdiv_du): Ditto.
	(CODE_FOR_lsx_vfadd_s): Ditto.
	(CODE_FOR_lsx_vfadd_d): Ditto.
	(CODE_FOR_lsx_vftintrz_w_s): Ditto.
	(CODE_FOR_lsx_vftintrz_l_d): Ditto.
	(CODE_FOR_lsx_vftintrz_wu_s): Ditto.
	(CODE_FOR_lsx_vftintrz_lu_d): Ditto.
	(CODE_FOR_lsx_vffint_s_w): Ditto.
	(CODE_FOR_lsx_vffint_d_l): Ditto.
	(CODE_FOR_lsx_vffint_s_wu): Ditto.
	(CODE_FOR_lsx_vffint_d_lu): Ditto.
	(CODE_FOR_lsx_vfsub_s): Ditto.
	(CODE_FOR_lsx_vfsub_d): Ditto.
	(CODE_FOR_lsx_vfmul_s): Ditto.
	(CODE_FOR_lsx_vfmul_d): Ditto.
	(CODE_FOR_lsx_vfdiv_s): Ditto.
	(CODE_FOR_lsx_vfdiv_d): Ditto.
	(CODE_FOR_lsx_vfmax_s): Ditto.
	(CODE_FOR_lsx_vfmax_d): Ditto.
	(CODE_FOR_lsx_vfmin_s): Ditto.
	(CODE_FOR_lsx_vfmin_d): Ditto.
	(CODE_FOR_lsx_vfsqrt_s): Ditto.
	(CODE_FOR_lsx_vfsqrt_d): Ditto.
	(CODE_FOR_lsx_vflogb_s): Ditto.
	(CODE_FOR_lsx_vflogb_d): Ditto.
	(CODE_FOR_lsx_vmax_b): Ditto.
	(CODE_FOR_lsx_vmax_h): Ditto.
	(CODE_FOR_lsx_vmax_w): Ditto.
	(CODE_FOR_lsx_vmax_d): Ditto.
	(CODE_FOR_lsx_vmaxi_b): Ditto.
	(CODE_FOR_lsx_vmaxi_h): Ditto.
	(CODE_FOR_lsx_vmaxi_w): Ditto.
	(CODE_FOR_lsx_vmaxi_d): Ditto.
	(CODE_FOR_lsx_vmax_bu): Ditto.
	(CODE_FOR_lsx_vmax_hu): Ditto.
	(CODE_FOR_lsx_vmax_wu): Ditto.
	(CODE_FOR_lsx_vmax_du): Ditto.
	(CODE_FOR_lsx_vmaxi_bu): Ditto.
	(CODE_FOR_lsx_vmaxi_hu): Ditto.
	(CODE_FOR_lsx_vmaxi_wu): Ditto.
	(CODE_FOR_lsx_vmaxi_du): Ditto.
	(CODE_FOR_lsx_vmin_b): Ditto.
	(CODE_FOR_lsx_vmin_h): Ditto.
	(CODE_FOR_lsx_vmin_w): Ditto.
	(CODE_FOR_lsx_vmin_d): Ditto.
	(CODE_FOR_lsx_vmini_b): Ditto.
	(CODE_FOR_lsx_vmini_h): Ditto.
	(CODE_FOR_lsx_vmini_w): Ditto.
	(CODE_FOR_lsx_vmini_d): Ditto.
	(CODE_FOR_lsx_vmin_bu): Ditto.
	(CODE_FOR_lsx_vmin_hu): Ditto.
	(CODE_FOR_lsx_vmin_wu): Ditto.
	(CODE_FOR_lsx_vmin_du): Ditto.
	(CODE_FOR_lsx_vmini_bu): Ditto.
	(CODE_FOR_lsx_vmini_hu): Ditto.
	(CODE_FOR_lsx_vmini_wu): Ditto.
	(CODE_FOR_lsx_vmini_du): Ditto.
	(CODE_FOR_lsx_vmod_b): Ditto.
	(CODE_FOR_lsx_vmod_h): Ditto.
	(CODE_FOR_lsx_vmod_w): Ditto.
	(CODE_FOR_lsx_vmod_d): Ditto.
	(CODE_FOR_lsx_vmod_bu): Ditto.
	(CODE_FOR_lsx_vmod_hu): Ditto.
	(CODE_FOR_lsx_vmod_wu): Ditto.
	(CODE_FOR_lsx_vmod_du): Ditto.
	(CODE_FOR_lsx_vmul_b): Ditto.
	(CODE_FOR_lsx_vmul_h): Ditto.
	(CODE_FOR_lsx_vmul_w): Ditto.
	(CODE_FOR_lsx_vmul_d): Ditto.
	(CODE_FOR_lsx_vclz_b): Ditto.
	(CODE_FOR_lsx_vclz_h): Ditto.
	(CODE_FOR_lsx_vclz_w): Ditto.
	(CODE_FOR_lsx_vclz_d): Ditto.
	(CODE_FOR_lsx_vnor_v): Ditto.
	(CODE_FOR_lsx_vor_v): Ditto.
	(CODE_FOR_lsx_vori_b): Ditto.
	(CODE_FOR_lsx_vnori_b): Ditto.
	(CODE_FOR_lsx_vpcnt_b): Ditto.
	(CODE_FOR_lsx_vpcnt_h): Ditto.
	(CODE_FOR_lsx_vpcnt_w): Ditto.
	(CODE_FOR_lsx_vpcnt_d): Ditto.
	(CODE_FOR_lsx_vxor_v): Ditto.
	(CODE_FOR_lsx_vxori_b): Ditto.
	(CODE_FOR_lsx_vsll_b): Ditto.
	(CODE_FOR_lsx_vsll_h): Ditto.
	(CODE_FOR_lsx_vsll_w): Ditto.
	(CODE_FOR_lsx_vsll_d): Ditto.
	(CODE_FOR_lsx_vslli_b): Ditto.
	(CODE_FOR_lsx_vslli_h): Ditto.
	(CODE_FOR_lsx_vslli_w): Ditto.
	(CODE_FOR_lsx_vslli_d): Ditto.
	(CODE_FOR_lsx_vsra_b): Ditto.
	(CODE_FOR_lsx_vsra_h): Ditto.
	(CODE_FOR_lsx_vsra_w): Ditto.
	(CODE_FOR_lsx_vsra_d): Ditto.
	(CODE_FOR_lsx_vsrai_b): Ditto.
	(CODE_FOR_lsx_vsrai_h): Ditto.
	(CODE_FOR_lsx_vsrai_w): Ditto.
	(CODE_FOR_lsx_vsrai_d): Ditto.
	(CODE_FOR_lsx_vsrl_b): Ditto.
	(CODE_FOR_lsx_vsrl_h): Ditto.
	(CODE_FOR_lsx_vsrl_w): Ditto.
	(CODE_FOR_lsx_vsrl_d): Ditto.
	(CODE_FOR_lsx_vsrli_b): Ditto.
	(CODE_FOR_lsx_vsrli_h): Ditto.
	(CODE_FOR_lsx_vsrli_w): Ditto.
	(CODE_FOR_lsx_vsrli_d): Ditto.
	(CODE_FOR_lsx_vsub_b): Ditto.
	(CODE_FOR_lsx_vsub_h): Ditto.
	(CODE_FOR_lsx_vsub_w): Ditto.
	(CODE_FOR_lsx_vsub_d): Ditto.
	(CODE_FOR_lsx_vsubi_bu): Ditto.
	(CODE_FOR_lsx_vsubi_hu): Ditto.
	(CODE_FOR_lsx_vsubi_wu): Ditto.
	(CODE_FOR_lsx_vsubi_du): Ditto.
	(CODE_FOR_lsx_vpackod_d): Ditto.
	(CODE_FOR_lsx_vpackev_d): Ditto.
	(CODE_FOR_lsx_vpickod_d): Ditto.
	(CODE_FOR_lsx_vpickev_d): Ditto.
	(CODE_FOR_lsx_vrepli_b): Ditto.
	(CODE_FOR_lsx_vrepli_h): Ditto.
	(CODE_FOR_lsx_vrepli_w): Ditto.
	(CODE_FOR_lsx_vrepli_d): Ditto.
	(CODE_FOR_lsx_vsat_b): Ditto.
	(CODE_FOR_lsx_vsat_h): Ditto.
	(CODE_FOR_lsx_vsat_w): Ditto.
	(CODE_FOR_lsx_vsat_d): Ditto.
	(CODE_FOR_lsx_vsat_bu): Ditto.
	(CODE_FOR_lsx_vsat_hu): Ditto.
	(CODE_FOR_lsx_vsat_wu): Ditto.
	(CODE_FOR_lsx_vsat_du): Ditto.
	(CODE_FOR_lsx_vavg_b): Ditto.
	(CODE_FOR_lsx_vavg_h): Ditto.
	(CODE_FOR_lsx_vavg_w): Ditto.
	(CODE_FOR_lsx_vavg_d): Ditto.
	(CODE_FOR_lsx_vavg_bu): Ditto.
	(CODE_FOR_lsx_vavg_hu): Ditto.
	(CODE_FOR_lsx_vavg_wu): Ditto.
	(CODE_FOR_lsx_vavg_du): Ditto.
	(CODE_FOR_lsx_vavgr_b): Ditto.
	(CODE_FOR_lsx_vavgr_h): Ditto.
	(CODE_FOR_lsx_vavgr_w): Ditto.
	(CODE_FOR_lsx_vavgr_d): Ditto.
	(CODE_FOR_lsx_vavgr_bu): Ditto.
	(CODE_FOR_lsx_vavgr_hu): Ditto.
	(CODE_FOR_lsx_vavgr_wu): Ditto.
	(CODE_FOR_lsx_vavgr_du): Ditto.
	(CODE_FOR_lsx_vssub_b): Ditto.
	(CODE_FOR_lsx_vssub_h): Ditto.
	(CODE_FOR_lsx_vssub_w): Ditto.
	(CODE_FOR_lsx_vssub_d): Ditto.
	(CODE_FOR_lsx_vssub_bu): Ditto.
	(CODE_FOR_lsx_vssub_hu): Ditto.
	(CODE_FOR_lsx_vssub_wu): Ditto.
	(CODE_FOR_lsx_vssub_du): Ditto.
	(CODE_FOR_lsx_vabsd_b): Ditto.
	(CODE_FOR_lsx_vabsd_h): Ditto.
	(CODE_FOR_lsx_vabsd_w): Ditto.
	(CODE_FOR_lsx_vabsd_d): Ditto.
	(CODE_FOR_lsx_vabsd_bu): Ditto.
	(CODE_FOR_lsx_vabsd_hu): Ditto.
	(CODE_FOR_lsx_vabsd_wu): Ditto.
	(CODE_FOR_lsx_vabsd_du): Ditto.
	(CODE_FOR_lsx_vftint_w_s): Ditto.
	(CODE_FOR_lsx_vftint_l_d): Ditto.
	(CODE_FOR_lsx_vftint_wu_s): Ditto.
	(CODE_FOR_lsx_vftint_lu_d): Ditto.
	(CODE_FOR_lsx_vandn_v): Ditto.
	(CODE_FOR_lsx_vorn_v): Ditto.
	(CODE_FOR_lsx_vneg_b): Ditto.
	(CODE_FOR_lsx_vneg_h): Ditto.
	(CODE_FOR_lsx_vneg_w): Ditto.
	(CODE_FOR_lsx_vneg_d): Ditto.
	(CODE_FOR_lsx_vshuf4i_d): Ditto.
	(CODE_FOR_lsx_vbsrl_v): Ditto.
	(CODE_FOR_lsx_vbsll_v): Ditto.
	(CODE_FOR_lsx_vfmadd_s): Ditto.
	(CODE_FOR_lsx_vfmadd_d): Ditto.
	(CODE_FOR_lsx_vfmsub_s): Ditto.
	(CODE_FOR_lsx_vfmsub_d): Ditto.
	(CODE_FOR_lsx_vfnmadd_s): Ditto.
	(CODE_FOR_lsx_vfnmadd_d): Ditto.
	(CODE_FOR_lsx_vfnmsub_s): Ditto.
	(CODE_FOR_lsx_vfnmsub_d): Ditto.
	(CODE_FOR_lsx_vmuh_b): Ditto.
	(CODE_FOR_lsx_vmuh_h): Ditto.
	(CODE_FOR_lsx_vmuh_w): Ditto.
	(CODE_FOR_lsx_vmuh_d): Ditto.
	(CODE_FOR_lsx_vmuh_bu): Ditto.
	(CODE_FOR_lsx_vmuh_hu): Ditto.
	(CODE_FOR_lsx_vmuh_wu): Ditto.
	(CODE_FOR_lsx_vmuh_du): Ditto.
	(CODE_FOR_lsx_vsllwil_h_b): Ditto.
	(CODE_FOR_lsx_vsllwil_w_h): Ditto.
	(CODE_FOR_lsx_vsllwil_d_w): Ditto.
	(CODE_FOR_lsx_vsllwil_hu_bu): Ditto.
	(CODE_FOR_lsx_vsllwil_wu_hu): Ditto.
	(CODE_FOR_lsx_vsllwil_du_wu): Ditto.
	(CODE_FOR_lsx_vssran_b_h): Ditto.
	(CODE_FOR_lsx_vssran_h_w): Ditto.
	(CODE_FOR_lsx_vssran_w_d): Ditto.
	(CODE_FOR_lsx_vssran_bu_h): Ditto.
	(CODE_FOR_lsx_vssran_hu_w): Ditto.
	(CODE_FOR_lsx_vssran_wu_d): Ditto.
	(CODE_FOR_lsx_vssrarn_b_h): Ditto.
	(CODE_FOR_lsx_vssrarn_h_w): Ditto.
	(CODE_FOR_lsx_vssrarn_w_d): Ditto.
	(CODE_FOR_lsx_vssrarn_bu_h): Ditto.
	(CODE_FOR_lsx_vssrarn_hu_w): Ditto.
	(CODE_FOR_lsx_vssrarn_wu_d): Ditto.
	(CODE_FOR_lsx_vssrln_bu_h): Ditto.
	(CODE_FOR_lsx_vssrln_hu_w): Ditto.
	(CODE_FOR_lsx_vssrln_wu_d): Ditto.
	(CODE_FOR_lsx_vssrlrn_bu_h): Ditto.
	(CODE_FOR_lsx_vssrlrn_hu_w): Ditto.
	(CODE_FOR_lsx_vssrlrn_wu_d): Ditto.
	(loongarch_builtin_vector_type): Ditto.
	(loongarch_build_cvpointer_type): Ditto.
	(LARCH_ATYPE_CVPOINTER): Ditto.
	(LARCH_ATYPE_BOOLEAN): Ditto.
	(LARCH_ATYPE_V2SF): Ditto.
	(LARCH_ATYPE_V2HI): Ditto.
	(LARCH_ATYPE_V2SI): Ditto.
	(LARCH_ATYPE_V4QI): Ditto.
	(LARCH_ATYPE_V4HI): Ditto.
	(LARCH_ATYPE_V8QI): Ditto.
	(LARCH_ATYPE_V2DI): Ditto.
	(LARCH_ATYPE_V4SI): Ditto.
	(LARCH_ATYPE_V8HI): Ditto.
	(LARCH_ATYPE_V16QI): Ditto.
	(LARCH_ATYPE_V2DF): Ditto.
	(LARCH_ATYPE_V4SF): Ditto.
	(LARCH_ATYPE_V4DI): Ditto.
	(LARCH_ATYPE_V8SI): Ditto.
	(LARCH_ATYPE_V16HI): Ditto.
	(LARCH_ATYPE_V32QI): Ditto.
	(LARCH_ATYPE_V4DF): Ditto.
	(LARCH_ATYPE_V8SF): Ditto.
	(LARCH_ATYPE_UV2DI): Ditto.
	(LARCH_ATYPE_UV4SI): Ditto.
	(LARCH_ATYPE_UV8HI): Ditto.
	(LARCH_ATYPE_UV16QI): Ditto.
	(LARCH_ATYPE_UV4DI): Ditto.
	(LARCH_ATYPE_UV8SI): Ditto.
	(LARCH_ATYPE_UV16HI): Ditto.
	(LARCH_ATYPE_UV32QI): Ditto.
	(LARCH_ATYPE_UV2SI): Ditto.
	(LARCH_ATYPE_UV4HI): Ditto.
	(LARCH_ATYPE_UV8QI): Ditto.
	(loongarch_builtin_vectorized_function): Ditto.
	(LARCH_GET_BUILTIN): Ditto.
	(loongarch_expand_builtin_insn): Ditto.
	(loongarch_expand_builtin_lsx_test_branch): Ditto.
	(loongarch_expand_builtin): Ditto.
	* config/loongarch/loongarch-ftypes.def (1): Ditto.
	(2): Ditto.
	(3): Ditto.
	(4): Ditto.
	* config/loongarch/lsxintrin.h: New file.
---
 gcc/config.gcc                             |    2 +-
 gcc/config/loongarch/loongarch-builtins.cc | 1498 +++++-
 gcc/config/loongarch/loongarch-ftypes.def  |  397 +-
 gcc/config/loongarch/lsxintrin.h           | 5181 ++++++++++++++++++++
 4 files changed, 7071 insertions(+), 7 deletions(-)
 create mode 100644 gcc/config/loongarch/lsxintrin.h
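
For reference, a quick usage sketch of the new intrinsics header (assuming
the __lsx_* naming used by lsxintrin.h and the -mlsx option; compile with
e.g. "gcc -mlsx -O2 test.c"):

    #include <lsxintrin.h>

    int
    test (void)
    {
      __m128i a = __lsx_vreplgr2vr_w (3);   /* {3, 3, 3, 3} */
      __m128i b = __lsx_vreplgr2vr_w (4);   /* {4, 4, 4, 4} */
      __m128i c = __lsx_vadd_w (a, b);      /* element-wise add */
      return __lsx_vpickve2gr_w (c, 0);     /* extract lane 0 -> 7 */
    }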

diff --git a/gcc/config.gcc b/gcc/config.gcc
index b77e1fd5278..8fda51e157c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -469,7 +469,7 @@ mips*-*-*)
 	;;
 loongarch*-*-*)
 	cpu_type=loongarch
-	extra_headers="larchintrin.h"
+	extra_headers="larchintrin.h lsxintrin.h"
 	extra_objs="loongarch-c.o loongarch-builtins.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
 	extra_gcc_objs="loongarch-driver.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
 	extra_options="${extra_options} g.opt fused-madd.opt"
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index ebe70a986c3..5958f5b7fbe 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -34,14 +34,18 @@ along with GCC; see the file COPYING3.  If not see
 #include "recog.h"
 #include "diagnostic.h"
 #include "fold-const.h"
+#include "explow.h"
 #include "expr.h"
 #include "langhooks.h"
 #include "emit-rtl.h"
+#include "case-cfn-macros.h"
 
 /* Macros to create an enumeration identifier for a function prototype.  */
 #define LARCH_FTYPE_NAME1(A, B) LARCH_##A##_FTYPE_##B
 #define LARCH_FTYPE_NAME2(A, B, C) LARCH_##A##_FTYPE_##B##_##C
 #define LARCH_FTYPE_NAME3(A, B, C, D) LARCH_##A##_FTYPE_##B##_##C##_##D
+#define LARCH_FTYPE_NAME4(A, B, C, D, E) \
+  LARCH_##A##_FTYPE_##B##_##C##_##D##_##E
 
 /* Classifies the prototype of a built-in function.  */
 enum loongarch_function_type
@@ -64,6 +68,12 @@ enum loongarch_builtin_type
      value and the arguments are mapped to operands 0 and above.  */
   LARCH_BUILTIN_DIRECT_NO_TARGET,
 
+  /* For generating LoongArch LSX built-in functions.  */
+  LARCH_BUILTIN_LSX,
+
+  /* The function corresponds to an LSX conditional branch instruction
+     combined with a compare instruction.  */
+  LARCH_BUILTIN_LSX_TEST_BRANCH,
 };
 
 /* Declare an availability predicate for built-in functions that require
@@ -101,6 +111,7 @@ struct loongarch_builtin_description
 };
 
 AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
+AVAIL_ALL (lsx, ISA_HAS_LSX)
 
 /* Construct a loongarch_builtin_description from the given arguments.
 
@@ -120,8 +131,8 @@ AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
 #define LARCH_BUILTIN(INSN, NAME, BUILTIN_TYPE, FUNCTION_TYPE, AVAIL) \
   { \
     CODE_FOR_loongarch_##INSN, "__builtin_loongarch_" NAME, \
-      BUILTIN_TYPE, FUNCTION_TYPE, \
-      loongarch_builtin_avail_##AVAIL \
+    BUILTIN_TYPE, FUNCTION_TYPE, \
+    loongarch_builtin_avail_##AVAIL \
   }
 
 /* Define __builtin_loongarch_<INSN>, which is a LARCH_BUILTIN_DIRECT function
@@ -137,6 +148,300 @@ AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
   LARCH_BUILTIN (INSN, #INSN, LARCH_BUILTIN_DIRECT_NO_TARGET, \
 		 FUNCTION_TYPE, AVAIL)
 
+/* Define an LSX LARCH_BUILTIN_DIRECT function __builtin_lsx_<INSN>
+   for instruction CODE_FOR_lsx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LSX_BUILTIN(INSN, FUNCTION_TYPE)				\
+  { CODE_FOR_lsx_ ## INSN,						\
+    "__builtin_lsx_" #INSN,  LARCH_BUILTIN_DIRECT,			\
+    FUNCTION_TYPE, loongarch_builtin_avail_lsx }
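+
+/* For example, LSX_BUILTIN (vadd_b, LARCH_V16QI_FTYPE_V16QI_V16QI)
+   expands to the loongarch_builtin_description initializer
+
+     { CODE_FOR_lsx_vadd_b, "__builtin_lsx_vadd_b", LARCH_BUILTIN_DIRECT,
+       LARCH_V16QI_FTYPE_V16QI_V16QI, loongarch_builtin_avail_lsx }
+
+   so __builtin_lsx_vadd_b is gated on loongarch_builtin_avail_lsx and is
+   expanded directly through CODE_FOR_lsx_vadd_b.  */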
+
+/* Define an LSX LARCH_BUILTIN_LSX_TEST_BRANCH function __builtin_lsx_<INSN>
+   for instruction CODE_FOR_lsx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LSX_BUILTIN_TEST_BRANCH(INSN, FUNCTION_TYPE)			\
+  { CODE_FOR_lsx_ ## INSN,						\
+    "__builtin_lsx_" #INSN, LARCH_BUILTIN_LSX_TEST_BRANCH,		\
+    FUNCTION_TYPE, loongarch_builtin_avail_lsx }
+
+/* Define an LSX LARCH_BUILTIN_DIRECT_NO_TARGET function __builtin_lsx_<INSN>
+   for instruction CODE_FOR_lsx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LSX_NO_TARGET_BUILTIN(INSN, FUNCTION_TYPE)			\
+  { CODE_FOR_lsx_ ## INSN,						\
+    "__builtin_lsx_" #INSN,  LARCH_BUILTIN_DIRECT_NO_TARGET,		\
+    FUNCTION_TYPE, loongarch_builtin_avail_lsx }
+
+/* LoongArch SX: map each CODE_FOR_lsx_xxx to the insn code of the
+   pattern that implements it.  */
+#define CODE_FOR_lsx_vsadd_b CODE_FOR_ssaddv16qi3
+#define CODE_FOR_lsx_vsadd_h CODE_FOR_ssaddv8hi3
+#define CODE_FOR_lsx_vsadd_w CODE_FOR_ssaddv4si3
+#define CODE_FOR_lsx_vsadd_d CODE_FOR_ssaddv2di3
+#define CODE_FOR_lsx_vsadd_bu CODE_FOR_usaddv16qi3
+#define CODE_FOR_lsx_vsadd_hu CODE_FOR_usaddv8hi3
+#define CODE_FOR_lsx_vsadd_wu CODE_FOR_usaddv4si3
+#define CODE_FOR_lsx_vsadd_du CODE_FOR_usaddv2di3
+#define CODE_FOR_lsx_vadd_b CODE_FOR_addv16qi3
+#define CODE_FOR_lsx_vadd_h CODE_FOR_addv8hi3
+#define CODE_FOR_lsx_vadd_w CODE_FOR_addv4si3
+#define CODE_FOR_lsx_vadd_d CODE_FOR_addv2di3
+#define CODE_FOR_lsx_vaddi_bu CODE_FOR_addv16qi3
+#define CODE_FOR_lsx_vaddi_hu CODE_FOR_addv8hi3
+#define CODE_FOR_lsx_vaddi_wu CODE_FOR_addv4si3
+#define CODE_FOR_lsx_vaddi_du CODE_FOR_addv2di3
+#define CODE_FOR_lsx_vand_v CODE_FOR_andv16qi3
+#define CODE_FOR_lsx_vandi_b CODE_FOR_andv16qi3
+#define CODE_FOR_lsx_bnz_v CODE_FOR_lsx_bnz_v_b
+#define CODE_FOR_lsx_bz_v CODE_FOR_lsx_bz_v_b
+#define CODE_FOR_lsx_vbitsel_v CODE_FOR_lsx_vbitsel_b
+#define CODE_FOR_lsx_vseqi_b CODE_FOR_lsx_vseq_b
+#define CODE_FOR_lsx_vseqi_h CODE_FOR_lsx_vseq_h
+#define CODE_FOR_lsx_vseqi_w CODE_FOR_lsx_vseq_w
+#define CODE_FOR_lsx_vseqi_d CODE_FOR_lsx_vseq_d
+#define CODE_FOR_lsx_vslti_b CODE_FOR_lsx_vslt_b
+#define CODE_FOR_lsx_vslti_h CODE_FOR_lsx_vslt_h
+#define CODE_FOR_lsx_vslti_w CODE_FOR_lsx_vslt_w
+#define CODE_FOR_lsx_vslti_d CODE_FOR_lsx_vslt_d
+#define CODE_FOR_lsx_vslti_bu CODE_FOR_lsx_vslt_bu
+#define CODE_FOR_lsx_vslti_hu CODE_FOR_lsx_vslt_hu
+#define CODE_FOR_lsx_vslti_wu CODE_FOR_lsx_vslt_wu
+#define CODE_FOR_lsx_vslti_du CODE_FOR_lsx_vslt_du
+#define CODE_FOR_lsx_vslei_b CODE_FOR_lsx_vsle_b
+#define CODE_FOR_lsx_vslei_h CODE_FOR_lsx_vsle_h
+#define CODE_FOR_lsx_vslei_w CODE_FOR_lsx_vsle_w
+#define CODE_FOR_lsx_vslei_d CODE_FOR_lsx_vsle_d
+#define CODE_FOR_lsx_vslei_bu CODE_FOR_lsx_vsle_bu
+#define CODE_FOR_lsx_vslei_hu CODE_FOR_lsx_vsle_hu
+#define CODE_FOR_lsx_vslei_wu CODE_FOR_lsx_vsle_wu
+#define CODE_FOR_lsx_vslei_du CODE_FOR_lsx_vsle_du
+#define CODE_FOR_lsx_vdiv_b CODE_FOR_divv16qi3
+#define CODE_FOR_lsx_vdiv_h CODE_FOR_divv8hi3
+#define CODE_FOR_lsx_vdiv_w CODE_FOR_divv4si3
+#define CODE_FOR_lsx_vdiv_d CODE_FOR_divv2di3
+#define CODE_FOR_lsx_vdiv_bu CODE_FOR_udivv16qi3
+#define CODE_FOR_lsx_vdiv_hu CODE_FOR_udivv8hi3
+#define CODE_FOR_lsx_vdiv_wu CODE_FOR_udivv4si3
+#define CODE_FOR_lsx_vdiv_du CODE_FOR_udivv2di3
+#define CODE_FOR_lsx_vfadd_s CODE_FOR_addv4sf3
+#define CODE_FOR_lsx_vfadd_d CODE_FOR_addv2df3
+#define CODE_FOR_lsx_vftintrz_w_s CODE_FOR_fix_truncv4sfv4si2
+#define CODE_FOR_lsx_vftintrz_l_d CODE_FOR_fix_truncv2dfv2di2
+#define CODE_FOR_lsx_vftintrz_wu_s CODE_FOR_fixuns_truncv4sfv4si2
+#define CODE_FOR_lsx_vftintrz_lu_d CODE_FOR_fixuns_truncv2dfv2di2
+#define CODE_FOR_lsx_vffint_s_w CODE_FOR_floatv4siv4sf2
+#define CODE_FOR_lsx_vffint_d_l CODE_FOR_floatv2div2df2
+#define CODE_FOR_lsx_vffint_s_wu CODE_FOR_floatunsv4siv4sf2
+#define CODE_FOR_lsx_vffint_d_lu CODE_FOR_floatunsv2div2df2
+#define CODE_FOR_lsx_vfsub_s CODE_FOR_subv4sf3
+#define CODE_FOR_lsx_vfsub_d CODE_FOR_subv2df3
+#define CODE_FOR_lsx_vfmul_s CODE_FOR_mulv4sf3
+#define CODE_FOR_lsx_vfmul_d CODE_FOR_mulv2df3
+#define CODE_FOR_lsx_vfdiv_s CODE_FOR_divv4sf3
+#define CODE_FOR_lsx_vfdiv_d CODE_FOR_divv2df3
+#define CODE_FOR_lsx_vfmax_s CODE_FOR_smaxv4sf3
+#define CODE_FOR_lsx_vfmax_d CODE_FOR_smaxv2df3
+#define CODE_FOR_lsx_vfmin_s CODE_FOR_sminv4sf3
+#define CODE_FOR_lsx_vfmin_d CODE_FOR_sminv2df3
+#define CODE_FOR_lsx_vfsqrt_s CODE_FOR_sqrtv4sf2
+#define CODE_FOR_lsx_vfsqrt_d CODE_FOR_sqrtv2df2
+#define CODE_FOR_lsx_vflogb_s CODE_FOR_logbv4sf2
+#define CODE_FOR_lsx_vflogb_d CODE_FOR_logbv2df2
+#define CODE_FOR_lsx_vmax_b CODE_FOR_smaxv16qi3
+#define CODE_FOR_lsx_vmax_h CODE_FOR_smaxv8hi3
+#define CODE_FOR_lsx_vmax_w CODE_FOR_smaxv4si3
+#define CODE_FOR_lsx_vmax_d CODE_FOR_smaxv2di3
+#define CODE_FOR_lsx_vmaxi_b CODE_FOR_smaxv16qi3
+#define CODE_FOR_lsx_vmaxi_h CODE_FOR_smaxv8hi3
+#define CODE_FOR_lsx_vmaxi_w CODE_FOR_smaxv4si3
+#define CODE_FOR_lsx_vmaxi_d CODE_FOR_smaxv2di3
+#define CODE_FOR_lsx_vmax_bu CODE_FOR_umaxv16qi3
+#define CODE_FOR_lsx_vmax_hu CODE_FOR_umaxv8hi3
+#define CODE_FOR_lsx_vmax_wu CODE_FOR_umaxv4si3
+#define CODE_FOR_lsx_vmax_du CODE_FOR_umaxv2di3
+#define CODE_FOR_lsx_vmaxi_bu CODE_FOR_umaxv16qi3
+#define CODE_FOR_lsx_vmaxi_hu CODE_FOR_umaxv8hi3
+#define CODE_FOR_lsx_vmaxi_wu CODE_FOR_umaxv4si3
+#define CODE_FOR_lsx_vmaxi_du CODE_FOR_umaxv2di3
+#define CODE_FOR_lsx_vmin_b CODE_FOR_sminv16qi3
+#define CODE_FOR_lsx_vmin_h CODE_FOR_sminv8hi3
+#define CODE_FOR_lsx_vmin_w CODE_FOR_sminv4si3
+#define CODE_FOR_lsx_vmin_d CODE_FOR_sminv2di3
+#define CODE_FOR_lsx_vmini_b CODE_FOR_sminv16qi3
+#define CODE_FOR_lsx_vmini_h CODE_FOR_sminv8hi3
+#define CODE_FOR_lsx_vmini_w CODE_FOR_sminv4si3
+#define CODE_FOR_lsx_vmini_d CODE_FOR_sminv2di3
+#define CODE_FOR_lsx_vmin_bu CODE_FOR_uminv16qi3
+#define CODE_FOR_lsx_vmin_hu CODE_FOR_uminv8hi3
+#define CODE_FOR_lsx_vmin_wu CODE_FOR_uminv4si3
+#define CODE_FOR_lsx_vmin_du CODE_FOR_uminv2di3
+#define CODE_FOR_lsx_vmini_bu CODE_FOR_uminv16qi3
+#define CODE_FOR_lsx_vmini_hu CODE_FOR_uminv8hi3
+#define CODE_FOR_lsx_vmini_wu CODE_FOR_uminv4si3
+#define CODE_FOR_lsx_vmini_du CODE_FOR_uminv2di3
+#define CODE_FOR_lsx_vmod_b CODE_FOR_modv16qi3
+#define CODE_FOR_lsx_vmod_h CODE_FOR_modv8hi3
+#define CODE_FOR_lsx_vmod_w CODE_FOR_modv4si3
+#define CODE_FOR_lsx_vmod_d CODE_FOR_modv2di3
+#define CODE_FOR_lsx_vmod_bu CODE_FOR_umodv16qi3
+#define CODE_FOR_lsx_vmod_hu CODE_FOR_umodv8hi3
+#define CODE_FOR_lsx_vmod_wu CODE_FOR_umodv4si3
+#define CODE_FOR_lsx_vmod_du CODE_FOR_umodv2di3
+#define CODE_FOR_lsx_vmul_b CODE_FOR_mulv16qi3
+#define CODE_FOR_lsx_vmul_h CODE_FOR_mulv8hi3
+#define CODE_FOR_lsx_vmul_w CODE_FOR_mulv4si3
+#define CODE_FOR_lsx_vmul_d CODE_FOR_mulv2di3
+#define CODE_FOR_lsx_vclz_b CODE_FOR_clzv16qi2
+#define CODE_FOR_lsx_vclz_h CODE_FOR_clzv8hi2
+#define CODE_FOR_lsx_vclz_w CODE_FOR_clzv4si2
+#define CODE_FOR_lsx_vclz_d CODE_FOR_clzv2di2
+#define CODE_FOR_lsx_vnor_v CODE_FOR_lsx_nor_b
+#define CODE_FOR_lsx_vor_v CODE_FOR_iorv16qi3
+#define CODE_FOR_lsx_vori_b CODE_FOR_iorv16qi3
+#define CODE_FOR_lsx_vnori_b CODE_FOR_lsx_nor_b
+#define CODE_FOR_lsx_vpcnt_b CODE_FOR_popcountv16qi2
+#define CODE_FOR_lsx_vpcnt_h CODE_FOR_popcountv8hi2
+#define CODE_FOR_lsx_vpcnt_w CODE_FOR_popcountv4si2
+#define CODE_FOR_lsx_vpcnt_d CODE_FOR_popcountv2di2
+#define CODE_FOR_lsx_vxor_v CODE_FOR_xorv16qi3
+#define CODE_FOR_lsx_vxori_b CODE_FOR_xorv16qi3
+#define CODE_FOR_lsx_vsll_b CODE_FOR_vashlv16qi3
+#define CODE_FOR_lsx_vsll_h CODE_FOR_vashlv8hi3
+#define CODE_FOR_lsx_vsll_w CODE_FOR_vashlv4si3
+#define CODE_FOR_lsx_vsll_d CODE_FOR_vashlv2di3
+#define CODE_FOR_lsx_vslli_b CODE_FOR_vashlv16qi3
+#define CODE_FOR_lsx_vslli_h CODE_FOR_vashlv8hi3
+#define CODE_FOR_lsx_vslli_w CODE_FOR_vashlv4si3
+#define CODE_FOR_lsx_vslli_d CODE_FOR_vashlv2di3
+#define CODE_FOR_lsx_vsra_b CODE_FOR_vashrv16qi3
+#define CODE_FOR_lsx_vsra_h CODE_FOR_vashrv8hi3
+#define CODE_FOR_lsx_vsra_w CODE_FOR_vashrv4si3
+#define CODE_FOR_lsx_vsra_d CODE_FOR_vashrv2di3
+#define CODE_FOR_lsx_vsrai_b CODE_FOR_vashrv16qi3
+#define CODE_FOR_lsx_vsrai_h CODE_FOR_vashrv8hi3
+#define CODE_FOR_lsx_vsrai_w CODE_FOR_vashrv4si3
+#define CODE_FOR_lsx_vsrai_d CODE_FOR_vashrv2di3
+#define CODE_FOR_lsx_vsrl_b CODE_FOR_vlshrv16qi3
+#define CODE_FOR_lsx_vsrl_h CODE_FOR_vlshrv8hi3
+#define CODE_FOR_lsx_vsrl_w CODE_FOR_vlshrv4si3
+#define CODE_FOR_lsx_vsrl_d CODE_FOR_vlshrv2di3
+#define CODE_FOR_lsx_vsrli_b CODE_FOR_vlshrv16qi3
+#define CODE_FOR_lsx_vsrli_h CODE_FOR_vlshrv8hi3
+#define CODE_FOR_lsx_vsrli_w CODE_FOR_vlshrv4si3
+#define CODE_FOR_lsx_vsrli_d CODE_FOR_vlshrv2di3
+#define CODE_FOR_lsx_vsub_b CODE_FOR_subv16qi3
+#define CODE_FOR_lsx_vsub_h CODE_FOR_subv8hi3
+#define CODE_FOR_lsx_vsub_w CODE_FOR_subv4si3
+#define CODE_FOR_lsx_vsub_d CODE_FOR_subv2di3
+#define CODE_FOR_lsx_vsubi_bu CODE_FOR_subv16qi3
+#define CODE_FOR_lsx_vsubi_hu CODE_FOR_subv8hi3
+#define CODE_FOR_lsx_vsubi_wu CODE_FOR_subv4si3
+#define CODE_FOR_lsx_vsubi_du CODE_FOR_subv2di3
+
+#define CODE_FOR_lsx_vpackod_d CODE_FOR_lsx_vilvh_d
+#define CODE_FOR_lsx_vpackev_d CODE_FOR_lsx_vilvl_d
+#define CODE_FOR_lsx_vpickod_d CODE_FOR_lsx_vilvh_d
+#define CODE_FOR_lsx_vpickev_d CODE_FOR_lsx_vilvl_d
+
+#define CODE_FOR_lsx_vrepli_b CODE_FOR_lsx_vrepliv16qi
+#define CODE_FOR_lsx_vrepli_h CODE_FOR_lsx_vrepliv8hi
+#define CODE_FOR_lsx_vrepli_w CODE_FOR_lsx_vrepliv4si
+#define CODE_FOR_lsx_vrepli_d CODE_FOR_lsx_vrepliv2di
+#define CODE_FOR_lsx_vsat_b CODE_FOR_lsx_vsat_s_b
+#define CODE_FOR_lsx_vsat_h CODE_FOR_lsx_vsat_s_h
+#define CODE_FOR_lsx_vsat_w CODE_FOR_lsx_vsat_s_w
+#define CODE_FOR_lsx_vsat_d CODE_FOR_lsx_vsat_s_d
+#define CODE_FOR_lsx_vsat_bu CODE_FOR_lsx_vsat_u_bu
+#define CODE_FOR_lsx_vsat_hu CODE_FOR_lsx_vsat_u_hu
+#define CODE_FOR_lsx_vsat_wu CODE_FOR_lsx_vsat_u_wu
+#define CODE_FOR_lsx_vsat_du CODE_FOR_lsx_vsat_u_du
+#define CODE_FOR_lsx_vavg_b CODE_FOR_lsx_vavg_s_b
+#define CODE_FOR_lsx_vavg_h CODE_FOR_lsx_vavg_s_h
+#define CODE_FOR_lsx_vavg_w CODE_FOR_lsx_vavg_s_w
+#define CODE_FOR_lsx_vavg_d CODE_FOR_lsx_vavg_s_d
+#define CODE_FOR_lsx_vavg_bu CODE_FOR_lsx_vavg_u_bu
+#define CODE_FOR_lsx_vavg_hu CODE_FOR_lsx_vavg_u_hu
+#define CODE_FOR_lsx_vavg_wu CODE_FOR_lsx_vavg_u_wu
+#define CODE_FOR_lsx_vavg_du CODE_FOR_lsx_vavg_u_du
+#define CODE_FOR_lsx_vavgr_b CODE_FOR_lsx_vavgr_s_b
+#define CODE_FOR_lsx_vavgr_h CODE_FOR_lsx_vavgr_s_h
+#define CODE_FOR_lsx_vavgr_w CODE_FOR_lsx_vavgr_s_w
+#define CODE_FOR_lsx_vavgr_d CODE_FOR_lsx_vavgr_s_d
+#define CODE_FOR_lsx_vavgr_bu CODE_FOR_lsx_vavgr_u_bu
+#define CODE_FOR_lsx_vavgr_hu CODE_FOR_lsx_vavgr_u_hu
+#define CODE_FOR_lsx_vavgr_wu CODE_FOR_lsx_vavgr_u_wu
+#define CODE_FOR_lsx_vavgr_du CODE_FOR_lsx_vavgr_u_du
+#define CODE_FOR_lsx_vssub_b CODE_FOR_lsx_vssub_s_b
+#define CODE_FOR_lsx_vssub_h CODE_FOR_lsx_vssub_s_h
+#define CODE_FOR_lsx_vssub_w CODE_FOR_lsx_vssub_s_w
+#define CODE_FOR_lsx_vssub_d CODE_FOR_lsx_vssub_s_d
+#define CODE_FOR_lsx_vssub_bu CODE_FOR_lsx_vssub_u_bu
+#define CODE_FOR_lsx_vssub_hu CODE_FOR_lsx_vssub_u_hu
+#define CODE_FOR_lsx_vssub_wu CODE_FOR_lsx_vssub_u_wu
+#define CODE_FOR_lsx_vssub_du CODE_FOR_lsx_vssub_u_du
+#define CODE_FOR_lsx_vabsd_b CODE_FOR_lsx_vabsd_s_b
+#define CODE_FOR_lsx_vabsd_h CODE_FOR_lsx_vabsd_s_h
+#define CODE_FOR_lsx_vabsd_w CODE_FOR_lsx_vabsd_s_w
+#define CODE_FOR_lsx_vabsd_d CODE_FOR_lsx_vabsd_s_d
+#define CODE_FOR_lsx_vabsd_bu CODE_FOR_lsx_vabsd_u_bu
+#define CODE_FOR_lsx_vabsd_hu CODE_FOR_lsx_vabsd_u_hu
+#define CODE_FOR_lsx_vabsd_wu CODE_FOR_lsx_vabsd_u_wu
+#define CODE_FOR_lsx_vabsd_du CODE_FOR_lsx_vabsd_u_du
+#define CODE_FOR_lsx_vftint_w_s CODE_FOR_lsx_vftint_s_w_s
+#define CODE_FOR_lsx_vftint_l_d CODE_FOR_lsx_vftint_s_l_d
+#define CODE_FOR_lsx_vftint_wu_s CODE_FOR_lsx_vftint_u_wu_s
+#define CODE_FOR_lsx_vftint_lu_d CODE_FOR_lsx_vftint_u_lu_d
+#define CODE_FOR_lsx_vandn_v CODE_FOR_vandnv16qi3
+#define CODE_FOR_lsx_vorn_v CODE_FOR_vornv16qi3
+#define CODE_FOR_lsx_vneg_b CODE_FOR_vnegv16qi2
+#define CODE_FOR_lsx_vneg_h CODE_FOR_vnegv8hi2
+#define CODE_FOR_lsx_vneg_w CODE_FOR_vnegv4si2
+#define CODE_FOR_lsx_vneg_d CODE_FOR_vnegv2di2
+#define CODE_FOR_lsx_vshuf4i_d CODE_FOR_lsx_vshuf4i_d
+#define CODE_FOR_lsx_vbsrl_v CODE_FOR_lsx_vbsrl_b
+#define CODE_FOR_lsx_vbsll_v CODE_FOR_lsx_vbsll_b
+#define CODE_FOR_lsx_vfmadd_s CODE_FOR_fmav4sf4
+#define CODE_FOR_lsx_vfmadd_d CODE_FOR_fmav2df4
+#define CODE_FOR_lsx_vfmsub_s CODE_FOR_fmsv4sf4
+#define CODE_FOR_lsx_vfmsub_d CODE_FOR_fmsv2df4
+#define CODE_FOR_lsx_vfnmadd_s CODE_FOR_vfnmaddv4sf4_nmadd4
+#define CODE_FOR_lsx_vfnmadd_d CODE_FOR_vfnmaddv2df4_nmadd4
+#define CODE_FOR_lsx_vfnmsub_s CODE_FOR_vfnmsubv4sf4_nmsub4
+#define CODE_FOR_lsx_vfnmsub_d CODE_FOR_vfnmsubv2df4_nmsub4
+
+#define CODE_FOR_lsx_vmuh_b CODE_FOR_lsx_vmuh_s_b
+#define CODE_FOR_lsx_vmuh_h CODE_FOR_lsx_vmuh_s_h
+#define CODE_FOR_lsx_vmuh_w CODE_FOR_lsx_vmuh_s_w
+#define CODE_FOR_lsx_vmuh_d CODE_FOR_lsx_vmuh_s_d
+#define CODE_FOR_lsx_vmuh_bu CODE_FOR_lsx_vmuh_u_bu
+#define CODE_FOR_lsx_vmuh_hu CODE_FOR_lsx_vmuh_u_hu
+#define CODE_FOR_lsx_vmuh_wu CODE_FOR_lsx_vmuh_u_wu
+#define CODE_FOR_lsx_vmuh_du CODE_FOR_lsx_vmuh_u_du
+#define CODE_FOR_lsx_vsllwil_h_b CODE_FOR_lsx_vsllwil_s_h_b
+#define CODE_FOR_lsx_vsllwil_w_h CODE_FOR_lsx_vsllwil_s_w_h
+#define CODE_FOR_lsx_vsllwil_d_w CODE_FOR_lsx_vsllwil_s_d_w
+#define CODE_FOR_lsx_vsllwil_hu_bu CODE_FOR_lsx_vsllwil_u_hu_bu
+#define CODE_FOR_lsx_vsllwil_wu_hu CODE_FOR_lsx_vsllwil_u_wu_hu
+#define CODE_FOR_lsx_vsllwil_du_wu CODE_FOR_lsx_vsllwil_u_du_wu
+#define CODE_FOR_lsx_vssran_b_h CODE_FOR_lsx_vssran_s_b_h
+#define CODE_FOR_lsx_vssran_h_w CODE_FOR_lsx_vssran_s_h_w
+#define CODE_FOR_lsx_vssran_w_d CODE_FOR_lsx_vssran_s_w_d
+#define CODE_FOR_lsx_vssran_bu_h CODE_FOR_lsx_vssran_u_bu_h
+#define CODE_FOR_lsx_vssran_hu_w CODE_FOR_lsx_vssran_u_hu_w
+#define CODE_FOR_lsx_vssran_wu_d CODE_FOR_lsx_vssran_u_wu_d
+#define CODE_FOR_lsx_vssrarn_b_h CODE_FOR_lsx_vssrarn_s_b_h
+#define CODE_FOR_lsx_vssrarn_h_w CODE_FOR_lsx_vssrarn_s_h_w
+#define CODE_FOR_lsx_vssrarn_w_d CODE_FOR_lsx_vssrarn_s_w_d
+#define CODE_FOR_lsx_vssrarn_bu_h CODE_FOR_lsx_vssrarn_u_bu_h
+#define CODE_FOR_lsx_vssrarn_hu_w CODE_FOR_lsx_vssrarn_u_hu_w
+#define CODE_FOR_lsx_vssrarn_wu_d CODE_FOR_lsx_vssrarn_u_wu_d
+#define CODE_FOR_lsx_vssrln_bu_h CODE_FOR_lsx_vssrln_u_bu_h
+#define CODE_FOR_lsx_vssrln_hu_w CODE_FOR_lsx_vssrln_u_hu_w
+#define CODE_FOR_lsx_vssrln_wu_d CODE_FOR_lsx_vssrln_u_wu_d
+#define CODE_FOR_lsx_vssrlrn_bu_h CODE_FOR_lsx_vssrlrn_u_bu_h
+#define CODE_FOR_lsx_vssrlrn_hu_w CODE_FOR_lsx_vssrlrn_u_hu_w
+#define CODE_FOR_lsx_vssrlrn_wu_d CODE_FOR_lsx_vssrlrn_u_wu_d
+
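+/* With the aliases above, a table entry such as
+     LSX_BUILTIN (vsll_b, LARCH_V16QI_FTYPE_V16QI_V16QI)
+   ends up using CODE_FOR_vashlv16qi3, the generic vector shift pattern,
+   rather than a dedicated LSX-only expander.  */
+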
 static const struct loongarch_builtin_description loongarch_builtins[] = {
 #define LARCH_MOVFCSR2GR 0
   DIRECT_BUILTIN (movfcsr2gr, LARCH_USI_FTYPE_UQI, hard_float),
@@ -184,6 +489,727 @@ static const struct loongarch_builtin_description loongarch_builtins[] = {
   DIRECT_NO_TARGET_BUILTIN (asrtgt_d, LARCH_VOID_FTYPE_DI_DI, default),
   DIRECT_NO_TARGET_BUILTIN (syscall, LARCH_VOID_FTYPE_USI, default),
   DIRECT_NO_TARGET_BUILTIN (break, LARCH_VOID_FTYPE_USI, default),
+
+  /* Built-in functions for LSX.  */
+  LSX_BUILTIN (vsll_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsll_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsll_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsll_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vslli_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vslli_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vslli_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vslli_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsra_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsra_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsra_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsra_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsrai_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsrai_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsrai_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsrai_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsrar_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsrar_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrar_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrar_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsrari_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsrari_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsrari_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsrari_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsrl_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsrl_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrl_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrl_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsrli_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsrli_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsrli_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsrli_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsrlr_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsrlr_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrlr_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrlr_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsrlri_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsrlri_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsrlri_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsrlri_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vbitclr_b, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vbitclr_h, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vbitclr_w, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vbitclr_d, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vbitclri_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vbitclri_h, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vbitclri_w, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vbitclri_d, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vbitset_b, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vbitset_h, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vbitset_w, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vbitset_d, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vbitseti_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vbitseti_h, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vbitseti_w, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vbitseti_d, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vbitrev_b, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vbitrev_h, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vbitrev_w, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vbitrev_d, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vbitrevi_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vbitrevi_h, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vbitrevi_w, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vbitrevi_d, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vadd_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vadd_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vadd_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vadd_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vaddi_bu, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vaddi_hu, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vaddi_wu, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vaddi_du, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsub_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsub_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsub_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsub_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsubi_bu, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsubi_hu, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsubi_wu, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsubi_du, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vmax_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmax_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmax_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmax_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmaxi_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vmaxi_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vmaxi_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vmaxi_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vmax_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmax_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmax_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmax_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmaxi_bu, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vmaxi_hu, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vmaxi_wu, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vmaxi_du, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vmin_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmin_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmin_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmin_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmini_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vmini_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vmini_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vmini_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vmin_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmin_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmin_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmin_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmini_bu, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vmini_hu, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vmini_wu, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vmini_du, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vseq_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vseq_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vseq_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vseq_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vseqi_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vseqi_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vseqi_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vseqi_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vslti_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vslt_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vslt_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vslt_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vslt_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vslti_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vslti_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vslti_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vslt_bu, LARCH_V16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vslt_hu, LARCH_V8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vslt_wu, LARCH_V4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vslt_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vslti_bu, LARCH_V16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vslti_hu, LARCH_V8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vslti_wu, LARCH_V4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vslti_du, LARCH_V2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vsle_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsle_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsle_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsle_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vslei_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vslei_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vslei_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vslei_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vsle_bu, LARCH_V16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vsle_hu, LARCH_V8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vsle_wu, LARCH_V4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vsle_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vslei_bu, LARCH_V16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vslei_hu, LARCH_V8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vslei_wu, LARCH_V4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vslei_du, LARCH_V2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vsat_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsat_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsat_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsat_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsat_bu, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vsat_hu, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vsat_wu, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vsat_du, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vadda_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vadda_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vadda_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vadda_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsadd_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsadd_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsadd_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsadd_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsadd_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vsadd_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vsadd_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vsadd_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vavg_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vavg_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vavg_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vavg_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vavg_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vavg_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vavg_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vavg_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vavgr_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vavgr_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vavgr_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vavgr_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vavgr_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vavgr_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vavgr_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vavgr_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vssub_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vssub_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssub_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssub_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssub_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vssub_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssub_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssub_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vabsd_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vabsd_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vabsd_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vabsd_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vabsd_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vabsd_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vabsd_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vabsd_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmul_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmul_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmul_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmul_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmadd_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
+  LSX_BUILTIN (vmadd_h, LARCH_V8HI_FTYPE_V8HI_V8HI_V8HI),
+  LSX_BUILTIN (vmadd_w, LARCH_V4SI_FTYPE_V4SI_V4SI_V4SI),
+  LSX_BUILTIN (vmadd_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vmsub_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
+  LSX_BUILTIN (vmsub_h, LARCH_V8HI_FTYPE_V8HI_V8HI_V8HI),
+  LSX_BUILTIN (vmsub_w, LARCH_V4SI_FTYPE_V4SI_V4SI_V4SI),
+  LSX_BUILTIN (vmsub_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vdiv_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vdiv_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vdiv_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vdiv_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vdiv_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vdiv_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vdiv_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vdiv_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vhaddw_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vhaddw_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vhaddw_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vhaddw_hu_bu, LARCH_UV8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vhaddw_wu_hu, LARCH_UV4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vhaddw_du_wu, LARCH_UV2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vhsubw_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vhsubw_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vhsubw_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vhsubw_hu_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vhsubw_wu_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vhsubw_du_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmod_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmod_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmod_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmod_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmod_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmod_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmod_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmod_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vreplve_b, LARCH_V16QI_FTYPE_V16QI_SI),
+  LSX_BUILTIN (vreplve_h, LARCH_V8HI_FTYPE_V8HI_SI),
+  LSX_BUILTIN (vreplve_w, LARCH_V4SI_FTYPE_V4SI_SI),
+  LSX_BUILTIN (vreplve_d, LARCH_V2DI_FTYPE_V2DI_SI),
+  LSX_BUILTIN (vreplvei_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vreplvei_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vreplvei_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vreplvei_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vpickev_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vpickev_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vpickev_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vpickev_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vpickod_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vpickod_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vpickod_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vpickod_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vilvh_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vilvh_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vilvh_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vilvh_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vilvl_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vilvl_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vilvl_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vilvl_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vpackev_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vpackev_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vpackev_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vpackev_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vpackod_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vpackod_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vpackod_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vpackod_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vshuf_h, LARCH_V8HI_FTYPE_V8HI_V8HI_V8HI),
+  LSX_BUILTIN (vshuf_w, LARCH_V4SI_FTYPE_V4SI_V4SI_V4SI),
+  LSX_BUILTIN (vshuf_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vand_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vandi_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vor_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vori_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vnor_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vnori_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vxor_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vxori_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vbitsel_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI_UV16QI),
+  LSX_BUILTIN (vbitseli_b, LARCH_UV16QI_FTYPE_UV16QI_UV16QI_USI),
+  LSX_BUILTIN (vshuf4i_b, LARCH_V16QI_FTYPE_V16QI_USI),
+  LSX_BUILTIN (vshuf4i_h, LARCH_V8HI_FTYPE_V8HI_USI),
+  LSX_BUILTIN (vshuf4i_w, LARCH_V4SI_FTYPE_V4SI_USI),
+  LSX_BUILTIN (vreplgr2vr_b, LARCH_V16QI_FTYPE_SI),
+  LSX_BUILTIN (vreplgr2vr_h, LARCH_V8HI_FTYPE_SI),
+  LSX_BUILTIN (vreplgr2vr_w, LARCH_V4SI_FTYPE_SI),
+  LSX_BUILTIN (vreplgr2vr_d, LARCH_V2DI_FTYPE_DI),
+  LSX_BUILTIN (vpcnt_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vpcnt_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vpcnt_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vpcnt_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vclo_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vclo_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vclo_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vclo_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vclz_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vclz_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vclz_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vclz_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vpickve2gr_b, LARCH_SI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vpickve2gr_h, LARCH_SI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vpickve2gr_w, LARCH_SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vpickve2gr_d, LARCH_DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vpickve2gr_bu, LARCH_USI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vpickve2gr_hu, LARCH_USI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vpickve2gr_wu, LARCH_USI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vpickve2gr_du, LARCH_UDI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vinsgr2vr_b, LARCH_V16QI_FTYPE_V16QI_SI_UQI),
+  LSX_BUILTIN (vinsgr2vr_h, LARCH_V8HI_FTYPE_V8HI_SI_UQI),
+  LSX_BUILTIN (vinsgr2vr_w, LARCH_V4SI_FTYPE_V4SI_SI_UQI),
+  LSX_BUILTIN (vinsgr2vr_d, LARCH_V2DI_FTYPE_V2DI_DI_UQI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_b, LARCH_SI_FTYPE_UV16QI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_h, LARCH_SI_FTYPE_UV8HI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_w, LARCH_SI_FTYPE_UV4SI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_d, LARCH_SI_FTYPE_UV2DI),
+  LSX_BUILTIN_TEST_BRANCH (bz_b, LARCH_SI_FTYPE_UV16QI),
+  LSX_BUILTIN_TEST_BRANCH (bz_h, LARCH_SI_FTYPE_UV8HI),
+  LSX_BUILTIN_TEST_BRANCH (bz_w, LARCH_SI_FTYPE_UV4SI),
+  LSX_BUILTIN_TEST_BRANCH (bz_d, LARCH_SI_FTYPE_UV2DI),
+  LSX_BUILTIN_TEST_BRANCH (bz_v, LARCH_SI_FTYPE_UV16QI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_v, LARCH_SI_FTYPE_UV16QI),
+  LSX_BUILTIN (vrepli_b, LARCH_V16QI_FTYPE_HI),
+  LSX_BUILTIN (vrepli_h, LARCH_V8HI_FTYPE_HI),
+  LSX_BUILTIN (vrepli_w, LARCH_V4SI_FTYPE_HI),
+  LSX_BUILTIN (vrepli_d, LARCH_V2DI_FTYPE_HI),
+  LSX_BUILTIN (vfcmp_caf_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_caf_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cor_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cor_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cun_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cun_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cune_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cune_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cueq_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cueq_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_ceq_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_ceq_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cne_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cne_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_clt_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_clt_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cult_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cult_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cle_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cle_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cule_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cule_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_saf_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_saf_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sor_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sor_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sun_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sun_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sune_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sune_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sueq_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sueq_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_seq_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_seq_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sne_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sne_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_slt_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_slt_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sult_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sult_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sle_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sle_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sule_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sule_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfadd_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfadd_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfsub_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfsub_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmul_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmul_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfdiv_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfdiv_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcvt_h_s, LARCH_V8HI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcvt_s_d, LARCH_V4SF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmin_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmin_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmina_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmina_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmax_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmax_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmaxa_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmaxa_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfclass_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vfclass_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vfsqrt_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfsqrt_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrecip_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrecip_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrint_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrint_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrsqrt_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrsqrt_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vflogb_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vflogb_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfcvth_s_h, LARCH_V4SF_FTYPE_V8HI),
+  LSX_BUILTIN (vfcvth_d_s, LARCH_V2DF_FTYPE_V4SF),
+  LSX_BUILTIN (vfcvtl_s_h, LARCH_V4SF_FTYPE_V8HI),
+  LSX_BUILTIN (vfcvtl_d_s, LARCH_V2DF_FTYPE_V4SF),
+  LSX_BUILTIN (vftint_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftint_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftint_wu_s, LARCH_UV4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftint_lu_d, LARCH_UV2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftintrz_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrz_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftintrz_wu_s, LARCH_UV4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrz_lu_d, LARCH_UV2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vffint_s_w, LARCH_V4SF_FTYPE_V4SI),
+  LSX_BUILTIN (vffint_d_l, LARCH_V2DF_FTYPE_V2DI),
+  LSX_BUILTIN (vffint_s_wu, LARCH_V4SF_FTYPE_UV4SI),
+  LSX_BUILTIN (vffint_d_lu, LARCH_V2DF_FTYPE_UV2DI),
+
+  LSX_BUILTIN (vandn_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vneg_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vneg_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vneg_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vneg_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vmuh_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmuh_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmuh_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmuh_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmuh_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmuh_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmuh_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmuh_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsllwil_h_b, LARCH_V8HI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsllwil_w_h, LARCH_V4SI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsllwil_d_w, LARCH_V2DI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsllwil_hu_bu, LARCH_UV8HI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vsllwil_wu_hu, LARCH_UV4SI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vsllwil_du_wu, LARCH_UV2DI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vsran_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsran_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsran_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssran_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssran_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssran_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssran_bu_h, LARCH_UV16QI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssran_hu_w, LARCH_UV8HI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssran_wu_d, LARCH_UV4SI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsrarn_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrarn_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrarn_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrarn_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssrarn_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssrarn_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrarn_bu_h, LARCH_UV16QI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssrarn_hu_w, LARCH_UV8HI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssrarn_wu_d, LARCH_UV4SI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsrln_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrln_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrln_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrln_bu_h, LARCH_UV16QI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssrln_hu_w, LARCH_UV8HI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssrln_wu_d, LARCH_UV4SI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsrlrn_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrlrn_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrlrn_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrlrn_bu_h, LARCH_UV16QI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssrlrn_hu_w, LARCH_UV8HI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssrlrn_wu_d, LARCH_UV4SI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vfrstpi_b, LARCH_V16QI_FTYPE_V16QI_V16QI_UQI),
+  LSX_BUILTIN (vfrstpi_h, LARCH_V8HI_FTYPE_V8HI_V8HI_UQI),
+  LSX_BUILTIN (vfrstp_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
+  LSX_BUILTIN (vfrstp_h, LARCH_V8HI_FTYPE_V8HI_V8HI_V8HI),
+  LSX_BUILTIN (vshuf4i_d, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vbsrl_v, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vbsll_v, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vextrins_b, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vextrins_h, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vextrins_w, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vextrins_d, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vmskltz_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vmskltz_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vmskltz_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vmskltz_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vsigncov_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsigncov_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsigncov_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsigncov_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vfmadd_s, LARCH_V4SF_FTYPE_V4SF_V4SF_V4SF),
+  LSX_BUILTIN (vfmadd_d, LARCH_V2DF_FTYPE_V2DF_V2DF_V2DF),
+  LSX_BUILTIN (vfmsub_s, LARCH_V4SF_FTYPE_V4SF_V4SF_V4SF),
+  LSX_BUILTIN (vfmsub_d, LARCH_V2DF_FTYPE_V2DF_V2DF_V2DF),
+  LSX_BUILTIN (vfnmadd_s, LARCH_V4SF_FTYPE_V4SF_V4SF_V4SF),
+  LSX_BUILTIN (vfnmadd_d, LARCH_V2DF_FTYPE_V2DF_V2DF_V2DF),
+  LSX_BUILTIN (vfnmsub_s, LARCH_V4SF_FTYPE_V4SF_V4SF_V4SF),
+  LSX_BUILTIN (vfnmsub_d, LARCH_V2DF_FTYPE_V2DF_V2DF_V2DF),
+  LSX_BUILTIN (vftintrne_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrne_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftintrp_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrp_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftintrm_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrm_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftint_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vffint_s_l, LARCH_V4SF_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vftintrz_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vftintrp_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vftintrm_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vftintrne_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vftintl_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftinth_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vffinth_d_w, LARCH_V2DF_FTYPE_V4SI),
+  LSX_BUILTIN (vffintl_d_w, LARCH_V2DF_FTYPE_V4SI),
+  LSX_BUILTIN (vftintrzl_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrzh_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrpl_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrph_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrml_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrmh_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrnel_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrneh_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrne_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrne_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrintrz_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrz_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrintrp_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrp_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrintrm_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrm_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_NO_TARGET_BUILTIN (vstelm_b, LARCH_VOID_FTYPE_V16QI_CVPOINTER_SI_UQI),
+  LSX_NO_TARGET_BUILTIN (vstelm_h, LARCH_VOID_FTYPE_V8HI_CVPOINTER_SI_UQI),
+  LSX_NO_TARGET_BUILTIN (vstelm_w, LARCH_VOID_FTYPE_V4SI_CVPOINTER_SI_UQI),
+  LSX_NO_TARGET_BUILTIN (vstelm_d, LARCH_VOID_FTYPE_V2DI_CVPOINTER_SI_UQI),
+  LSX_BUILTIN (vaddwev_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vaddwev_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vaddwev_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vaddwod_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vaddwod_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vaddwod_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vaddwev_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vaddwev_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vaddwev_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vaddwod_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vaddwod_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vaddwod_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vaddwev_d_wu_w, LARCH_V2DI_FTYPE_UV4SI_V4SI),
+  LSX_BUILTIN (vaddwev_w_hu_h, LARCH_V4SI_FTYPE_UV8HI_V8HI),
+  LSX_BUILTIN (vaddwev_h_bu_b, LARCH_V8HI_FTYPE_UV16QI_V16QI),
+  LSX_BUILTIN (vaddwod_d_wu_w, LARCH_V2DI_FTYPE_UV4SI_V4SI),
+  LSX_BUILTIN (vaddwod_w_hu_h, LARCH_V4SI_FTYPE_UV8HI_V8HI),
+  LSX_BUILTIN (vaddwod_h_bu_b, LARCH_V8HI_FTYPE_UV16QI_V16QI),
+  LSX_BUILTIN (vsubwev_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsubwev_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsubwev_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsubwod_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsubwod_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsubwod_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsubwev_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vsubwev_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vsubwev_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vsubwod_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vsubwod_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vsubwod_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vaddwev_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vaddwod_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vaddwev_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vaddwod_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsubwev_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsubwod_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsubwev_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsubwod_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vaddwev_q_du_d, LARCH_V2DI_FTYPE_UV2DI_V2DI),
+  LSX_BUILTIN (vaddwod_q_du_d, LARCH_V2DI_FTYPE_UV2DI_V2DI),
+
+  LSX_BUILTIN (vmulwev_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmulwev_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmulwev_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmulwod_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmulwod_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmulwod_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmulwev_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmulwev_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmulwev_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmulwod_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmulwod_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmulwod_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmulwev_d_wu_w, LARCH_V2DI_FTYPE_UV4SI_V4SI),
+  LSX_BUILTIN (vmulwev_w_hu_h, LARCH_V4SI_FTYPE_UV8HI_V8HI),
+  LSX_BUILTIN (vmulwev_h_bu_b, LARCH_V8HI_FTYPE_UV16QI_V16QI),
+  LSX_BUILTIN (vmulwod_d_wu_w, LARCH_V2DI_FTYPE_UV4SI_V4SI),
+  LSX_BUILTIN (vmulwod_w_hu_h, LARCH_V4SI_FTYPE_UV8HI_V8HI),
+  LSX_BUILTIN (vmulwod_h_bu_b, LARCH_V8HI_FTYPE_UV16QI_V16QI),
+  LSX_BUILTIN (vmulwev_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmulwod_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmulwev_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmulwod_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmulwev_q_du_d, LARCH_V2DI_FTYPE_UV2DI_V2DI),
+  LSX_BUILTIN (vmulwod_q_du_d, LARCH_V2DI_FTYPE_UV2DI_V2DI),
+  LSX_BUILTIN (vhaddw_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vhaddw_qu_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vhsubw_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vhsubw_qu_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmaddwev_d_w, LARCH_V2DI_FTYPE_V2DI_V4SI_V4SI),
+  LSX_BUILTIN (vmaddwev_w_h, LARCH_V4SI_FTYPE_V4SI_V8HI_V8HI),
+  LSX_BUILTIN (vmaddwev_h_b, LARCH_V8HI_FTYPE_V8HI_V16QI_V16QI),
+  LSX_BUILTIN (vmaddwev_d_wu, LARCH_UV2DI_FTYPE_UV2DI_UV4SI_UV4SI),
+  LSX_BUILTIN (vmaddwev_w_hu, LARCH_UV4SI_FTYPE_UV4SI_UV8HI_UV8HI),
+  LSX_BUILTIN (vmaddwev_h_bu, LARCH_UV8HI_FTYPE_UV8HI_UV16QI_UV16QI),
+  LSX_BUILTIN (vmaddwod_d_w, LARCH_V2DI_FTYPE_V2DI_V4SI_V4SI),
+  LSX_BUILTIN (vmaddwod_w_h, LARCH_V4SI_FTYPE_V4SI_V8HI_V8HI),
+  LSX_BUILTIN (vmaddwod_h_b, LARCH_V8HI_FTYPE_V8HI_V16QI_V16QI),
+  LSX_BUILTIN (vmaddwod_d_wu, LARCH_UV2DI_FTYPE_UV2DI_UV4SI_UV4SI),
+  LSX_BUILTIN (vmaddwod_w_hu, LARCH_UV4SI_FTYPE_UV4SI_UV8HI_UV8HI),
+  LSX_BUILTIN (vmaddwod_h_bu, LARCH_UV8HI_FTYPE_UV8HI_UV16QI_UV16QI),
+  LSX_BUILTIN (vmaddwev_d_wu_w, LARCH_V2DI_FTYPE_V2DI_UV4SI_V4SI),
+  LSX_BUILTIN (vmaddwev_w_hu_h, LARCH_V4SI_FTYPE_V4SI_UV8HI_V8HI),
+  LSX_BUILTIN (vmaddwev_h_bu_b, LARCH_V8HI_FTYPE_V8HI_UV16QI_V16QI),
+  LSX_BUILTIN (vmaddwod_d_wu_w, LARCH_V2DI_FTYPE_V2DI_UV4SI_V4SI),
+  LSX_BUILTIN (vmaddwod_w_hu_h, LARCH_V4SI_FTYPE_V4SI_UV8HI_V8HI),
+  LSX_BUILTIN (vmaddwod_h_bu_b, LARCH_V8HI_FTYPE_V8HI_UV16QI_V16QI),
+  LSX_BUILTIN (vmaddwev_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vmaddwod_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vmaddwev_q_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI_UV2DI),
+  LSX_BUILTIN (vmaddwod_q_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI_UV2DI),
+  LSX_BUILTIN (vmaddwev_q_du_d, LARCH_V2DI_FTYPE_V2DI_UV2DI_V2DI),
+  LSX_BUILTIN (vmaddwod_q_du_d, LARCH_V2DI_FTYPE_V2DI_UV2DI_V2DI),
+  LSX_BUILTIN (vrotr_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vrotr_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vrotr_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vrotr_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vadd_q, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsub_q, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vldrepl_b, LARCH_V16QI_FTYPE_CVPOINTER_SI),
+  LSX_BUILTIN (vldrepl_h, LARCH_V8HI_FTYPE_CVPOINTER_SI),
+  LSX_BUILTIN (vldrepl_w, LARCH_V4SI_FTYPE_CVPOINTER_SI),
+  LSX_BUILTIN (vldrepl_d, LARCH_V2DI_FTYPE_CVPOINTER_SI),
+
+  LSX_BUILTIN (vmskgez_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vmsknz_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vexth_h_b, LARCH_V8HI_FTYPE_V16QI),
+  LSX_BUILTIN (vexth_w_h, LARCH_V4SI_FTYPE_V8HI),
+  LSX_BUILTIN (vexth_d_w, LARCH_V2DI_FTYPE_V4SI),
+  LSX_BUILTIN (vexth_q_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vexth_hu_bu, LARCH_UV8HI_FTYPE_UV16QI),
+  LSX_BUILTIN (vexth_wu_hu, LARCH_UV4SI_FTYPE_UV8HI),
+  LSX_BUILTIN (vexth_du_wu, LARCH_UV2DI_FTYPE_UV4SI),
+  LSX_BUILTIN (vexth_qu_du, LARCH_UV2DI_FTYPE_UV2DI),
+  LSX_BUILTIN (vrotri_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vrotri_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vrotri_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vrotri_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vextl_q_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vsrlni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vsrlni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vsrlni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vsrlni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vsrlrni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vsrlrni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vsrlrni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vsrlrni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrlni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vssrlni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vssrlni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vssrlni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrlni_bu_h, LARCH_UV16QI_FTYPE_UV16QI_V16QI_USI),
+  LSX_BUILTIN (vssrlni_hu_w, LARCH_UV8HI_FTYPE_UV8HI_V8HI_USI),
+  LSX_BUILTIN (vssrlni_wu_d, LARCH_UV4SI_FTYPE_UV4SI_V4SI_USI),
+  LSX_BUILTIN (vssrlni_du_q, LARCH_UV2DI_FTYPE_UV2DI_V2DI_USI),
+  LSX_BUILTIN (vssrlrni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vssrlrni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vssrlrni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vssrlrni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrlrni_bu_h, LARCH_UV16QI_FTYPE_UV16QI_V16QI_USI),
+  LSX_BUILTIN (vssrlrni_hu_w, LARCH_UV8HI_FTYPE_UV8HI_V8HI_USI),
+  LSX_BUILTIN (vssrlrni_wu_d, LARCH_UV4SI_FTYPE_UV4SI_V4SI_USI),
+  LSX_BUILTIN (vssrlrni_du_q, LARCH_UV2DI_FTYPE_UV2DI_V2DI_USI),
+  LSX_BUILTIN (vsrani_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vsrani_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vsrani_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vsrani_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vsrarni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vsrarni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vsrarni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vsrarni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrani_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vssrani_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vssrani_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vssrani_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrani_bu_h, LARCH_UV16QI_FTYPE_UV16QI_V16QI_USI),
+  LSX_BUILTIN (vssrani_hu_w, LARCH_UV8HI_FTYPE_UV8HI_V8HI_USI),
+  LSX_BUILTIN (vssrani_wu_d, LARCH_UV4SI_FTYPE_UV4SI_V4SI_USI),
+  LSX_BUILTIN (vssrani_du_q, LARCH_UV2DI_FTYPE_UV2DI_V2DI_USI),
+  LSX_BUILTIN (vssrarni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vssrarni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vssrarni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vssrarni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrarni_bu_h, LARCH_UV16QI_FTYPE_UV16QI_V16QI_USI),
+  LSX_BUILTIN (vssrarni_hu_w, LARCH_UV8HI_FTYPE_UV8HI_V8HI_USI),
+  LSX_BUILTIN (vssrarni_wu_d, LARCH_UV4SI_FTYPE_UV4SI_V4SI_USI),
+  LSX_BUILTIN (vssrarni_du_q, LARCH_UV2DI_FTYPE_UV2DI_V2DI_USI),
+  LSX_BUILTIN (vpermi_w, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vld, LARCH_V16QI_FTYPE_CVPOINTER_SI),
+  LSX_NO_TARGET_BUILTIN (vst, LARCH_VOID_FTYPE_V16QI_CVPOINTER_SI),
+  LSX_BUILTIN (vssrlrn_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssrlrn_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssrlrn_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrln_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssrln_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssrln_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vorn_v, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vldi, LARCH_V2DI_FTYPE_HI),
+  LSX_BUILTIN (vshuf_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
+  LSX_BUILTIN (vldx, LARCH_V16QI_FTYPE_CVPOINTER_DI),
+  LSX_NO_TARGET_BUILTIN (vstx, LARCH_VOID_FTYPE_V16QI_CVPOINTER_DI),
+  LSX_BUILTIN (vextl_qu_du, LARCH_UV2DI_FTYPE_UV2DI)
 };
 
 /* Index I is the function declaration for loongarch_builtins[I], or null if
@@ -193,11 +1219,46 @@ static GTY (()) tree loongarch_builtin_decls[ARRAY_SIZE (loongarch_builtins)];
    using the instruction code or return null if not defined for the target.  */
 static GTY (()) int loongarch_get_builtin_decl_index[NUM_INSN_CODES];
 
+
+/* MODE is a vector mode whose elements have type TYPE.  Return the type
+   of the vector itself.  */
+
+static tree
+loongarch_builtin_vector_type (tree type, machine_mode mode)
+{
+  static tree types[2 * (int) MAX_MACHINE_MODE];
+  int mode_index;
+
+  mode_index = (int) mode;
+
+  if (TREE_CODE (type) == INTEGER_TYPE && TYPE_UNSIGNED (type))
+    mode_index += MAX_MACHINE_MODE;
+
+  if (types[mode_index] == NULL_TREE)
+    types[mode_index] = build_vector_type_for_mode (type, mode);
+  return types[mode_index];
+}
+
+/* Return a type for 'const volatile void *'.  */
+
+static tree
+loongarch_build_cvpointer_type (void)
+{
+  static tree cache;
+
+  if (cache == NULL_TREE)
+    cache = build_pointer_type (build_qualified_type (void_type_node,
+						      TYPE_QUAL_CONST
+						      | TYPE_QUAL_VOLATILE));
+  return cache;
+}
+
 /* Source-level argument types.  */
 #define LARCH_ATYPE_VOID void_type_node
 #define LARCH_ATYPE_INT integer_type_node
 #define LARCH_ATYPE_POINTER ptr_type_node
-
+#define LARCH_ATYPE_CVPOINTER loongarch_build_cvpointer_type ()
+#define LARCH_ATYPE_BOOLEAN boolean_type_node
 /* Standard mode-based argument types.  */
 #define LARCH_ATYPE_QI intQI_type_node
 #define LARCH_ATYPE_UQI unsigned_intQI_type_node
@@ -210,6 +1271,72 @@ static GTY (()) int loongarch_get_builtin_decl_index[NUM_INSN_CODES];
 #define LARCH_ATYPE_SF float_type_node
 #define LARCH_ATYPE_DF double_type_node
 
+/* Vector argument types.  */
+#define LARCH_ATYPE_V2SF						\
+  loongarch_builtin_vector_type (float_type_node, V2SFmode)
+#define LARCH_ATYPE_V2HI						\
+  loongarch_builtin_vector_type (intHI_type_node, V2HImode)
+#define LARCH_ATYPE_V2SI						\
+  loongarch_builtin_vector_type (intSI_type_node, V2SImode)
+#define LARCH_ATYPE_V4QI						\
+  loongarch_builtin_vector_type (intQI_type_node, V4QImode)
+#define LARCH_ATYPE_V4HI						\
+  loongarch_builtin_vector_type (intHI_type_node, V4HImode)
+#define LARCH_ATYPE_V8QI						\
+  loongarch_builtin_vector_type (intQI_type_node, V8QImode)
+
+#define LARCH_ATYPE_V2DI						\
+  loongarch_builtin_vector_type (long_long_integer_type_node, V2DImode)
+#define LARCH_ATYPE_V4SI						\
+  loongarch_builtin_vector_type (intSI_type_node, V4SImode)
+#define LARCH_ATYPE_V8HI						\
+  loongarch_builtin_vector_type (intHI_type_node, V8HImode)
+#define LARCH_ATYPE_V16QI						\
+  loongarch_builtin_vector_type (intQI_type_node, V16QImode)
+#define LARCH_ATYPE_V2DF						\
+  loongarch_builtin_vector_type (double_type_node, V2DFmode)
+#define LARCH_ATYPE_V4SF						\
+  loongarch_builtin_vector_type (float_type_node, V4SFmode)
+
+/* LoongArch ASX.  */
+#define LARCH_ATYPE_V4DI						\
+  loongarch_builtin_vector_type (long_long_integer_type_node, V4DImode)
+#define LARCH_ATYPE_V8SI						\
+  loongarch_builtin_vector_type (intSI_type_node, V8SImode)
+#define LARCH_ATYPE_V16HI						\
+  loongarch_builtin_vector_type (intHI_type_node, V16HImode)
+#define LARCH_ATYPE_V32QI						\
+  loongarch_builtin_vector_type (intQI_type_node, V32QImode)
+#define LARCH_ATYPE_V4DF						\
+  loongarch_builtin_vector_type (double_type_node, V4DFmode)
+#define LARCH_ATYPE_V8SF						\
+  loongarch_builtin_vector_type (float_type_node, V8SFmode)
+
+#define LARCH_ATYPE_UV2DI					\
+  loongarch_builtin_vector_type (long_long_unsigned_type_node, V2DImode)
+#define LARCH_ATYPE_UV4SI					\
+  loongarch_builtin_vector_type (unsigned_intSI_type_node, V4SImode)
+#define LARCH_ATYPE_UV8HI					\
+  loongarch_builtin_vector_type (unsigned_intHI_type_node, V8HImode)
+#define LARCH_ATYPE_UV16QI					\
+  loongarch_builtin_vector_type (unsigned_intQI_type_node, V16QImode)
+
+#define LARCH_ATYPE_UV4DI					\
+  loongarch_builtin_vector_type (long_long_unsigned_type_node, V4DImode)
+#define LARCH_ATYPE_UV8SI					\
+  loongarch_builtin_vector_type (unsigned_intSI_type_node, V8SImode)
+#define LARCH_ATYPE_UV16HI					\
+  loongarch_builtin_vector_type (unsigned_intHI_type_node, V16HImode)
+#define LARCH_ATYPE_UV32QI					\
+  loongarch_builtin_vector_type (unsigned_intQI_type_node, V32QImode)
+
+#define LARCH_ATYPE_UV2SI					\
+  loongarch_builtin_vector_type (unsigned_intSI_type_node, V2SImode)
+#define LARCH_ATYPE_UV4HI					\
+  loongarch_builtin_vector_type (unsigned_intHI_type_node, V4HImode)
+#define LARCH_ATYPE_UV8QI					\
+  loongarch_builtin_vector_type (unsigned_intQI_type_node, V8QImode)
+
 /* LARCH_FTYPE_ATYPESN takes N LARCH_FTYPES-like type codes and lists
    their associated LARCH_ATYPEs.  */
 #define LARCH_FTYPE_ATYPES1(A, B) LARCH_ATYPE_##A, LARCH_ATYPE_##B
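For reference, each LARCH_<RET>_FTYPE_<ARGS> name used in the builtin table pairs
up with a DEF_LARCH_FTYPE entry in loongarch-ftypes.def, and the LARCH_FTYPE_ATYPESn
macros above turn those type codes into the tree nodes that build the prototype.
A minimal user-level sketch of what that wiring exposes (assuming the usual -mlsx
option; the vector typedef is only for the example):

  /* Sketch: LARCH_V4SI_FTYPE_V4SI_V4SI, i.e. DEF_LARCH_FTYPE (2, (V4SI, V4SI, V4SI)),
     gives __builtin_lsx_vadd_w a "v4i32 (v4i32, v4i32)" prototype.  */
  typedef int v4i32 __attribute__ ((vector_size (16)));

  v4i32
  add_w (v4i32 a, v4i32 b)
  {
    return __builtin_lsx_vadd_w (a, b);
  }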
@@ -283,6 +1410,92 @@ loongarch_builtin_decl (unsigned int code, bool initialize_p ATTRIBUTE_UNUSED)
   return loongarch_builtin_decls[code];
 }
 
+/* Implement TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.  */
+
+tree
+loongarch_builtin_vectorized_function (unsigned int fn, tree type_out,
+				       tree type_in)
+{
+  machine_mode in_mode, out_mode;
+  int in_n, out_n;
+
+  if (TREE_CODE (type_out) != VECTOR_TYPE
+      || TREE_CODE (type_in) != VECTOR_TYPE
+      || !ISA_HAS_LSX)
+    return NULL_TREE;
+
+  out_mode = TYPE_MODE (TREE_TYPE (type_out));
+  out_n = TYPE_VECTOR_SUBPARTS (type_out);
+  in_mode = TYPE_MODE (TREE_TYPE (type_in));
+  in_n = TYPE_VECTOR_SUBPARTS (type_in);
+
+  /* INSN is the name of the associated instruction pattern, without
+     the leading CODE_FOR_.  */
+#define LARCH_GET_BUILTIN(INSN) \
+  loongarch_builtin_decls[loongarch_get_builtin_decl_index[CODE_FOR_##INSN]]
+
+  switch (fn)
+    {
+    CASE_CFN_CEIL:
+      if (out_mode == DFmode && in_mode == DFmode)
+	{
+	  if (out_n == 2 && in_n == 2)
+	    return LARCH_GET_BUILTIN (lsx_vfrintrp_d);
+	}
+      if (out_mode == SFmode && in_mode == SFmode)
+	{
+	  if (out_n == 4 && in_n == 4)
+	    return LARCH_GET_BUILTIN (lsx_vfrintrp_s);
+	}
+      break;
+
+    CASE_CFN_TRUNC:
+      if (out_mode == DFmode && in_mode == DFmode)
+	{
+	  if (out_n == 2 && in_n == 2)
+	    return LARCH_GET_BUILTIN (lsx_vfrintrz_d);
+	}
+      if (out_mode == SFmode && in_mode == SFmode)
+	{
+	  if (out_n == 4 && in_n == 4)
+	    return LARCH_GET_BUILTIN (lsx_vfrintrz_s);
+	}
+      break;
+
+    CASE_CFN_RINT:
+    CASE_CFN_ROUND:
+      if (out_mode == DFmode && in_mode == DFmode)
+	{
+	  if (out_n == 2 && in_n == 2)
+	    return LARCH_GET_BUILTIN (lsx_vfrint_d);
+	}
+      if (out_mode == SFmode && in_mode == SFmode)
+	{
+	  if (out_n == 4 && in_n == 4)
+	    return LARCH_GET_BUILTIN (lsx_vfrint_s);
+	}
+      break;
+
+    CASE_CFN_FLOOR:
+      if (out_mode == DFmode && in_mode == DFmode)
+	{
+	  if (out_n == 2 && in_n == 2)
+	    return LARCH_GET_BUILTIN (lsx_vfrintrm_d);
+	}
+      if (out_mode == SFmode && in_mode == SFmode)
+	{
+	  if (out_n == 4 && in_n == 4)
+	    return LARCH_GET_BUILTIN (lsx_vfrintrm_s);
+	}
+      break;
+
+    default:
+      break;
+    }
+
+  return NULL_TREE;
+}
+
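As a concrete illustration of the hook above: a plain ceil loop over doubles can be
auto-vectorized and the calls replaced by the vfrintrp.d builtin that the hook
returns (a sketch; the option set is assumed to be something like -O3 -mlsx):

  /* Sketch: with -O3 -mlsx the vectorizer queries the hook above for
     CFN_CEIL on V2DF and gets lsx_vfrintrp_d back.  */
  #include <math.h>

  void
  vceil (double *restrict dst, const double *restrict src, int n)
  {
    for (int i = 0; i < n; i++)
      dst[i] = ceil (src[i]);
  }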
 /* Take argument ARGNO from EXP's argument list and convert it into
    an expand operand.  Store the operand in *OP.  */
 
@@ -318,7 +1531,236 @@ static rtx
 loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
 			       struct expand_operand *ops, bool has_target_p)
 {
-  if (!maybe_expand_insn (icode, nops, ops))
+  machine_mode imode;
+  int rangelo = 0, rangehi = 0, error_opno = 0;
+
+  switch (icode)
+    {
+    case CODE_FOR_lsx_vaddi_bu:
+    case CODE_FOR_lsx_vaddi_hu:
+    case CODE_FOR_lsx_vaddi_wu:
+    case CODE_FOR_lsx_vaddi_du:
+    case CODE_FOR_lsx_vslti_bu:
+    case CODE_FOR_lsx_vslti_hu:
+    case CODE_FOR_lsx_vslti_wu:
+    case CODE_FOR_lsx_vslti_du:
+    case CODE_FOR_lsx_vslei_bu:
+    case CODE_FOR_lsx_vslei_hu:
+    case CODE_FOR_lsx_vslei_wu:
+    case CODE_FOR_lsx_vslei_du:
+    case CODE_FOR_lsx_vmaxi_bu:
+    case CODE_FOR_lsx_vmaxi_hu:
+    case CODE_FOR_lsx_vmaxi_wu:
+    case CODE_FOR_lsx_vmaxi_du:
+    case CODE_FOR_lsx_vmini_bu:
+    case CODE_FOR_lsx_vmini_hu:
+    case CODE_FOR_lsx_vmini_wu:
+    case CODE_FOR_lsx_vmini_du:
+    case CODE_FOR_lsx_vsubi_bu:
+    case CODE_FOR_lsx_vsubi_hu:
+    case CODE_FOR_lsx_vsubi_wu:
+    case CODE_FOR_lsx_vsubi_du:
+      gcc_assert (has_target_p && nops == 3);
+      /* We only generate a vector of constants iff the second argument
+	 is an immediate.  We also validate the range of the immediate.  */
+      if (CONST_INT_P (ops[2].value))
+	{
+	  rangelo = 0;
+	  rangehi = 31;
+	  if (IN_RANGE (INTVAL (ops[2].value), rangelo, rangehi))
+	    {
+	      ops[2].mode = ops[0].mode;
+	      ops[2].value = loongarch_gen_const_int_vector (ops[2].mode,
+							     INTVAL (ops[2].value));
+	    }
+	  else
+	    error_opno = 2;
+	}
+      break;
+
+    case CODE_FOR_lsx_vseqi_b:
+    case CODE_FOR_lsx_vseqi_h:
+    case CODE_FOR_lsx_vseqi_w:
+    case CODE_FOR_lsx_vseqi_d:
+    case CODE_FOR_lsx_vslti_b:
+    case CODE_FOR_lsx_vslti_h:
+    case CODE_FOR_lsx_vslti_w:
+    case CODE_FOR_lsx_vslti_d:
+    case CODE_FOR_lsx_vslei_b:
+    case CODE_FOR_lsx_vslei_h:
+    case CODE_FOR_lsx_vslei_w:
+    case CODE_FOR_lsx_vslei_d:
+    case CODE_FOR_lsx_vmaxi_b:
+    case CODE_FOR_lsx_vmaxi_h:
+    case CODE_FOR_lsx_vmaxi_w:
+    case CODE_FOR_lsx_vmaxi_d:
+    case CODE_FOR_lsx_vmini_b:
+    case CODE_FOR_lsx_vmini_h:
+    case CODE_FOR_lsx_vmini_w:
+    case CODE_FOR_lsx_vmini_d:
+      gcc_assert (has_target_p && nops == 3);
+      /* We only generate a vector of constants iff the second argument
+	 is an immediate.  We also validate the range of the immediate.  */
+      if (CONST_INT_P (ops[2].value))
+	{
+	  rangelo = -16;
+	  rangehi = 15;
+	  if (IN_RANGE (INTVAL (ops[2].value), rangelo, rangehi))
+	    {
+	      ops[2].mode = ops[0].mode;
+	      ops[2].value = loongarch_gen_const_int_vector (ops[2].mode,
+							     INTVAL (ops[2].value));
+	    }
+	  else
+	    error_opno = 2;
+	}
+      break;
+
+    case CODE_FOR_lsx_vandi_b:
+    case CODE_FOR_lsx_vori_b:
+    case CODE_FOR_lsx_vnori_b:
+    case CODE_FOR_lsx_vxori_b:
+      gcc_assert (has_target_p && nops == 3);
+      if (!CONST_INT_P (ops[2].value))
+	break;
+      ops[2].mode = ops[0].mode;
+      ops[2].value = loongarch_gen_const_int_vector (ops[2].mode,
+						     INTVAL (ops[2].value));
+      break;
+
+    case CODE_FOR_lsx_vbitseli_b:
+      gcc_assert (has_target_p && nops == 4);
+      if (!CONST_INT_P (ops[3].value))
+	break;
+      ops[3].mode = ops[0].mode;
+      ops[3].value = loongarch_gen_const_int_vector (ops[3].mode,
+						     INTVAL (ops[3].value));
+      break;
+
+    case CODE_FOR_lsx_vreplgr2vr_b:
+    case CODE_FOR_lsx_vreplgr2vr_h:
+    case CODE_FOR_lsx_vreplgr2vr_w:
+    case CODE_FOR_lsx_vreplgr2vr_d:
+      /* Map the built-ins to vector fill operations.  We need to fix up
+	 the mode of the scalar element being broadcast.  */
+      gcc_assert (has_target_p && nops == 2);
+      imode = GET_MODE_INNER (ops[0].mode);
+      ops[1].value = lowpart_subreg (imode, ops[1].value, ops[1].mode);
+      ops[1].mode = imode;
+      break;
+
+    case CODE_FOR_lsx_vilvh_b:
+    case CODE_FOR_lsx_vilvh_h:
+    case CODE_FOR_lsx_vilvh_w:
+    case CODE_FOR_lsx_vilvh_d:
+    case CODE_FOR_lsx_vilvl_b:
+    case CODE_FOR_lsx_vilvl_h:
+    case CODE_FOR_lsx_vilvl_w:
+    case CODE_FOR_lsx_vilvl_d:
+    case CODE_FOR_lsx_vpackev_b:
+    case CODE_FOR_lsx_vpackev_h:
+    case CODE_FOR_lsx_vpackev_w:
+    case CODE_FOR_lsx_vpackod_b:
+    case CODE_FOR_lsx_vpackod_h:
+    case CODE_FOR_lsx_vpackod_w:
+    case CODE_FOR_lsx_vpickev_b:
+    case CODE_FOR_lsx_vpickev_h:
+    case CODE_FOR_lsx_vpickev_w:
+    case CODE_FOR_lsx_vpickod_b:
+    case CODE_FOR_lsx_vpickod_h:
+    case CODE_FOR_lsx_vpickod_w:
+      /* Swap operands 1 and 2 for interleave operations.  The built-ins
+	 follow the ISA convention, where op1 is the higher component and
+	 op2 the lower one.  However, VEC_PERM in trees and vec_concat in
+	 RTL expect the first operand to be the lower component, hence
+	 this swap is needed for the built-ins.  */
+      gcc_assert (has_target_p && nops == 3);
+      std::swap (ops[1], ops[2]);
+      break;
+
+    case CODE_FOR_lsx_vslli_b:
+    case CODE_FOR_lsx_vslli_h:
+    case CODE_FOR_lsx_vslli_w:
+    case CODE_FOR_lsx_vslli_d:
+    case CODE_FOR_lsx_vsrai_b:
+    case CODE_FOR_lsx_vsrai_h:
+    case CODE_FOR_lsx_vsrai_w:
+    case CODE_FOR_lsx_vsrai_d:
+    case CODE_FOR_lsx_vsrli_b:
+    case CODE_FOR_lsx_vsrli_h:
+    case CODE_FOR_lsx_vsrli_w:
+    case CODE_FOR_lsx_vsrli_d:
+      gcc_assert (has_target_p && nops == 3);
+      if (CONST_INT_P (ops[2].value))
+	{
+	  rangelo = 0;
+	  rangehi = GET_MODE_UNIT_BITSIZE (ops[0].mode) - 1;
+	  if (IN_RANGE (INTVAL (ops[2].value), rangelo, rangehi))
+	    {
+	      ops[2].mode = ops[0].mode;
+	      ops[2].value = loongarch_gen_const_int_vector (ops[2].mode,
+							     INTVAL (ops[2].value));
+	    }
+	  else
+	    error_opno = 2;
+	}
+      break;
+
+    case CODE_FOR_lsx_vinsgr2vr_b:
+    case CODE_FOR_lsx_vinsgr2vr_h:
+    case CODE_FOR_lsx_vinsgr2vr_w:
+    case CODE_FOR_lsx_vinsgr2vr_d:
+      /* Map the built-ins to insert operations.  We need to swap operands,
+	 fix up the mode for the element being inserted, and generate
+	 a bit mask for vec_merge.  */
+      gcc_assert (has_target_p && nops == 4);
+      std::swap (ops[1], ops[2]);
+      imode = GET_MODE_INNER (ops[0].mode);
+      ops[1].value = lowpart_subreg (imode, ops[1].value, ops[1].mode);
+      ops[1].mode = imode;
+      rangelo = 0;
+      rangehi = GET_MODE_NUNITS (ops[0].mode) - 1;
+      if (CONST_INT_P (ops[3].value)
+	  && IN_RANGE (INTVAL (ops[3].value), rangelo, rangehi))
+	ops[3].value = GEN_INT (1 << INTVAL (ops[3].value));
+      else
+	error_opno = 2;
+      break;
+
+    case CODE_FOR_lsx_vshuf4i_b:
+    case CODE_FOR_lsx_vshuf4i_h:
+    case CODE_FOR_lsx_vshuf4i_w:
+    case CODE_FOR_lsx_vshuf4i_w_f:
+      gcc_assert (has_target_p && nops == 3);
+      ops[2].value = loongarch_gen_const_int_vector_shuffle (ops[0].mode,
+							     INTVAL (ops[2].value));
+      break;
+
+    default:
+      break;
+  }
+
+  if (error_opno != 0)
+    {
+      error ("argument %d to the built-in must be a constant"
+	     " in range %d to %d", error_opno, rangelo, rangehi);
+      return has_target_p ? gen_reg_rtx (ops[0].mode) : const0_rtx;
+    }
+  else if (!maybe_expand_insn (icode, nops, ops))
     {
       error ("invalid argument to built-in function");
       return has_target_p ? gen_reg_rtx (ops[0].mode) : const0_rtx;
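For reference, the range checks above are what turn a bad immediate into a
compile-time diagnostic instead of an ICE.  A minimal sketch (ui5 operand, so the
accepted range is 0 to 31, matching the handling above; -mlsx assumed):

  /* Sketch: the immediate must be a constant in 0..31; using 32 instead
     is rejected with the "argument 2 to the built-in must be a constant
     in range 0 to 31" error emitted above.  */
  typedef unsigned char v16u8 __attribute__ ((vector_size (16)));

  v16u8
  addi_bu (v16u8 a)
  {
    return __builtin_lsx_vaddi_bu (a, 31);
  }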
@@ -352,6 +1794,50 @@ loongarch_expand_builtin_direct (enum insn_code icode, rtx target, tree exp,
   return loongarch_expand_builtin_insn (icode, opno, ops, has_target_p);
 }
 
+/* Expand an LSX built-in for a compare-and-branch instruction specified by
+   ICODE, setting a general-purpose register to 1 if the branch was taken
+   and to 0 otherwise.  */
+
+static rtx
+loongarch_expand_builtin_lsx_test_branch (enum insn_code icode, tree exp)
+{
+  struct expand_operand ops[3];
+  rtx_insn *cbranch;
+  rtx_code_label *true_label, *done_label;
+  rtx cmp_result;
+
+  true_label = gen_label_rtx ();
+  done_label = gen_label_rtx ();
+
+  create_input_operand (&ops[0], true_label, TYPE_MODE (TREE_TYPE (exp)));
+  loongarch_prepare_builtin_arg (&ops[1], exp, 0);
+  create_fixed_operand (&ops[2], const0_rtx);
+
+  /* Make sure that operand 1 is a REG.  */
+  if (GET_CODE (ops[1].value) != REG)
+    ops[1].value = force_reg (ops[1].mode, ops[1].value);
+
+  if ((cbranch = maybe_gen_insn (icode, 3, ops)) == NULL_RTX)
+    error ("failed to expand built-in function");
+
+  cmp_result = gen_reg_rtx (SImode);
+
+  /* First assume that CMP_RESULT is false.  */
+  loongarch_emit_move (cmp_result, const0_rtx);
+
+  /* Branch to TRUE_LABEL if CBRANCH is taken and DONE_LABEL otherwise.  */
+  emit_jump_insn (cbranch);
+  emit_jump_insn (gen_jump (done_label));
+  emit_barrier ();
+
+  /* Set CMP_RESULT to true if the branch was taken.  */
+  emit_label (true_label);
+  loongarch_emit_move (cmp_result, const1_rtx);
+
+  emit_label (done_label);
+  return cmp_result;
+}
+
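As a usage sketch for the expansion above: the LARCH_BUILTIN_LSX_TEST_BRANCH
builtins (the bz/bnz family registered earlier in the table) come back to C as a
plain 0/1 int, so they can sit directly in a condition (builtin name assumed from
the usual LSX naming; -mlsx assumed):

  /* Sketch: __builtin_lsx_bz_v goes through the test-branch path above
     and yields 1 when every byte of the 128-bit vector is zero.  */
  typedef unsigned char v16u8 __attribute__ ((vector_size (16)));

  int
  all_zero_p (v16u8 a)
  {
    return __builtin_lsx_bz_v (a);
  }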
 /* Implement TARGET_EXPAND_BUILTIN.  */
 
 rtx
@@ -372,10 +1858,14 @@ loongarch_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
   switch (d->builtin_type)
     {
     case LARCH_BUILTIN_DIRECT:
+    case LARCH_BUILTIN_LSX:
       return loongarch_expand_builtin_direct (d->icode, target, exp, true);
 
     case LARCH_BUILTIN_DIRECT_NO_TARGET:
       return loongarch_expand_builtin_direct (d->icode, target, exp, false);
+
+    case LARCH_BUILTIN_LSX_TEST_BRANCH:
+      return loongarch_expand_builtin_lsx_test_branch (d->icode, exp);
     }
   gcc_unreachable ();
 }
diff --git a/gcc/config/loongarch/loongarch-ftypes.def b/gcc/config/loongarch/loongarch-ftypes.def
index 06d2e0519f7..1ce9d83ccab 100644
--- a/gcc/config/loongarch/loongarch-ftypes.def
+++ b/gcc/config/loongarch/loongarch-ftypes.def
@@ -1,7 +1,7 @@
 /* Definitions of prototypes for LoongArch built-in functions.
    Copyright (C) 2021-2023 Free Software Foundation, Inc.
    Contributed by Loongson Ltd.
-   Based on MIPS target for GNU ckompiler.
+   Based on MIPS target for GNU compiler.
 
 This file is part of GCC.
 
@@ -32,7 +32,7 @@ along with GCC; see the file COPYING3.  If not see
       INT for integer_type_node
       POINTER for ptr_type_node
 
-   (we don't use PTR because that's a ANSI-compatibillity macro).
+   (we don't use PTR because that's an ANSI-compatibility macro).
 
    Please keep this list lexicographically sorted by the LIST argument.  */
 
@@ -63,3 +63,396 @@ DEF_LARCH_FTYPE (3, (VOID, USI, USI, SI))
 DEF_LARCH_FTYPE (3, (VOID, USI, UDI, SI))
 DEF_LARCH_FTYPE (3, (USI, USI, USI, USI))
 DEF_LARCH_FTYPE (3, (UDI, UDI, UDI, USI))
+
+DEF_LARCH_FTYPE (1, (DF, DF))
+DEF_LARCH_FTYPE (2, (DF, DF, DF))
+DEF_LARCH_FTYPE (1, (DF, V2DF))
+
+DEF_LARCH_FTYPE (1, (DI, DI))
+DEF_LARCH_FTYPE (1, (DI, SI))
+DEF_LARCH_FTYPE (1, (DI, UQI))
+DEF_LARCH_FTYPE (2, (DI, DI, DI))
+DEF_LARCH_FTYPE (2, (DI, DI, SI))
+DEF_LARCH_FTYPE (3, (DI, DI, SI, SI))
+DEF_LARCH_FTYPE (3, (DI, DI, USI, USI))
+DEF_LARCH_FTYPE (3, (DI, DI, DI, QI))
+DEF_LARCH_FTYPE (3, (DI, DI, V2HI, V2HI))
+DEF_LARCH_FTYPE (3, (DI, DI, V4QI, V4QI))
+DEF_LARCH_FTYPE (2, (DI, POINTER, SI))
+DEF_LARCH_FTYPE (2, (DI, SI, SI))
+DEF_LARCH_FTYPE (2, (DI, USI, USI))
+
+DEF_LARCH_FTYPE (2, (DI, V2DI, UQI))
+
+DEF_LARCH_FTYPE (2, (INT, DF, DF))
+DEF_LARCH_FTYPE (2, (INT, SF, SF))
+
+DEF_LARCH_FTYPE (2, (INT, V2SF, V2SF))
+DEF_LARCH_FTYPE (4, (INT, V2SF, V2SF, V2SF, V2SF))
+
+DEF_LARCH_FTYPE (1, (SF, SF))
+DEF_LARCH_FTYPE (2, (SF, SF, SF))
+DEF_LARCH_FTYPE (1, (SF, V2SF))
+DEF_LARCH_FTYPE (1, (SF, V4SF))
+
+DEF_LARCH_FTYPE (2, (SI, POINTER, SI))
+DEF_LARCH_FTYPE (1, (SI, SI))
+DEF_LARCH_FTYPE (1, (SI, UDI))
+DEF_LARCH_FTYPE (2, (QI, QI, QI))
+DEF_LARCH_FTYPE (2, (HI, HI, HI))
+DEF_LARCH_FTYPE (3, (SI, SI, SI, SI))
+DEF_LARCH_FTYPE (3, (SI, SI, SI, QI))
+DEF_LARCH_FTYPE (1, (SI, UQI))
+DEF_LARCH_FTYPE (1, (SI, UV16QI))
+DEF_LARCH_FTYPE (1, (SI, UV2DI))
+DEF_LARCH_FTYPE (1, (SI, UV4SI))
+DEF_LARCH_FTYPE (1, (SI, UV8HI))
+DEF_LARCH_FTYPE (2, (SI, V16QI, UQI))
+DEF_LARCH_FTYPE (1, (SI, V2HI))
+DEF_LARCH_FTYPE (2, (SI, V2HI, V2HI))
+DEF_LARCH_FTYPE (1, (SI, V4QI))
+DEF_LARCH_FTYPE (2, (SI, V4QI, V4QI))
+DEF_LARCH_FTYPE (2, (SI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (SI, V8HI, UQI))
+DEF_LARCH_FTYPE (1, (SI, VOID))
+
+DEF_LARCH_FTYPE (2, (UDI, UDI, UDI))
+DEF_LARCH_FTYPE (2, (UDI, UV2SI, UV2SI))
+DEF_LARCH_FTYPE (2, (UDI, V2DI, UQI))
+
+DEF_LARCH_FTYPE (2, (USI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (USI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (USI, V8HI, UQI))
+DEF_LARCH_FTYPE (1, (USI, VOID))
+
+DEF_LARCH_FTYPE (2, (UV16QI, UV16QI, UQI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV16QI, USI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (3, (UV16QI, UV16QI, UV16QI, UQI))
+DEF_LARCH_FTYPE (3, (UV16QI, UV16QI, UV16QI, USI))
+DEF_LARCH_FTYPE (3, (UV16QI, UV16QI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV16QI, V16QI))
+
+DEF_LARCH_FTYPE (2, (UV2DI, UV2DI, UQI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, UV2DI, UQI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV2DI, V2DI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (1, (UV2DI, V2DF))
+
+DEF_LARCH_FTYPE (2, (UV2SI, UV2SI, UQI))
+DEF_LARCH_FTYPE (2, (UV2SI, UV2SI, UV2SI))
+
+DEF_LARCH_FTYPE (2, (UV4HI, UV4HI, UQI))
+DEF_LARCH_FTYPE (2, (UV4HI, UV4HI, USI))
+DEF_LARCH_FTYPE (2, (UV4HI, UV4HI, UV4HI))
+DEF_LARCH_FTYPE (3, (UV4HI, UV4HI, UV4HI, UQI))
+DEF_LARCH_FTYPE (3, (UV4HI, UV4HI, UV4HI, USI))
+DEF_LARCH_FTYPE (1, (UV4HI, UV8QI))
+DEF_LARCH_FTYPE (2, (UV4HI, UV8QI, UV8QI))
+
+DEF_LARCH_FTYPE (2, (UV4SI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV4SI, UQI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV4SI, V4SI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (1, (UV4SI, V4SF))
+
+DEF_LARCH_FTYPE (2, (UV8HI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV8HI, UQI))
+DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, UV8HI, UQI))
+DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV8HI, V8HI))
+
+DEF_LARCH_FTYPE (2, (UV8QI, UV4HI, UV4HI))
+DEF_LARCH_FTYPE (1, (UV8QI, UV8QI))
+DEF_LARCH_FTYPE (2, (UV8QI, UV8QI, UV8QI))
+
+DEF_LARCH_FTYPE (2, (V16QI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (2, (V16QI, CVPOINTER, DI))
+DEF_LARCH_FTYPE (1, (V16QI, HI))
+DEF_LARCH_FTYPE (1, (V16QI, SI))
+DEF_LARCH_FTYPE (2, (V16QI, UV16QI, UQI))
+DEF_LARCH_FTYPE (2, (V16QI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (1, (V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, QI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, SI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, USI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, UQI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, UQI, SI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, UQI, V16QI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, V16QI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, SI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, UQI))
+DEF_LARCH_FTYPE (4, (V16QI, V16QI, V16QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, USI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, V16QI))
+
+DEF_LARCH_FTYPE (1, (V2DF, DF))
+DEF_LARCH_FTYPE (1, (V2DF, UV2DI))
+DEF_LARCH_FTYPE (1, (V2DF, V2DF))
+DEF_LARCH_FTYPE (2, (V2DF, V2DF, V2DF))
+DEF_LARCH_FTYPE (3, (V2DF, V2DF, V2DF, V2DF))
+DEF_LARCH_FTYPE (2, (V2DF, V2DF, V2DI))
+DEF_LARCH_FTYPE (1, (V2DF, V2DI))
+DEF_LARCH_FTYPE (1, (V2DF, V4SF))
+DEF_LARCH_FTYPE (1, (V2DF, V4SI))
+
+DEF_LARCH_FTYPE (2, (V2DI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V2DI, DI))
+DEF_LARCH_FTYPE (1, (V2DI, HI))
+DEF_LARCH_FTYPE (2, (V2DI, UV2DI, UQI))
+DEF_LARCH_FTYPE (2, (V2DI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (V2DI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (1, (V2DI, V2DF))
+DEF_LARCH_FTYPE (2, (V2DI, V2DF, V2DF))
+DEF_LARCH_FTYPE (1, (V2DI, V2DI))
+DEF_LARCH_FTYPE (1, (UV2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, QI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, SI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, UQI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, USI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UQI, DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UQI, V2DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, V2DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, SI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, USI))
+DEF_LARCH_FTYPE (4, (V2DI, V2DI, V2DI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, V2DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V2DI, V4SI, V4SI))
+
+DEF_LARCH_FTYPE (1, (V2HI, SI))
+DEF_LARCH_FTYPE (2, (V2HI, SI, SI))
+DEF_LARCH_FTYPE (3, (V2HI, SI, SI, SI))
+DEF_LARCH_FTYPE (1, (V2HI, V2HI))
+DEF_LARCH_FTYPE (2, (V2HI, V2HI, SI))
+DEF_LARCH_FTYPE (2, (V2HI, V2HI, V2HI))
+DEF_LARCH_FTYPE (1, (V2HI, V4QI))
+DEF_LARCH_FTYPE (2, (V2HI, V4QI, V2HI))
+
+DEF_LARCH_FTYPE (2, (V2SF, SF, SF))
+DEF_LARCH_FTYPE (1, (V2SF, V2SF))
+DEF_LARCH_FTYPE (2, (V2SF, V2SF, V2SF))
+DEF_LARCH_FTYPE (3, (V2SF, V2SF, V2SF, INT))
+DEF_LARCH_FTYPE (4, (V2SF, V2SF, V2SF, V2SF, V2SF))
+
+DEF_LARCH_FTYPE (2, (V2SI, V2SI, UQI))
+DEF_LARCH_FTYPE (2, (V2SI, V2SI, V2SI))
+DEF_LARCH_FTYPE (2, (V2SI, V4HI, V4HI))
+
+DEF_LARCH_FTYPE (2, (V4HI, V2SI, V2SI))
+DEF_LARCH_FTYPE (2, (V4HI, V4HI, UQI))
+DEF_LARCH_FTYPE (2, (V4HI, V4HI, USI))
+DEF_LARCH_FTYPE (2, (V4HI, V4HI, V4HI))
+DEF_LARCH_FTYPE (3, (V4HI, V4HI, V4HI, UQI))
+DEF_LARCH_FTYPE (3, (V4HI, V4HI, V4HI, USI))
+
+DEF_LARCH_FTYPE (1, (V4QI, SI))
+DEF_LARCH_FTYPE (2, (V4QI, V2HI, V2HI))
+DEF_LARCH_FTYPE (1, (V4QI, V4QI))
+DEF_LARCH_FTYPE (2, (V4QI, V4QI, SI))
+DEF_LARCH_FTYPE (2, (V4QI, V4QI, V4QI))
+
+DEF_LARCH_FTYPE (1, (V4SF, SF))
+DEF_LARCH_FTYPE (1, (V4SF, UV4SI))
+DEF_LARCH_FTYPE (2, (V4SF, V2DF, V2DF))
+DEF_LARCH_FTYPE (1, (V4SF, V4SF))
+DEF_LARCH_FTYPE (2, (V4SF, V4SF, V4SF))
+DEF_LARCH_FTYPE (3, (V4SF, V4SF, V4SF, V4SF))
+DEF_LARCH_FTYPE (2, (V4SF, V4SF, V4SI))
+DEF_LARCH_FTYPE (1, (V4SF, V4SI))
+DEF_LARCH_FTYPE (1, (V4SF, V8HI))
+
+DEF_LARCH_FTYPE (2, (V4SI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V4SI, HI))
+DEF_LARCH_FTYPE (1, (V4SI, SI))
+DEF_LARCH_FTYPE (2, (V4SI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (V4SI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (V4SI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V4SI, V2DF, V2DF))
+DEF_LARCH_FTYPE (1, (V4SI, V4SF))
+DEF_LARCH_FTYPE (2, (V4SI, V4SF, V4SF))
+DEF_LARCH_FTYPE (1, (V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, QI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, SI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, USI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UQI, SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UQI, V4SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, V4SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, UQI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, USI))
+DEF_LARCH_FTYPE (4, (V4SI, V4SI, V4SI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, V4SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V4SI, V8HI, V8HI))
+
+DEF_LARCH_FTYPE (2, (V8HI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V8HI, HI))
+DEF_LARCH_FTYPE (1, (V8HI, SI))
+DEF_LARCH_FTYPE (2, (V8HI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (V8HI, UV8HI, UQI))
+DEF_LARCH_FTYPE (2, (V8HI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V8HI, V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V8HI, V4SF, V4SF))
+DEF_LARCH_FTYPE (1, (V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, QI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, SI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, SI, UQI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, USI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UQI, SI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UQI, V8HI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, V8HI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, SI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, UQI))
+DEF_LARCH_FTYPE (4, (V8HI, V8HI, V8HI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, USI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, V8HI))
+
+DEF_LARCH_FTYPE (2, (V8QI, V4HI, V4HI))
+DEF_LARCH_FTYPE (1, (V8QI, V8QI))
+DEF_LARCH_FTYPE (2, (V8QI, V8QI, V8QI))
+
+DEF_LARCH_FTYPE (2, (VOID, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (VOID, SI, SI))
+DEF_LARCH_FTYPE (2, (VOID, UQI, SI))
+DEF_LARCH_FTYPE (2, (VOID, USI, UQI))
+DEF_LARCH_FTYPE (1, (VOID, UHI))
+DEF_LARCH_FTYPE (3, (VOID, V16QI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V16QI, CVPOINTER, DI))
+DEF_LARCH_FTYPE (3, (VOID, V2DF, POINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V2DI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (2, (VOID, V2HI, V2HI))
+DEF_LARCH_FTYPE (2, (VOID, V4QI, V4QI))
+DEF_LARCH_FTYPE (3, (VOID, V4SF, POINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V4SI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V8HI, CVPOINTER, SI))
+
+DEF_LARCH_FTYPE (1, (V8HI, V16QI))
+DEF_LARCH_FTYPE (1, (V4SI, V16QI))
+DEF_LARCH_FTYPE (1, (V2DI, V16QI))
+DEF_LARCH_FTYPE (1, (V4SI, V8HI))
+DEF_LARCH_FTYPE (1, (V2DI, V8HI))
+DEF_LARCH_FTYPE (1, (V2DI, V4SI))
+DEF_LARCH_FTYPE (1, (UV8HI, V16QI))
+DEF_LARCH_FTYPE (1, (UV4SI, V16QI))
+DEF_LARCH_FTYPE (1, (UV2DI, V16QI))
+DEF_LARCH_FTYPE (1, (UV4SI, V8HI))
+DEF_LARCH_FTYPE (1, (UV2DI, V8HI))
+DEF_LARCH_FTYPE (1, (UV2DI, V4SI))
+DEF_LARCH_FTYPE (1, (UV8HI, UV16QI))
+DEF_LARCH_FTYPE (1, (UV4SI, UV16QI))
+DEF_LARCH_FTYPE (1, (UV2DI, UV16QI))
+DEF_LARCH_FTYPE (1, (UV4SI, UV8HI))
+DEF_LARCH_FTYPE (1, (UV2DI, UV8HI))
+DEF_LARCH_FTYPE (1, (UV2DI, UV4SI))
+DEF_LARCH_FTYPE (2, (UV8HI, V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (UV4SI, V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (UV2DI, V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V8HI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (V4SI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (V2DI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV16QI, UQI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV8HI, UQI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (V16QI, V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V8HI, V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V4SI, V2DI, V2DI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (V16QI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (V8HI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (V4SI, V2DI, UQI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV8HI, UQI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV2DI, UQI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, DI))
+DEF_LARCH_FTYPE (2, (V16QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UQI, UQI))
+DEF_LARCH_FTYPE (2, (V4SF, V2DI, V2DI))
+DEF_LARCH_FTYPE (1, (V2DI, V4SF))
+DEF_LARCH_FTYPE (2, (V2DI, UQI, USI))
+DEF_LARCH_FTYPE (2, (V2DI, UQI, UQI))
+DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V16QI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V8HI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V4SI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V2DI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V16QI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V8HI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V4SI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V2DI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V8HI, UV16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (UV16QI, V16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (UV8HI, V8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (UV4SI, V4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (V4SI, V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V4SI, UV16QI, V16QI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V4SI, UV8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V2DI, UV4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V2DI, UV2DI, V2DI))
+DEF_LARCH_FTYPE (2, (V2DI, V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V2DI, UV8HI, V8HI))
+DEF_LARCH_FTYPE (2, (UV2DI, V2DI, UV2DI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UV8HI, V8HI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UV2DI, V2DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UV4SI, V4SI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V8HI, V8HI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UV8HI, V8HI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UV16QI, V16QI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V16QI, V16QI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UV16QI, V16QI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV16QI, UV16QI))
+
+DEF_LARCH_FTYPE (4, (VOID, V16QI, CVPOINTER, SI, UQI))
+DEF_LARCH_FTYPE (4, (VOID, V8HI, CVPOINTER, SI, UQI))
+DEF_LARCH_FTYPE (4, (VOID, V4SI, CVPOINTER, SI, UQI))
+DEF_LARCH_FTYPE (4, (VOID, V2DI, CVPOINTER, SI, UQI))
+
+DEF_LARCH_FTYPE (2, (DI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (DI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (DI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (UDI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (UDI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (UDI, V4SI, UQI))
+
+DEF_LARCH_FTYPE (3, (UV16QI, UV16QI, V16QI, USI))
+DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, V8HI, USI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, V4SI, USI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, V2DI, USI))
+
+DEF_LARCH_FTYPE (1, (BOOLEAN, V16QI))
+DEF_LARCH_FTYPE (2, (V16QI, CVPOINTER, CVPOINTER))
+DEF_LARCH_FTYPE (3, (VOID, V16QI, CVPOINTER, CVPOINTER))
+
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, SI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, SI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, DI, UQI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, SI, UQI))
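To tie these signatures back to the builtin table: CVPOINTER is the
'const volatile void *' type built by loongarch_build_cvpointer_type, so vld/vst
(V16QI_FTYPE_CVPOINTER_SI and VOID_FTYPE_V16QI_CVPOINTER_SI above) take any object
pointer plus a constant byte displacement.  A minimal sketch (-mlsx assumed):

  /* Sketch: the displacement operand of vld/vst is a constant (si12 in
     the ISA), here 0 and 16.  */
  typedef signed char v16i8 __attribute__ ((vector_size (16)));

  void
  copy32 (void *dst, const void *src)
  {
    v16i8 lo = __builtin_lsx_vld (src, 0);
    v16i8 hi = __builtin_lsx_vld (src, 16);
    __builtin_lsx_vst (lo, dst, 0);
    __builtin_lsx_vst (hi, dst, 16);
  }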
diff --git a/gcc/config/loongarch/lsxintrin.h b/gcc/config/loongarch/lsxintrin.h
new file mode 100644
index 00000000000..ec42069904d
--- /dev/null
+++ b/gcc/config/loongarch/lsxintrin.h
@@ -0,0 +1,5181 @@
+/* LARCH Loongson SX intrinsics include file.
+
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _GCC_LOONGSON_SXINTRIN_H
+#define _GCC_LOONGSON_SXINTRIN_H 1
+
+#if defined(__loongarch_sx)
+typedef signed char v16i8 __attribute__ ((vector_size(16), aligned(16)));
+typedef signed char v16i8_b __attribute__ ((vector_size(16), aligned(1)));
+typedef unsigned char v16u8 __attribute__ ((vector_size(16), aligned(16)));
+typedef unsigned char v16u8_b __attribute__ ((vector_size(16), aligned(1)));
+typedef short v8i16 __attribute__ ((vector_size(16), aligned(16)));
+typedef short v8i16_h __attribute__ ((vector_size(16), aligned(2)));
+typedef unsigned short v8u16 __attribute__ ((vector_size(16), aligned(16)));
+typedef unsigned short v8u16_h __attribute__ ((vector_size(16), aligned(2)));
+typedef int v4i32 __attribute__ ((vector_size(16), aligned(16)));
+typedef int v4i32_w __attribute__ ((vector_size(16), aligned(4)));
+typedef unsigned int v4u32 __attribute__ ((vector_size(16), aligned(16)));
+typedef unsigned int v4u32_w __attribute__ ((vector_size(16), aligned(4)));
+typedef long long v2i64 __attribute__ ((vector_size(16), aligned(16)));
+typedef long long v2i64_d __attribute__ ((vector_size(16), aligned(8)));
+typedef unsigned long long v2u64 __attribute__ ((vector_size(16), aligned(16)));
+typedef unsigned long long v2u64_d __attribute__ ((vector_size(16), aligned(8)));
+typedef float v4f32 __attribute__ ((vector_size(16), aligned(16)));
+typedef float v4f32_w __attribute__ ((vector_size(16), aligned(4)));
+typedef double v2f64 __attribute__ ((vector_size(16), aligned(16)));
+typedef double v2f64_d __attribute__ ((vector_size(16), aligned(8)));
+
+typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
+typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
+typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
+
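For completeness, a short usage sketch of the intrinsic style defined below: each
wrapper just casts the generic __m128i to the per-element vector type and calls the
matching __builtin_lsx_* builtin (compile with -mlsx and include this header):

  /* Sketch: shift each 32-bit lane of A left by the amount in the
     matching lane of B, then by a further 2 bits via the immediate form.  */
  #include <lsxintrin.h>

  __m128i
  shift_demo (__m128i a, __m128i b)
  {
    __m128i t = __lsx_vsll_w (a, b);   /* vsll.w: per-lane shift.  */
    return __lsx_vslli_w (t, 2);       /* vslli.w: immediate shift (ui5).  */
  }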
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsll_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsll_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsll_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsll_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsll_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsll_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsll_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsll_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vslli_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vslli_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vslli_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vslli_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vslli_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslli_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vslli_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vslli_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsra_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsra_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsra_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsra_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsra_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsra_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsra_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsra_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsrai_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsrai_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsrai_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsrai_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsrai_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsrai_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsrai_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsrai_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrar_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrar_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrar_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrar_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrar_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrar_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrar_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrar_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsrari_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsrari_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsrari_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsrari_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsrari_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsrari_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsrari_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsrari_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrl_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrl_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrl_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrl_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrl_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrl_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrl_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrl_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsrli_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsrli_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsrli_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsrli_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsrli_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsrli_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsrli_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsrli_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlr_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlr_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlr_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlr_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlr_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlr_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlr_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlr_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsrlri_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsrlri_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsrlri_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsrlri_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsrlri_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsrlri_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsrlri_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsrlri_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitclr_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitclr_b ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitclr_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitclr_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitclr_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitclr_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitclr_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitclr_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vbitclri_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vbitclri_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vbitclri_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vbitclri_h ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vbitclri_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbitclri_w ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vbitclri_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vbitclri_d ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitset_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitset_b ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitset_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitset_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitset_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitset_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitset_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitset_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vbitseti_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vbitseti_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vbitseti_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vbitseti_h ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vbitseti_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbitseti_w ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vbitseti_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vbitseti_d ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitrev_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitrev_b ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitrev_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitrev_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitrev_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitrev_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitrev_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitrev_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vbitrevi_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vbitrevi_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vbitrevi_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vbitrevi_h ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vbitrevi_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbitrevi_w ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vbitrevi_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vbitrevi_d ((v2u64)(_1), (_2)))
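+
+/* Usage sketch (illustrative only; relies solely on the intrinsics defined
+   above): the vbitclr/vbitset/vbitrev families operate lane-wise on the bit
+   selected by the last operand, e.g.
+
+     static __m128i
+     clear_then_flip (__m128i v)
+     {
+       __m128i t = __lsx_vbitclri_b (v, 3);   // clear bit 3 in every byte
+       return __lsx_vbitrevi_b (t, 0);        // toggle bit 0 in every byte
+     }
+*/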
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vaddi_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vaddi_bu ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vaddi_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vaddi_hu ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vaddi_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vaddi_wu ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vaddi_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vaddi_du ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsubi_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsubi_bu ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsubi_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsubi_hu ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsubi_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsubi_wu ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsubi_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsubi_du ((v2i64)(_1), (_2)))
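+
+/* Usage sketch (illustrative only): vadd/vsub work lane-wise per element
+   width, while the vaddi/vsubi forms take a 5-bit unsigned immediate, e.g.
+
+     static __m128i
+     sum_minus_one (__m128i a, __m128i b)
+     {
+       __m128i s = __lsx_vadd_w (a, b);       // 32-bit lane-wise add
+       return __lsx_vsubi_wu (s, 1);          // subtract 1 from every lane
+     }
+*/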
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vmaxi_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vmaxi_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vmaxi_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vmaxi_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vmaxi_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vmaxi_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vmaxi_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vmaxi_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_du ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vmini_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vmini_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vmini_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vmini_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vmini_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vmini_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vmini_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vmini_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_du ((v2u64)(_1), (_2)))
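+
+/* Usage sketch (illustrative only): combining vmaxi/vmini gives a lane-wise
+   clamp, e.g. restricting every signed byte to [-8, 7]:
+
+     static __m128i
+     clamp_nibble_range (__m128i v)
+     {
+       __m128i lo = __lsx_vmaxi_b (v, -8);    // lower bound per byte
+       return __lsx_vmini_b (lo, 7);          // upper bound per byte
+     }
+*/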
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vseq_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vseq_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vseq_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vseq_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vseq_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vseq_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vseq_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vseq_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vseqi_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vseqi_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vseqi_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vseqi_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vseqi_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vseqi_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vseqi_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vseqi_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vslti_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vslti_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vslti_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vslti_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, UV16QI, UQI.  */
+#define __lsx_vslti_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, UV8HI, UQI.  */
+#define __lsx_vslti_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, UV4SI, UQI.  */
+#define __lsx_vslti_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UQI.  */
+#define __lsx_vslti_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_du ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vslei_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vslei_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vslei_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vslei_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, UV16QI, UQI.  */
+#define __lsx_vslei_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, UV8HI, UQI.  */
+#define __lsx_vslei_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, UV4SI, UQI.  */
+#define __lsx_vslei_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UQI.  */
+#define __lsx_vslei_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_du ((v2u64)(_1), (_2)))
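+
+/* Usage sketch (illustrative only, assuming the usual LSX comparison
+   semantics of all-ones for true and all-zeros for false in each lane):
+
+     static __m128i
+     zero_byte_mask (__m128i v)
+     {
+       return __lsx_vseqi_b (v, 0);           // mask of bytes equal to zero
+     }
+*/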
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsat_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsat_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsat_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsat_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vsat_bu(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vsat_hu(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vsat_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vsat_du(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_du ((v2u64)(_1), (_2)))
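+
+/* Usage sketch (illustrative only, assuming vsat_h saturates each signed
+   halfword to the (imm + 1)-bit signed range):
+
+     static __m128i
+     saturate_to_s8_range (__m128i v)
+     {
+       return __lsx_vsat_h (v, 7);            // clamp halfwords to [-128, 127]
+     }
+*/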
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadda_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadda_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadda_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadda_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadda_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadda_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadda_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadda_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_du ((v2u64)_1, (v2u64)_2);
+}
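+
+/* Usage sketch (illustrative only): unlike vadd, the vsadd forms saturate
+   instead of wrapping on overflow, e.g. for unsigned bytes:
+
+     static __m128i
+     saturating_add_u8 (__m128i a, __m128i b)
+     {
+       return __lsx_vsadd_bu (a, b);          // sticks at 255 on overflow
+     }
+*/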
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_du ((v2u64)_1, (v2u64)_2);
+}
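+
+/* Usage sketch (illustrative only, assuming vavg truncates the halved sum
+   while vavgr rounds it):
+
+     static __m128i
+     rounded_average_u8 (__m128i a, __m128i b)
+     {
+       return __lsx_vavgr_bu (a, b);          // rounded average per byte
+     }
+*/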
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_du ((v2u64)_1, (v2u64)_2);
+}
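+
+/* Usage sketch (illustrative only): vabsd returns the lane-wise absolute
+   difference, a common building block for sum-of-absolute-differences:
+
+     static __m128i
+     byte_abs_diff (__m128i a, __m128i b)
+     {
+       return __lsx_vabsd_bu (a, b);          // |a - b| per unsigned byte
+     }
+*/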
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmul_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmul_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmul_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmul_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmul_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmul_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmul_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmul_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmadd_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmadd_b ((v16i8)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmadd_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmadd_h ((v8i16)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmadd_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmadd_w ((v4i32)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmadd_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmadd_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsub_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmsub_b ((v16i8)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsub_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmsub_h ((v8i16)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsub_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmsub_w ((v4i32)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsub_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmsub_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
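+
+/* Usage sketch (illustrative only): the first operand of vmadd/vmsub is the
+   accumulator, so a multiply-accumulate step looks like:
+
+     static __m128i
+     mac_step_w (__m128i acc, __m128i a, __m128i b)
+     {
+       return __lsx_vmadd_w (acc, a, b);      // acc + a * b per 32-bit lane
+     }
+*/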
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_hu_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_hu_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_wu_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_wu_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_du_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_du_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_hu_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_hu_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_wu_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_wu_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_du_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_du_wu ((v4u32)_1, (v4u32)_2);
+}
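+
+/* Usage sketch (illustrative only, assuming vhaddw pairs odd lanes of the
+   first operand with even lanes of the second): passing the same vector
+   twice yields widened sums of adjacent elements, a useful reduction step:
+
+     static __m128i
+     pairwise_widen_sum_b (__m128i v)
+     {
+       return __lsx_vhaddw_h_b (v, v);        // 8 halfword sums of byte pairs
+     }
+*/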
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_du ((v2u64)_1, (v2u64)_2);
+}
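+
+/* Usage sketch (illustrative only): vdiv and vmod give the lane-wise
+   quotient and remainder, e.g.
+
+     static __m128i
+     remainder_w (__m128i a, __m128i b)
+     {
+       return __lsx_vmod_w (a, b);            // signed 32-bit remainder
+     }
+*/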
+
+/* Assembly instruction format:	vd, vj, rk.  */
+/* Data types in instruction templates:  V16QI, V16QI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplve_b (__m128i _1, int _2)
+{
+  return (__m128i)__builtin_lsx_vreplve_b ((v16i8)_1, (int)_2);
+}
+
+/* Assembly instruction format:	vd, vj, rk.  */
+/* Data types in instruction templates:  V8HI, V8HI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplve_h (__m128i _1, int _2)
+{
+  return (__m128i)__builtin_lsx_vreplve_h ((v8i16)_1, (int)_2);
+}
+
+/* Assembly instruction format:	vd, vj, rk.  */
+/* Data types in instruction templates:  V4SI, V4SI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplve_w (__m128i _1, int _2)
+{
+  return (__m128i)__builtin_lsx_vreplve_w ((v4i32)_1, (int)_2);
+}
+
+/* Assembly instruction format:	vd, vj, rk.  */
+/* Data types in instruction templates:  V2DI, V2DI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplve_d (__m128i _1, int _2)
+{
+  return (__m128i)__builtin_lsx_vreplve_d ((v2i64)_1, (int)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vreplvei_b(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vreplvei_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vreplvei_h(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vreplvei_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui2.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vreplvei_w(/*__m128i*/ _1, /*ui2*/ _2) \
+  ((__m128i)__builtin_lsx_vreplvei_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui1.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vreplvei_d(/*__m128i*/ _1, /*ui1*/ _2) \
+  ((__m128i)__builtin_lsx_vreplvei_d ((v2i64)(_1), (_2)))
+
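+/* Usage sketch for the replication intrinsics above (illustrative only):
+   the vreplve forms broadcast the element selected by a runtime index held
+   in a general-purpose register, while the vreplvei macros require a
+   compile-time constant lane index:
+
+     __m128i lane2 = __lsx_vreplvei_w (v, 2);
+     __m128i lanei = __lsx_vreplve_w (v, i);
+
+   Here v is an arbitrary __m128i and i an int; passing a non-constant index
+   to a vreplvei macro is rejected at compile time.  */
+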
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickev_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickev_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickev_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickev_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickev_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickev_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickev_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickev_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickod_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickod_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickod_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickod_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickod_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickod_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickod_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickod_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvh_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvh_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvh_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvh_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvh_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvh_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvh_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvh_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvl_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvl_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvl_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvl_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvl_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvl_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvl_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvl_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackev_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackev_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackev_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackev_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackev_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackev_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackev_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackev_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackod_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackod_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackod_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackod_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackod_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackod_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackod_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackod_d ((v2i64)_1, (v2i64)_2);
+}
+
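+/* Note on the element-rearranging intrinsics above (illustrative only):
+   the vpickev/vpickod forms gather the even/odd-indexed elements of their
+   two sources, vilvl/vilvh interleave their low/high halves, and
+   vpackev/vpackod interleave their even/odd-indexed elements, e.g.
+
+     __m128i lo = __lsx_vilvl_b (a, b);
+     __m128i ev = __lsx_vpickev_h (a, b);
+
+   where a and b are arbitrary __m128i values; the exact placement of
+   elements taken from the first and second operand follows the vilvl.b and
+   vpickev.h descriptions in the LSX instruction manual.  */
+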
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vshuf_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vshuf_h ((v8i16)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vshuf_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vshuf_w ((v4i32)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vshuf_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vshuf_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vand_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vand_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vandi_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vandi_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vor_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vor_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vori_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vori_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vnor_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vnor_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vnori_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vnori_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vxor_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vxor_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vxori_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vxori_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitsel_v (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vbitsel_v ((v16u8)_1, (v16u8)_2, (v16u8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI, USI.  */
+#define __lsx_vbitseli_b(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vbitseli_b ((v16u8)(_1), (v16u8)(_2), (_3)))
+
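+/* Usage sketch for the full-width bitwise intrinsics above (illustrative
+   only): vand_v, vor_v, vnor_v and vxor_v operate on all 128 bits regardless
+   of element size, and vbitsel_v chooses every result bit from one of its
+   first two operands according to the corresponding bit of the third:
+
+     __m128i zero = __lsx_vxor_v (a, a);
+     __m128i mix  = __lsx_vbitsel_v (a, b, m);
+
+   Here a, b and m are arbitrary __m128i values; which operand supplies the
+   bits selected by a 1 in m follows the vbitsel.v instruction definition.  */
+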
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V16QI, V16QI, USI.  */
+#define __lsx_vshuf4i_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vshuf4i_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V8HI, V8HI, USI.  */
+#define __lsx_vshuf4i_h(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vshuf4i_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V4SI, V4SI, USI.  */
+#define __lsx_vshuf4i_w(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vshuf4i_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj.  */
+/* Data types in instruction templates:  V16QI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplgr2vr_b (int _1)
+{
+  return (__m128i)__builtin_lsx_vreplgr2vr_b ((int)_1);
+}
+
+/* Assembly instruction format:	vd, rj.  */
+/* Data types in instruction templates:  V8HI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplgr2vr_h (int _1)
+{
+  return (__m128i)__builtin_lsx_vreplgr2vr_h ((int)_1);
+}
+
+/* Assembly instruction format:	vd, rj.  */
+/* Data types in instruction templates:  V4SI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplgr2vr_w (int _1)
+{
+  return (__m128i)__builtin_lsx_vreplgr2vr_w ((int)_1);
+}
+
+/* Assembly instruction format:	vd, rj.  */
+/* Data types in instruction templates:  V2DI, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplgr2vr_d (long int _1)
+{
+  return (__m128i)__builtin_lsx_vreplgr2vr_d ((long int)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpcnt_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vpcnt_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpcnt_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vpcnt_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpcnt_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vpcnt_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpcnt_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vpcnt_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclo_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclo_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclo_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclo_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclo_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclo_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclo_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclo_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclz_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclz_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclz_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclz_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclz_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclz_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclz_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclz_d ((v2i64)_1);
+}
+
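+/* Usage sketch for the bit-counting intrinsics above (illustrative only):
+   vpcnt counts the set bits in each element, vclo counts leading one bits
+   and vclz counts leading zero bits:
+
+     __m128i bits = __lsx_vpcnt_w (v);
+
+   For an arbitrary __m128i v, each 32-bit lane of bits holds the population
+   count of the corresponding lane of v.  */
+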
+/* Assembly instruction format:	rd, vj, ui4.  */
+/* Data types in instruction templates:  SI, V16QI, UQI.  */
+#define __lsx_vpickve2gr_b(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((int)__builtin_lsx_vpickve2gr_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui3.  */
+/* Data types in instruction templates:  SI, V8HI, UQI.  */
+#define __lsx_vpickve2gr_h(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((int)__builtin_lsx_vpickve2gr_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui2.  */
+/* Data types in instruction templates:  SI, V4SI, UQI.  */
+#define __lsx_vpickve2gr_w(/*__m128i*/ _1, /*ui2*/ _2) \
+  ((int)__builtin_lsx_vpickve2gr_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui1.  */
+/* Data types in instruction templates:  DI, V2DI, UQI.  */
+#define __lsx_vpickve2gr_d(/*__m128i*/ _1, /*ui1*/ _2) \
+  ((long int)__builtin_lsx_vpickve2gr_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui4.  */
+/* Data types in instruction templates:  USI, V16QI, UQI.  */
+#define __lsx_vpickve2gr_bu(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((unsigned int)__builtin_lsx_vpickve2gr_bu ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui3.  */
+/* Data types in instruction templates:  USI, V8HI, UQI.  */
+#define __lsx_vpickve2gr_hu(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((unsigned int)__builtin_lsx_vpickve2gr_hu ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui2.  */
+/* Data types in instruction templates:  USI, V4SI, UQI.  */
+#define __lsx_vpickve2gr_wu(/*__m128i*/ _1, /*ui2*/ _2) \
+  ((unsigned int)__builtin_lsx_vpickve2gr_wu ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui1.  */
+/* Data types in instruction templates:  UDI, V2DI, UQI.  */
+#define __lsx_vpickve2gr_du(/*__m128i*/ _1, /*ui1*/ _2) \
+  ((unsigned long int)__builtin_lsx_vpickve2gr_du ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, SI, UQI.  */
+#define __lsx_vinsgr2vr_b(/*__m128i*/ _1, /*int*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vinsgr2vr_b ((v16i8)(_1), (int)(_2), (_3)))
+
+/* Assembly instruction format:	vd, rj, ui3.  */
+/* Data types in instruction templates:  V8HI, V8HI, SI, UQI.  */
+#define __lsx_vinsgr2vr_h(/*__m128i*/ _1, /*int*/ _2, /*ui3*/ _3) \
+  ((__m128i)__builtin_lsx_vinsgr2vr_h ((v8i16)(_1), (int)(_2), (_3)))
+
+/* Assembly instruction format:	vd, rj, ui2.  */
+/* Data types in instruction templates:  V4SI, V4SI, SI, UQI.  */
+#define __lsx_vinsgr2vr_w(/*__m128i*/ _1, /*int*/ _2, /*ui2*/ _3) \
+  ((__m128i)__builtin_lsx_vinsgr2vr_w ((v4i32)(_1), (int)(_2), (_3)))
+
+/* Assembly instruction format:	vd, rj, ui1.  */
+/* Data types in instruction templates:  V2DI, V2DI, DI, UQI.  */
+#define __lsx_vinsgr2vr_d(/*__m128i*/ _1, /*long int*/ _2, /*ui1*/ _3) \
+  ((__m128i)__builtin_lsx_vinsgr2vr_d ((v2i64)(_1), (long int)(_2), (_3)))
+
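+/* Usage sketch for the element-access intrinsics above (illustrative only):
+   vpickve2gr copies one element into a general-purpose register and
+   vinsgr2vr overwrites one element from a general-purpose register; the
+   lane index must be a compile-time constant in both cases:
+
+     int     x = __lsx_vpickve2gr_w (v, 1);
+     __m128i w = __lsx_vinsgr2vr_w (v, x + 1, 3);
+
+   For an arbitrary __m128i v, w equals v except that word element 3 now
+   holds the value of word element 1 plus one.  */
+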
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfadd_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfadd_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfadd_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfadd_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfsub_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfsub_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfsub_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfsub_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmul_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmul_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmul_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmul_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfdiv_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfdiv_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfdiv_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfdiv_d ((v2f64)_1, (v2f64)_2);
+}
+
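+/* Usage sketch for the floating-point arithmetic above (illustrative only;
+   __lsx_vffint_s_w is declared further down in this header):
+
+     __m128 num = __lsx_vffint_s_w (__lsx_vreplgr2vr_w (3));
+     __m128 den = __lsx_vffint_s_w (__lsx_vreplgr2vr_w (4));
+     __m128 q   = __lsx_vfdiv_s (num, den);
+
+   Every lane of q then holds 0.75f; the _d variants provide the same
+   operations on two double-precision lanes.  */
+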
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcvt_h_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcvt_h_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfcvt_s_d (__m128d _1, __m128d _2)
+{
+  return (__m128)__builtin_lsx_vfcvt_s_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmin_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmin_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmin_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmin_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmina_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmina_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmina_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmina_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmax_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmax_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmax_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmax_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmaxa_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmaxa_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmaxa_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmaxa_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfclass_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vfclass_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfclass_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vfclass_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfsqrt_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfsqrt_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfsqrt_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfsqrt_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrecip_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrecip_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrecip_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrecip_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrint_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrint_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrint_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrint_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrsqrt_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrsqrt_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrsqrt_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrsqrt_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vflogb_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vflogb_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vflogb_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vflogb_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfcvth_s_h (__m128i _1)
+{
+  return (__m128)__builtin_lsx_vfcvth_s_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfcvth_d_s (__m128 _1)
+{
+  return (__m128d)__builtin_lsx_vfcvth_d_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfcvtl_s_h (__m128i _1)
+{
+  return (__m128)__builtin_lsx_vfcvtl_s_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfcvtl_d_s (__m128 _1)
+{
+  return (__m128d)__builtin_lsx_vfcvtl_d_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftint_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftint_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_wu_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftint_wu_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_lu_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftint_lu_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrz_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrz_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_wu_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrz_wu_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_lu_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrz_lu_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vffint_s_w (__m128i _1)
+{
+  return (__m128)__builtin_lsx_vffint_s_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vffint_d_l (__m128i _1)
+{
+  return (__m128d)__builtin_lsx_vffint_d_l ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vffint_s_wu (__m128i _1)
+{
+  return (__m128)__builtin_lsx_vffint_s_wu ((v4u32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vffint_d_lu (__m128i _1)
+{
+  return (__m128d)__builtin_lsx_vffint_d_lu ((v2u64)_1);
+}
+
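+/* Note on the float/integer conversions above (illustrative only): the
+   vftint forms convert floating-point lanes to integers (the rz suffix
+   truncates toward zero) and the vffint forms convert integer lanes back to
+   floating point.  A minimal round trip:
+
+     __m128i i = __lsx_vftintrz_w_s (f);
+     __m128  g = __lsx_vffint_s_w (i);
+
+   where f is an arbitrary __m128; for in-range lanes, g then holds the
+   truncated values of f.  */
+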
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vandn_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vandn_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vneg_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vneg_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vneg_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vneg_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vneg_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vneg_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vneg_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vneg_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V8HI, V16QI, UQI.  */
+#define __lsx_vsllwil_h_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_h_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V4SI, V8HI, UQI.  */
+#define __lsx_vsllwil_w_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_w_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, V4SI, UQI.  */
+#define __lsx_vsllwil_d_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_d_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV8HI, UV16QI, UQI.  */
+#define __lsx_vsllwil_hu_bu(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_hu_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV4SI, UV8HI, UQI.  */
+#define __lsx_vsllwil_wu_hu(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_wu_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV2DI, UV4SI, UQI.  */
+#define __lsx_vsllwil_du_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_du_wu ((v4u32)(_1), (_2)))
+
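+/* Note on the widening shifts above (illustrative only): the vsllwil macros
+   widen each element of the low half of the source to twice its width and
+   then shift it left by the immediate, so a shift count of zero acts as a
+   plain sign- or zero-extension of the low half:
+
+     __m128i wide = __lsx_vsllwil_h_b (v, 0);
+
+   For an arbitrary __m128i v, wide holds the low 8 bytes of v sign-extended
+   to 8 halfwords.  */
+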
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsran_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsran_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsran_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsran_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsran_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsran_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_bu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_bu_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_hu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_hu_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_wu_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_wu_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrarn_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrarn_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrarn_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrarn_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrarn_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrarn_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_bu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_bu_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_hu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_hu_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_wu_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_wu_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrln_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrln_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrln_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrln_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrln_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrln_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_bu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_bu_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_hu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_hu_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_wu_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_wu_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlrn_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlrn_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlrn_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlrn_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlrn_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlrn_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_bu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_bu_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_hu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_hu_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_wu_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_wu_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, UQI.  */
+#define __lsx_vfrstpi_b(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vfrstpi_b ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, UQI.  */
+#define __lsx_vfrstpi_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vfrstpi_h ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfrstp_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vfrstp_b ((v16i8)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfrstp_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vfrstp_h ((v8i16)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vshuf4i_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vshuf4i_d ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vbsrl_v(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbsrl_v ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vbsll_v(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbsll_v ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vextrins_b(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vextrins_b ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vextrins_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vextrins_h ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vextrins_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vextrins_w ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vextrins_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vextrins_d ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskltz_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskltz_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskltz_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskltz_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskltz_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskltz_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskltz_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskltz_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsigncov_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsigncov_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsigncov_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsigncov_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsigncov_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsigncov_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsigncov_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsigncov_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmadd_s (__m128 _1, __m128 _2, __m128 _3)
+{
+  return (__m128)__builtin_lsx_vfmadd_s ((v4f32)_1, (v4f32)_2, (v4f32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmadd_d (__m128d _1, __m128d _2, __m128d _3)
+{
+  return (__m128d)__builtin_lsx_vfmadd_d ((v2f64)_1, (v2f64)_2, (v2f64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmsub_s (__m128 _1, __m128 _2, __m128 _3)
+{
+  return (__m128)__builtin_lsx_vfmsub_s ((v4f32)_1, (v4f32)_2, (v4f32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmsub_d (__m128d _1, __m128d _2, __m128d _3)
+{
+  return (__m128d)__builtin_lsx_vfmsub_d ((v2f64)_1, (v2f64)_2, (v2f64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfnmadd_s (__m128 _1, __m128 _2, __m128 _3)
+{
+  return (__m128)__builtin_lsx_vfnmadd_s ((v4f32)_1, (v4f32)_2, (v4f32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfnmadd_d (__m128d _1, __m128d _2, __m128d _3)
+{
+  return (__m128d)__builtin_lsx_vfnmadd_d ((v2f64)_1, (v2f64)_2, (v2f64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfnmsub_s (__m128 _1, __m128 _2, __m128 _3)
+{
+  return (__m128)__builtin_lsx_vfnmsub_s ((v4f32)_1, (v4f32)_2, (v4f32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfnmsub_d (__m128d _1, __m128d _2, __m128d _3)
+{
+  return (__m128d)__builtin_lsx_vfnmsub_d ((v2f64)_1, (v2f64)_2, (v2f64)_3);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrne_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrne_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrne_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrne_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrp_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrp_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrp_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrp_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrm_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrm_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrm_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrm_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftint_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vffint_s_l (__m128i _1, __m128i _2)
+{
+  return (__m128)__builtin_lsx_vffint_s_l ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftintrz_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrp_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftintrp_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrm_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftintrm_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrne_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftintrne_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintl_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintl_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftinth_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftinth_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vffinth_d_w (__m128i _1)
+{
+  return (__m128d)__builtin_lsx_vffinth_d_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vffintl_d_w (__m128i _1)
+{
+  return (__m128d)__builtin_lsx_vffintl_d_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrzl_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrzl_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrzh_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrzh_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrpl_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrpl_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrph_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrph_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrml_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrml_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrmh_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrmh_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrnel_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrnel_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrneh_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrneh_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrintrne_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrintrne_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrintrne_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrintrne_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrintrz_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrintrz_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrintrz_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrintrz_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrintrp_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrintrp_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrintrp_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrintrp_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrintrm_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrintrm_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrintrm_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrintrm_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V16QI, CVPOINTER, SI, UQI.  */
+#define __lsx_vstelm_b(/*__m128i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lsx_vstelm_b ((v16i8)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	vd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V8HI, CVPOINTER, SI, UQI.  */
+#define __lsx_vstelm_h(/*__m128i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lsx_vstelm_h ((v8i16)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	vd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V4SI, CVPOINTER, SI, UQI.  */
+#define __lsx_vstelm_w(/*__m128i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lsx_vstelm_w ((v4i32)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	vd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V2DI, CVPOINTER, SI, UQI.  */
+#define __lsx_vstelm_d(/*__m128i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lsx_vstelm_d ((v2i64)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_d_wu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_d_wu_w ((v4u32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_w_hu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_w_hu_h ((v8u16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_h_bu_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_h_bu_b ((v16u8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_d_wu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_d_wu_w ((v4u32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_w_hu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_w_hu_h ((v8u16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_h_bu_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_h_bu_b ((v16u8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_q_du_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_q_du_d ((v2u64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_q_du_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_q_du_d ((v2u64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_d_wu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_d_wu_w ((v4u32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_w_hu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_w_hu_h ((v8u16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_h_bu_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_h_bu_b ((v16u8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_d_wu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_d_wu_w ((v4u32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_w_hu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_w_hu_h ((v8u16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_h_bu_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_h_bu_b ((v16u8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_q_du_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_q_du_d ((v2u64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_q_du_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_q_du_d ((v2u64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_qu_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_qu_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_qu_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_qu_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_d_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_d_w ((v2i64)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_w_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_w_h ((v4i32)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_h_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_h_b ((v8i16)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_d_wu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_d_wu ((v2u64)_1, (v4u32)_2, (v4u32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_w_hu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_w_hu ((v4u32)_1, (v8u16)_2, (v8u16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_h_bu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_h_bu ((v8u16)_1, (v16u8)_2, (v16u8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_d_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_d_w ((v2i64)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_w_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_w_h ((v4i32)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_h_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_h_b ((v8i16)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_d_wu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_d_wu ((v2u64)_1, (v4u32)_2, (v4u32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_w_hu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_w_hu ((v4u32)_1, (v8u16)_2, (v8u16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_h_bu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_h_bu ((v8u16)_1, (v16u8)_2, (v16u8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_d_wu_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_d_wu_w ((v2i64)_1, (v4u32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_w_hu_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_w_hu_h ((v4i32)_1, (v8u16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_h_bu_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_h_bu_b ((v8i16)_1, (v16u8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_d_wu_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_d_wu_w ((v2i64)_1, (v4u32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_w_hu_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_w_hu_h ((v4i32)_1, (v8u16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_h_bu_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_h_bu_b ((v8i16)_1, (v16u8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_q_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_q_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_q_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_q_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_q_du (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_q_du ((v2u64)_1, (v2u64)_2, (v2u64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_q_du (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_q_du ((v2u64)_1, (v2u64)_2, (v2u64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_q_du_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_q_du_d ((v2i64)_1, (v2u64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_q_du_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_q_du_d ((v2i64)_1, (v2u64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vrotr_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vrotr_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vrotr_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vrotr_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vrotr_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vrotr_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vrotr_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vrotr_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_q (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_q ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_q (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_q ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, rj, si12.  */
+/* Data types in instruction templates:  V16QI, CVPOINTER, SI.  */
+#define __lsx_vldrepl_b(/*void **/ _1, /*si12*/ _2) \
+  ((__m128i)__builtin_lsx_vldrepl_b ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, si11.  */
+/* Data types in instruction templates:  V8HI, CVPOINTER, SI.  */
+#define __lsx_vldrepl_h(/*void **/ _1, /*si11*/ _2) \
+  ((__m128i)__builtin_lsx_vldrepl_h ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, si10.  */
+/* Data types in instruction templates:  V4SI, CVPOINTER, SI.  */
+#define __lsx_vldrepl_w(/*void **/ _1, /*si10*/ _2) \
+  ((__m128i)__builtin_lsx_vldrepl_w ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, si9.  */
+/* Data types in instruction templates:  V2DI, CVPOINTER, SI.  */
+#define __lsx_vldrepl_d(/*void **/ _1, /*si9*/ _2) \
+  ((__m128i)__builtin_lsx_vldrepl_d ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskgez_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskgez_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsknz_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmsknz_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_h_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_h_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_w_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_w_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_d_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_d_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_q_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_q_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV8HI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_hu_bu (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_hu_bu ((v16u8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV4SI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_wu_hu (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_wu_hu ((v8u16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_du_wu (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_du_wu ((v4u32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_qu_du (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_qu_du ((v2u64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vrotri_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vrotri_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vrotri_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vrotri_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vrotri_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vrotri_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vrotri_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vrotri_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vextl_q_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vextl_q_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vsrlni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vsrlni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vsrlni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vsrlni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vsrlrni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlrni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vsrlrni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlrni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vsrlrni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlrni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vsrlrni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlrni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vssrlni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vssrlni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vssrlni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vssrlni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, V16QI, USI.  */
+#define __lsx_vssrlni_bu_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_bu_h ((v16u8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, V8HI, USI.  */
+#define __lsx_vssrlni_hu_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_hu_w ((v8u16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, V4SI, USI.  */
+#define __lsx_vssrlni_wu_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_wu_d ((v4u32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, V2DI, USI.  */
+#define __lsx_vssrlni_du_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_du_q ((v2u64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vssrlrni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vssrlrni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vssrlrni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vssrlrni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, V16QI, USI.  */
+#define __lsx_vssrlrni_bu_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_bu_h ((v16u8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, V8HI, USI.  */
+#define __lsx_vssrlrni_hu_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_hu_w ((v8u16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, V4SI, USI.  */
+#define __lsx_vssrlrni_wu_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_wu_d ((v4u32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, V2DI, USI.  */
+#define __lsx_vssrlrni_du_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_du_q ((v2u64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vsrani_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vsrani_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vsrani_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vsrani_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vsrani_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vsrani_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vsrani_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vsrani_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vsrarni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vsrarni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vsrarni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vsrarni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vsrarni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vsrarni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vsrarni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vsrarni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vssrani_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vssrani_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vssrani_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vssrani_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, V16QI, USI.  */
+#define __lsx_vssrani_bu_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_bu_h ((v16u8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, V8HI, USI.  */
+#define __lsx_vssrani_hu_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_hu_w ((v8u16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, V4SI, USI.  */
+#define __lsx_vssrani_wu_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_wu_d ((v4u32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, V2DI, USI.  */
+#define __lsx_vssrani_du_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_du_q ((v2u64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vssrarni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vssrarni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vssrarni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vssrarni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, V16QI, USI.  */
+#define __lsx_vssrarni_bu_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_bu_h ((v16u8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, V8HI, USI.  */
+#define __lsx_vssrarni_hu_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_hu_w ((v8u16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, V4SI, USI.  */
+#define __lsx_vssrarni_wu_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_wu_d ((v4u32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, V2DI, USI.  */
+#define __lsx_vssrarni_du_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_du_q ((v2u64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vpermi_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vpermi_w ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, rj, si12.  */
+/* Data types in instruction templates:  V16QI, CVPOINTER, SI.  */
+#define __lsx_vld(/*void **/ _1, /*si12*/ _2) \
+  ((__m128i)__builtin_lsx_vld ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, si12.  */
+/* Data types in instruction templates:  VOID, V16QI, CVPOINTER, SI.  */
+#define __lsx_vst(/*__m128i*/ _1, /*void **/ _2, /*si12*/ _3) \
+  ((void)__builtin_lsx_vst ((v16i8)(_1), (void *)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vorn_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vorn_v ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, i13.  */
+/* Data types in instruction templates:  V2DI, HI.  */
+#define __lsx_vldi(/*i13*/ _1) \
+  ((__m128i)__builtin_lsx_vldi ((_1)))
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vshuf_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vshuf_b ((v16i8)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, rj, rk.  */
+/* Data types in instruction templates:  V16QI, CVPOINTER, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vldx (void * _1, long int _2)
+{
+  return (__m128i)__builtin_lsx_vldx ((void *)_1, (long int)_2);
+}
+
+/* Assembly instruction format:	vd, rj, rk.  */
+/* Data types in instruction templates:  VOID, V16QI, CVPOINTER, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+void __lsx_vstx (__m128i _1, void * _2, long int _3)
+{
+  return (void)__builtin_lsx_vstx ((v16i8)_1, (void *)_2, (long int)_3);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vextl_qu_du (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vextl_qu_du ((v2u64)_1);
+}
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV16QI.  */
+#define __lsx_bnz_b(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_b ((v16u8)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV2DI.  */
+#define __lsx_bnz_d(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_d ((v2u64)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV8HI.  */
+#define __lsx_bnz_h(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_h ((v8u16)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV16QI.  */
+#define __lsx_bnz_v(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_v ((v16u8)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV4SI.  */
+#define __lsx_bnz_w(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_w ((v4u32)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV16QI.  */
+#define __lsx_bz_b(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_b ((v16u8)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV2DI.  */
+#define __lsx_bz_d(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_d ((v2u64)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV8HI.  */
+#define __lsx_bz_h(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_h ((v8u16)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV16QI.  */
+#define __lsx_bz_v(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_v ((v16u8)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV4SI.  */
+#define __lsx_bz_w(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_w ((v4u32)(_1)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_caf_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_caf_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_caf_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_caf_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_ceq_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_ceq_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_ceq_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_ceq_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cle_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cle_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cle_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cle_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_clt_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_clt_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_clt_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_clt_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cne_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cne_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cne_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cne_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cor_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cor_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cor_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cor_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cueq_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cueq_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cueq_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cueq_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cule_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cule_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cule_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cule_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cult_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cult_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cult_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cult_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cun_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cun_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cune_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cune_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cune_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cune_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cun_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cun_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_saf_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_saf_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_saf_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_saf_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_seq_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_seq_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_seq_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_seq_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sle_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sle_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sle_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sle_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_slt_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_slt_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_slt_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_slt_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sne_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sne_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sne_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sne_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sor_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sor_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sor_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sor_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sueq_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sueq_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sueq_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sueq_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sule_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sule_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sule_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sule_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sult_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sult_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sult_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sult_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sun_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sun_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sune_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sune_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sune_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sune_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sun_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sun_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, si10.  */
+/* Data types in instruction templates:  V16QI, HI.  */
+#define __lsx_vrepli_b(/*si10*/ _1) \
+  ((__m128i)__builtin_lsx_vrepli_b ((_1)))
+
+/* Assembly instruction format:	vd, si10.  */
+/* Data types in instruction templates:  V2DI, HI.  */
+#define __lsx_vrepli_d(/*si10*/ _1) \
+  ((__m128i)__builtin_lsx_vrepli_d ((_1)))
+
+/* Assembly instruction format:	vd, si10.  */
+/* Data types in instruction templates:  V8HI, HI.  */
+#define __lsx_vrepli_h(/*si10*/ _1) \
+  ((__m128i)__builtin_lsx_vrepli_h ((_1)))
+
+/* Assembly instruction format:	vd, si10.  */
+/* Data types in instruction templates:  V4SI, HI.  */
+#define __lsx_vrepli_w(/*si10*/ _1) \
+  ((__m128i)__builtin_lsx_vrepli_w ((_1)))
+
+#endif /* defined(__loongarch_sx) */
+#endif /* _GCC_LOONGSON_SXINTRIN_H */
-- 
2.36.0



* [PATCH v6 3/4] LoongArch: Add Loongson ASX base instruction support.
  2023-08-31  9:08 [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
  2023-08-31  9:08 ` [PATCH v6 1/4] LoongArch: Add Loongson SX base instruction support Chenghui Pan
  2023-08-31  9:08 ` [PATCH v6 2/4] LoongArch: Add Loongson SX directive builtin function support Chenghui Pan
@ 2023-08-31  9:08 ` Chenghui Pan
  2023-08-31  9:08 ` [PATCH v6 4/4] LoongArch: Add Loongson ASX directive builtin function support Chenghui Pan
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Chenghui Pan @ 2023-08-31  9:08 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config/loongarch/loongarch-modes.def
	(VECTOR_MODES): Add Loongson ASX instruction support.
	* config/loongarch/loongarch-protos.h (loongarch_split_256bit_move): Ditto.
	(loongarch_split_256bit_move_p): Ditto.
	(loongarch_expand_vector_group_init): Ditto.
	(loongarch_expand_vec_perm_1): Ditto.
	* config/loongarch/loongarch.cc (loongarch_symbol_insns): Ditto.
	(loongarch_valid_offset_p): Ditto.
	(loongarch_address_insns): Ditto.
	(loongarch_const_insns): Ditto.
	(loongarch_legitimize_move): Ditto.
	(loongarch_builtin_vectorization_cost): Ditto.
	(loongarch_split_move_p): Ditto.
	(loongarch_split_move): Ditto.
	(loongarch_output_move_index_float): Ditto.
	(loongarch_split_256bit_move_p): Ditto.
	(loongarch_split_256bit_move): Ditto.
	(loongarch_output_move): Ditto.
	(loongarch_print_operand_reloc): Ditto.
	(loongarch_print_operand): Ditto.
	(loongarch_hard_regno_mode_ok_uncached): Ditto.
	(loongarch_hard_regno_nregs): Ditto.
	(loongarch_class_max_nregs): Ditto.
	(loongarch_can_change_mode_class): Ditto.
	(loongarch_mode_ok_for_mov_fmt_p): Ditto.
	(loongarch_vector_mode_supported_p): Ditto.
	(loongarch_preferred_simd_mode): Ditto.
	(loongarch_autovectorize_vector_modes): Ditto.
	(loongarch_lsx_output_division): Ditto.
	(loongarch_expand_lsx_shuffle): Ditto.
	(loongarch_expand_vec_perm): Ditto.
	(loongarch_expand_vec_perm_interleave): Ditto.
	(loongarch_try_expand_lsx_vshuf_const): Ditto.
	(loongarch_expand_vec_perm_even_odd_1): Ditto.
	(loongarch_expand_vec_perm_even_odd): Ditto.
	(loongarch_expand_vec_perm_1): Ditto.
	(loongarch_expand_vec_perm_const_2): Ditto.
	(loongarch_is_quad_duplicate): Ditto.
	(loongarch_is_double_duplicate): Ditto.
	(loongarch_is_odd_extraction): Ditto.
	(loongarch_is_even_extraction): Ditto.
	(loongarch_is_extraction_permutation): Ditto.
	(loongarch_is_center_extraction): Ditto.
	(loongarch_is_reversing_permutation): Ditto.
	(loongarch_is_di_misalign_extract): Ditto.
	(loongarch_is_si_misalign_extract): Ditto.
	(loongarch_is_lasx_lowpart_interleave): Ditto.
	(loongarch_is_lasx_lowpart_interleave_2): Ditto.
	(COMPARE_SELECTOR): Ditto.
	(loongarch_is_lasx_lowpart_extract): Ditto.
	(loongarch_is_lasx_highpart_interleave): Ditto.
	(loongarch_is_lasx_highpart_interleave_2): Ditto.
	(loongarch_is_elem_duplicate): Ditto.
	(loongarch_is_op_reverse_perm): Ditto.
	(loongarch_is_single_op_perm): Ditto.
	(loongarch_is_divisible_perm): Ditto.
	(loongarch_is_triple_stride_extract): Ditto.
	(loongarch_vectorize_vec_perm_const): Ditto.
	(loongarch_cpu_sched_reassociation_width): Ditto.
	(loongarch_expand_vector_extract): Ditto.
	(emit_reduc_half): Ditto.
	(loongarch_expand_vec_unpack): Ditto.
	(loongarch_expand_vector_group_init): Ditto.
	(loongarch_expand_vector_init): Ditto.
	(loongarch_expand_lsx_cmp): Ditto.
	(loongarch_builtin_support_vector_misalignment): Ditto.
	* config/loongarch/loongarch.h (UNITS_PER_LASX_REG): Ditto.
	(BITS_PER_LASX_REG): Ditto.
	(STRUCTURE_SIZE_BOUNDARY): Ditto.
	(LASX_REG_FIRST): Ditto.
	(LASX_REG_LAST): Ditto.
	(LASX_REG_NUM): Ditto.
	(LASX_REG_P): Ditto.
	(LASX_REG_RTX_P): Ditto.
	(LASX_SUPPORTED_MODE_P): Ditto.
	* config/loongarch/loongarch.md: Ditto.
	* config/loongarch/lasx.md: New file.
---
 gcc/config/loongarch/lasx.md             | 5104 ++++++++++++++++++++++
 gcc/config/loongarch/loongarch-modes.def |    1 +
 gcc/config/loongarch/loongarch-protos.h  |    4 +
 gcc/config/loongarch/loongarch.cc        | 2567 ++++++++++-
 gcc/config/loongarch/loongarch.h         |   60 +-
 gcc/config/loongarch/loongarch.md        |   20 +-
 6 files changed, 7637 insertions(+), 119 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
new file mode 100644
index 00000000000..8111c8bb79a
--- /dev/null
+++ b/gcc/config/loongarch/lasx.md
@@ -0,0 +1,5104 @@
+;; Machine Description for LARCH Loongson ASX ASE
+;;
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+;;
+
+(define_c_enum "unspec" [
+  UNSPEC_LASX_XVABSD_S
+  UNSPEC_LASX_XVABSD_U
+  UNSPEC_LASX_XVAVG_S
+  UNSPEC_LASX_XVAVG_U
+  UNSPEC_LASX_XVAVGR_S
+  UNSPEC_LASX_XVAVGR_U
+  UNSPEC_LASX_XVBITCLR
+  UNSPEC_LASX_XVBITCLRI
+  UNSPEC_LASX_XVBITREV
+  UNSPEC_LASX_XVBITREVI
+  UNSPEC_LASX_XVBITSET
+  UNSPEC_LASX_XVBITSETI
+  UNSPEC_LASX_XVFCMP_CAF
+  UNSPEC_LASX_XVFCLASS
+  UNSPEC_LASX_XVFCMP_CUNE
+  UNSPEC_LASX_XVFCVT
+  UNSPEC_LASX_XVFCVTH
+  UNSPEC_LASX_XVFCVTL
+  UNSPEC_LASX_XVFLOGB
+  UNSPEC_LASX_XVFRECIP
+  UNSPEC_LASX_XVFRINT
+  UNSPEC_LASX_XVFRSQRT
+  UNSPEC_LASX_XVFCMP_SAF
+  UNSPEC_LASX_XVFCMP_SEQ
+  UNSPEC_LASX_XVFCMP_SLE
+  UNSPEC_LASX_XVFCMP_SLT
+  UNSPEC_LASX_XVFCMP_SNE
+  UNSPEC_LASX_XVFCMP_SOR
+  UNSPEC_LASX_XVFCMP_SUEQ
+  UNSPEC_LASX_XVFCMP_SULE
+  UNSPEC_LASX_XVFCMP_SULT
+  UNSPEC_LASX_XVFCMP_SUN
+  UNSPEC_LASX_XVFCMP_SUNE
+  UNSPEC_LASX_XVFTINT_S
+  UNSPEC_LASX_XVFTINT_U
+  UNSPEC_LASX_XVCLO
+  UNSPEC_LASX_XVSAT_S
+  UNSPEC_LASX_XVSAT_U
+  UNSPEC_LASX_XVREPLVE0
+  UNSPEC_LASX_XVREPL128VEI
+  UNSPEC_LASX_XVSRAR
+  UNSPEC_LASX_XVSRARI
+  UNSPEC_LASX_XVSRLR
+  UNSPEC_LASX_XVSRLRI
+  UNSPEC_LASX_XVSHUF
+  UNSPEC_LASX_XVSHUF_B
+  UNSPEC_LASX_BRANCH
+  UNSPEC_LASX_BRANCH_V
+
+  UNSPEC_LASX_XVMUH_S
+  UNSPEC_LASX_XVMUH_U
+  UNSPEC_LASX_MXVEXTW_U
+  UNSPEC_LASX_XVSLLWIL_S
+  UNSPEC_LASX_XVSLLWIL_U
+  UNSPEC_LASX_XVSRAN
+  UNSPEC_LASX_XVSSRAN_S
+  UNSPEC_LASX_XVSSRAN_U
+  UNSPEC_LASX_XVSRARN
+  UNSPEC_LASX_XVSSRARN_S
+  UNSPEC_LASX_XVSSRARN_U
+  UNSPEC_LASX_XVSRLN
+  UNSPEC_LASX_XVSSRLN_U
+  UNSPEC_LASX_XVSRLRN
+  UNSPEC_LASX_XVSSRLRN_U
+  UNSPEC_LASX_XVFRSTPI
+  UNSPEC_LASX_XVFRSTP
+  UNSPEC_LASX_XVSHUF4I
+  UNSPEC_LASX_XVBSRL_V
+  UNSPEC_LASX_XVBSLL_V
+  UNSPEC_LASX_XVEXTRINS
+  UNSPEC_LASX_XVMSKLTZ
+  UNSPEC_LASX_XVSIGNCOV
+  UNSPEC_LASX_XVFTINTRNE_W_S
+  UNSPEC_LASX_XVFTINTRNE_L_D
+  UNSPEC_LASX_XVFTINTRP_W_S
+  UNSPEC_LASX_XVFTINTRP_L_D
+  UNSPEC_LASX_XVFTINTRM_W_S
+  UNSPEC_LASX_XVFTINTRM_L_D
+  UNSPEC_LASX_XVFTINT_W_D
+  UNSPEC_LASX_XVFFINT_S_L
+  UNSPEC_LASX_XVFTINTRZ_W_D
+  UNSPEC_LASX_XVFTINTRP_W_D
+  UNSPEC_LASX_XVFTINTRM_W_D
+  UNSPEC_LASX_XVFTINTRNE_W_D
+  UNSPEC_LASX_XVFTINTH_L_S
+  UNSPEC_LASX_XVFTINTL_L_S
+  UNSPEC_LASX_XVFFINTH_D_W
+  UNSPEC_LASX_XVFFINTL_D_W
+  UNSPEC_LASX_XVFTINTRZH_L_S
+  UNSPEC_LASX_XVFTINTRZL_L_S
+  UNSPEC_LASX_XVFTINTRPH_L_S
+  UNSPEC_LASX_XVFTINTRPL_L_S
+  UNSPEC_LASX_XVFTINTRMH_L_S
+  UNSPEC_LASX_XVFTINTRML_L_S
+  UNSPEC_LASX_XVFTINTRNEL_L_S
+  UNSPEC_LASX_XVFTINTRNEH_L_S
+  UNSPEC_LASX_XVFRINTRNE_S
+  UNSPEC_LASX_XVFRINTRNE_D
+  UNSPEC_LASX_XVFRINTRZ_S
+  UNSPEC_LASX_XVFRINTRZ_D
+  UNSPEC_LASX_XVFRINTRP_S
+  UNSPEC_LASX_XVFRINTRP_D
+  UNSPEC_LASX_XVFRINTRM_S
+  UNSPEC_LASX_XVFRINTRM_D
+  UNSPEC_LASX_XVREPLVE0_Q
+  UNSPEC_LASX_XVPERM_W
+  UNSPEC_LASX_XVPERMI_Q
+  UNSPEC_LASX_XVPERMI_D
+
+  UNSPEC_LASX_XVADDWEV
+  UNSPEC_LASX_XVADDWEV2
+  UNSPEC_LASX_XVADDWEV3
+  UNSPEC_LASX_XVSUBWEV
+  UNSPEC_LASX_XVSUBWEV2
+  UNSPEC_LASX_XVMULWEV
+  UNSPEC_LASX_XVMULWEV2
+  UNSPEC_LASX_XVMULWEV3
+  UNSPEC_LASX_XVADDWOD
+  UNSPEC_LASX_XVADDWOD2
+  UNSPEC_LASX_XVADDWOD3
+  UNSPEC_LASX_XVSUBWOD
+  UNSPEC_LASX_XVSUBWOD2
+  UNSPEC_LASX_XVMULWOD
+  UNSPEC_LASX_XVMULWOD2
+  UNSPEC_LASX_XVMULWOD3
+  UNSPEC_LASX_XVMADDWEV
+  UNSPEC_LASX_XVMADDWEV2
+  UNSPEC_LASX_XVMADDWEV3
+  UNSPEC_LASX_XVMADDWOD
+  UNSPEC_LASX_XVMADDWOD2
+  UNSPEC_LASX_XVMADDWOD3
+  UNSPEC_LASX_XVHADDW_Q_D
+  UNSPEC_LASX_XVHSUBW_Q_D
+  UNSPEC_LASX_XVHADDW_QU_DU
+  UNSPEC_LASX_XVHSUBW_QU_DU
+  UNSPEC_LASX_XVROTR
+  UNSPEC_LASX_XVADD_Q
+  UNSPEC_LASX_XVSUB_Q
+  UNSPEC_LASX_XVREPLVE
+  UNSPEC_LASX_XVSHUF4
+  UNSPEC_LASX_XVMSKGEZ
+  UNSPEC_LASX_XVMSKNZ
+  UNSPEC_LASX_XVEXTH_Q_D
+  UNSPEC_LASX_XVEXTH_QU_DU
+  UNSPEC_LASX_XVEXTL_Q_D
+  UNSPEC_LASX_XVSRLNI
+  UNSPEC_LASX_XVSRLRNI
+  UNSPEC_LASX_XVSSRLNI
+  UNSPEC_LASX_XVSSRLNI2
+  UNSPEC_LASX_XVSSRLRNI
+  UNSPEC_LASX_XVSSRLRNI2
+  UNSPEC_LASX_XVSRANI
+  UNSPEC_LASX_XVSRARNI
+  UNSPEC_LASX_XVSSRANI
+  UNSPEC_LASX_XVSSRANI2
+  UNSPEC_LASX_XVSSRARNI
+  UNSPEC_LASX_XVSSRARNI2
+  UNSPEC_LASX_XVPERMI
+  UNSPEC_LASX_XVINSVE0
+  UNSPEC_LASX_XVPICKVE
+  UNSPEC_LASX_XVSSRLN
+  UNSPEC_LASX_XVSSRLRN
+  UNSPEC_LASX_XVEXTL_QU_DU
+  UNSPEC_LASX_XVLDI
+  UNSPEC_LASX_XVLDX
+  UNSPEC_LASX_XVSTX
+])
+
+;; All vector modes with 256 bits.
+(define_mode_iterator LASX [V4DF V8SF V4DI V8SI V16HI V32QI])
+
+;; Same as LASX.  Used by vcond to iterate two modes.
+(define_mode_iterator LASX_2 [V4DF V8SF V4DI V8SI V16HI V32QI])
+
+;; Only used for splitting insert_d and copy_{u,s}.d.
+(define_mode_iterator LASX_D [V4DI V4DF])
+
+;; Only used for splitting insert_d and copy_{u,s}.d.
+(define_mode_iterator LASX_WD [V4DI V4DF V8SI V8SF])
+
+;; Only used for copy256_{u,s}.w.
+(define_mode_iterator LASX_W    [V8SI V8SF])
+
+;; Only integer modes in LASX.
+(define_mode_iterator ILASX [V4DI V8SI V16HI V32QI])
+
+;; As ILASX but excludes V32QI.
+(define_mode_iterator ILASX_DWH [V4DI V8SI V16HI])
+
+;; As LASX but excludes V32QI.
+(define_mode_iterator LASX_DWH [V4DF V8SF V4DI V8SI V16HI])
+
+;; As ILASX but excludes V4DI.
+(define_mode_iterator ILASX_WHB [V8SI V16HI V32QI])
+
+;; Only integer modes equal or larger than a word.
+(define_mode_iterator ILASX_DW  [V4DI V8SI])
+
+;; Only integer modes smaller than a word.
+(define_mode_iterator ILASX_HB  [V16HI V32QI])
+
+;; Only floating-point modes in LASX.
+(define_mode_iterator FLASX  [V4DF V8SF])
+
+;; Only used for immediate set shuffle elements instruction.
+(define_mode_iterator LASX_WHB_W [V8SI V16HI V32QI V8SF])
+
+;; The attribute gives the same-sized integer vector mode in Loongson ASX.
+(define_mode_attr VIMODE256
+  [(V4DF "V4DI")
+   (V8SF "V8SI")
+   (V4DI "V4DI")
+   (V8SI "V8SI")
+   (V16HI "V16HI")
+   (V32QI "V32QI")])
+
+;; This attribute gives the mode with half-width elements (same 256-bit size)
+;; for vector modes.
+(define_mode_attr VHSMODE256
+  [(V16HI "V32QI")
+   (V8SI "V16HI")
+   (V4DI "V8SI")])
+
+;; This attribute gives the 128-bit (half-size) mode with the same element
+;; type for vector modes.
+(define_mode_attr VHMODE256
+  [(V32QI "V16QI")
+   (V16HI "V8HI")
+   (V8SI "V4SI")
+   (V4DI "V2DI")])
+
+;; This attribute gives half float modes for vector modes.
+(define_mode_attr VFHMODE256
+   [(V8SF "V4SF")
+   (V4DF "V2DF")])
+
+;; The attribute gives double modes for vector modes in LASX.
+(define_mode_attr VDMODE256
+  [(V8SI "V4DI")
+   (V16HI "V8SI")
+   (V32QI "V16HI")])
+
+;; Extended from VDMODE256 (additionally maps V4DI to itself).
+(define_mode_attr VDMODEEXD256
+  [(V4DI "V4DI")
+   (V8SI "V4DI")
+   (V16HI "V8SI")
+   (V32QI "V16HI")])
+
+;; The attribute gives the mode with half-width elements and the same number
+;; of elements, used for truncation.
+(define_mode_attr VTRUNCMODE256
+  [(V16HI "V16QI")
+   (V8SI "V8HI")
+   (V4DI "V4SI")])
+
+;; Double-sized vector mode with the same element type ("Vector, Enlarged-MODE").
+(define_mode_attr VEMODE256
+  [(V8SF "V16SF")
+   (V8SI "V16SI")
+   (V4DF "V8DF")
+   (V4DI "V8DI")])
+
+;; This attribute gives the mode of the result for "copy_s_b, copy_u_b" etc.
+(define_mode_attr VRES256
+  [(V4DF "DF")
+   (V8SF "SF")
+   (V4DI "DI")
+   (V8SI "SI")
+   (V16HI "SI")
+   (V32QI "SI")])
+
+;; Only used with LASX_D iterator.
+(define_mode_attr lasx_d
+  [(V4DI "reg_or_0")
+   (V4DF "register")])
+
+;; This attribute gives the 256-bit integer vector mode with the same size.
+(define_mode_attr mode256_i
+  [(V4DF "v4di")
+   (V8SF "v8si")
+   (V4DI "v4di")
+   (V8SI "v8si")
+   (V16HI "v16hi")
+   (V32QI "v32qi")])
+
+;; This attribute gives the 256-bit float vector mode with the same size.
+(define_mode_attr mode256_f
+  [(V4DF "v4df")
+   (V8SF "v8sf")
+   (V4DI "v4df")
+   (V8SI "v8sf")])
+
+;; This attribute gives the element suffix for LASX instructions.
+(define_mode_attr lasxfmt
+  [(V4DF "d")
+   (V8SF "w")
+   (V4DI "d")
+   (V8SI "w")
+   (V16HI "h")
+   (V32QI "b")])
+
+(define_mode_attr flasxfmt
+  [(V4DF "d")
+   (V8SF "s")])
+
+(define_mode_attr lasxfmt_u
+  [(V4DF "du")
+   (V8SF "wu")
+   (V4DI "du")
+   (V8SI "wu")
+   (V16HI "hu")
+   (V32QI "bu")])
+
+(define_mode_attr ilasxfmt
+  [(V4DF "l")
+   (V8SF "w")])
+
+(define_mode_attr ilasxfmt_u
+  [(V4DF "lu")
+   (V8SF "wu")])
+
+;; This attribute gives the suffix for half-width integer elements (VHSMODE256).
+(define_mode_attr hlasxfmt
+  [(V4DI "w")
+   (V8SI "h")
+   (V16HI "b")])
+
+(define_mode_attr hlasxfmt_u
+  [(V4DI "wu")
+   (V8SI "hu")
+   (V16HI "bu")])
+
+;; This attribute gives suffix for integers in VHSMODE256.
+(define_mode_attr hslasxfmt
+  [(V4DI "w")
+   (V8SI "h")
+   (V16HI "b")])
+
+;; This attribute gives define_insn suffix for LASX instructions that need
+;; distinction between integer and floating point.
+(define_mode_attr lasxfmt_f
+  [(V4DF "d_f")
+   (V8SF "w_f")
+   (V4DI "d")
+   (V8SI "w")
+   (V16HI "h")
+   (V32QI "b")])
+
+(define_mode_attr flasxfmt_f
+  [(V4DF "d_f")
+   (V8SF "s_f")
+   (V4DI "d")
+   (V8SI "w")
+   (V16HI "h")
+   (V32QI "b")])
+
+;; This attribute gives define_insn suffix for LASX instructions that need
+;; distinction between integer and floating point.
+(define_mode_attr lasxfmt_f_wd
+  [(V4DF "d_f")
+   (V8SF "w_f")
+   (V4DI "d")
+   (V8SI "w")])
+
+;; This attribute gives the suffix for double-width integer elements (VDMODE256).
+(define_mode_attr dlasxfmt
+  [(V8SI "d")
+   (V16HI "w")
+   (V32QI "h")])
+
+(define_mode_attr dlasxfmt_u
+  [(V8SI "du")
+   (V16HI "wu")
+   (V32QI "hu")])
+
+;; This attribute gives the double-width element suffix for VDMODEEXD256.
+(define_mode_attr dlasxqfmt
+  [(V4DI "q")
+   (V8SI "d")
+   (V16HI "w")
+   (V32QI "h")])
+
+;; This is used to form an immediate operand constraint using
+;; "const_<indeximm256>_operand".
+(define_mode_attr indeximm256
+  [(V4DF "0_to_3")
+   (V8SF "0_to_7")
+   (V4DI "0_to_3")
+   (V8SI "0_to_7")
+   (V16HI "uimm4")
+   (V32QI "uimm5")])
+
+;; This is used to form an immediate operand constraint referencing the high
+;; half, "const_<indeximm_hi>_operand".
+(define_mode_attr indeximm_hi
+  [(V4DF "2_or_3")
+   (V8SF "4_to_7")
+   (V4DI "2_or_3")
+   (V8SI "4_to_7")
+   (V16HI "8_to_15")
+   (V32QI "16_to_31")])
+
+;; This is used to form an immediate operand constraint referencing the low
+;; half, "const_<indeximm_lo>_operand".
+(define_mode_attr indeximm_lo
+  [(V4DF "0_or_1")
+   (V8SF "0_to_3")
+   (V4DI "0_or_1")
+   (V8SI "0_to_3")
+   (V16HI "uimm3")
+   (V32QI "uimm4")])
+
+;; This attribute represents the bitmask needed for vec_merge in LASX, used
+;; via "const_<bitmask256>_operand".
+(define_mode_attr bitmask256
+  [(V4DF "exp_4")
+   (V8SF "exp_8")
+   (V4DI "exp_4")
+   (V8SI "exp_8")
+   (V16HI "exp_16")
+   (V32QI "exp_32")])
+
+;; This attribute represents the bitmask needed for vec_merge referencing the
+;; low half, "const_<bitmask_lo>_operand".
+(define_mode_attr bitmask_lo
+  [(V4DF "exp_2")
+   (V8SF "exp_4")
+   (V4DI "exp_2")
+   (V8SI "exp_4")
+   (V16HI "exp_8")
+   (V32QI "exp_16")])
+
+
+;; This attribute is used to form an immediate operand constraint using
+;; "const_<bitimm256>_operand".
+(define_mode_attr bitimm256
+  [(V32QI "uimm3")
+   (V16HI  "uimm4")
+   (V8SI  "uimm5")
+   (V4DI  "uimm6")])
+
+
+(define_mode_attr d2lasxfmt
+  [(V8SI "q")
+   (V16HI "d")
+   (V32QI "w")])
+
+(define_mode_attr d2lasxfmt_u
+  [(V8SI "qu")
+   (V16HI "du")
+   (V32QI "wu")])
+
+(define_mode_attr VD2MODE256
+  [(V8SI "V4DI")
+   (V16HI "V4DI")
+   (V32QI "V8SI")])
+
+(define_mode_attr lasxfmt_wd
+  [(V4DI "d")
+   (V8SI "w")
+   (V16HI "w")
+   (V32QI "w")])
+
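+;; Rounding-mode iterators: map the xvfrint* unspecs to the standard pattern
+;; names (ceil, btrunc, rint, floor) and to the instruction suffix (rp, rz,
+;; rm, or empty).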
+(define_int_iterator FRINT256_S [UNSPEC_LASX_XVFRINTRP_S
+			       UNSPEC_LASX_XVFRINTRZ_S
+			       UNSPEC_LASX_XVFRINT
+			       UNSPEC_LASX_XVFRINTRM_S])
+
+(define_int_iterator FRINT256_D [UNSPEC_LASX_XVFRINTRP_D
+			       UNSPEC_LASX_XVFRINTRZ_D
+			       UNSPEC_LASX_XVFRINT
+			       UNSPEC_LASX_XVFRINTRM_D])
+
+(define_int_attr frint256_pattern_s
+  [(UNSPEC_LASX_XVFRINTRP_S  "ceil")
+   (UNSPEC_LASX_XVFRINTRZ_S  "btrunc")
+   (UNSPEC_LASX_XVFRINT	     "rint")
+   (UNSPEC_LASX_XVFRINTRM_S  "floor")])
+
+(define_int_attr frint256_pattern_d
+  [(UNSPEC_LASX_XVFRINTRP_D  "ceil")
+   (UNSPEC_LASX_XVFRINTRZ_D  "btrunc")
+   (UNSPEC_LASX_XVFRINT	     "rint")
+   (UNSPEC_LASX_XVFRINTRM_D  "floor")])
+
+(define_int_attr frint256_suffix
+  [(UNSPEC_LASX_XVFRINTRP_S  "rp")
+   (UNSPEC_LASX_XVFRINTRP_D  "rp")
+   (UNSPEC_LASX_XVFRINTRZ_S  "rz")
+   (UNSPEC_LASX_XVFRINTRZ_D  "rz")
+   (UNSPEC_LASX_XVFRINT	     "")
+   (UNSPEC_LASX_XVFRINTRM_S  "rm")
+   (UNSPEC_LASX_XVFRINTRM_D  "rm")])
+
+(define_expand "vec_init<mode><unitmode>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand:LASX 1 "")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vector_init (operands[0], operands[1]);
+  DONE;
+})
+
+(define_expand "vec_initv32qiv16qi"
+ [(match_operand:V32QI 0 "register_operand")
+  (match_operand:V16QI 1 "")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vector_group_init (operands[0], operands[1]);
+  DONE;
+})
+
+;; FIXME: Delete.
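+;; The xvpermi.d with immediate 0xd8 (doubleword order 0,2,1,3) reorders the
+;; result, presumably because xvpickev packs within each 128-bit half; this is
+;; why the pattern is two instructions (length 8).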
+(define_insn "vec_pack_trunc_<mode>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(vec_concat:<VHSMODE256>
+	  (truncate:<VTRUNCMODE256>
+	    (match_operand:ILASX_DWH 1 "register_operand" "f"))
+	  (truncate:<VTRUNCMODE256>
+	    (match_operand:ILASX_DWH 2 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvpickev.<hslasxfmt>\t%u0,%u2,%u1\n\txvpermi.d\t%u0,%u0,0xd8"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "8")])
+
+(define_expand "vec_unpacks_hi_v8sf"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	  (vec_select:V4SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_dup 2))))]
+  "ISA_HAS_LASX"
+{
+  operands[2] = loongarch_lsx_vec_parallel_const_half (V8SFmode,
+						       true/*high_p*/);
+})
+
+(define_expand "vec_unpacks_lo_v8sf"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	  (vec_select:V4SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_dup 2))))]
+  "ISA_HAS_LASX"
+{
+  operands[2] = loongarch_lsx_vec_parallel_const_half (V8SFmode,
+						       false/*high_p*/);
+})
+
+(define_expand "vec_unpacks_hi_<mode>"
+  [(match_operand:<VDMODE256> 0 "register_operand")
+   (match_operand:ILASX_WHB 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/,
+			       true/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacks_lo_<mode>"
+  [(match_operand:<VDMODE256> 0 "register_operand")
+   (match_operand:ILASX_WHB 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, false/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacku_hi_<mode>"
+  [(match_operand:<VDMODE256> 0 "register_operand")
+   (match_operand:ILASX_WHB 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, true/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacku_lo_<mode>"
+  [(match_operand:<VDMODE256> 0 "register_operand")
+   (match_operand:ILASX_WHB 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, false/*high_p*/);
+  DONE;
+})
+
+(define_insn "lasx_xvinsgr2vr_<lasxfmt_f_wd>"
+  [(set (match_operand:ILASX_DW 0 "register_operand" "=f")
+	(vec_merge:ILASX_DW
+	  (vec_duplicate:ILASX_DW
+	    (match_operand:<UNITMODE> 1 "reg_or_0_operand" "rJ"))
+	  (match_operand:ILASX_DW 2 "register_operand" "0")
+	  (match_operand 3 "const_<bitmask256>_operand" "")))]
+  "ISA_HAS_LASX"
+{
+  return "xvinsgr2vr.<lasxfmt>\t%u0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
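+;; vec_concat patterns: operand 1 is tied to the destination and a single
+;; xvpermi.q with immediate 0x20 combines the two 128-bit inputs into the
+;; 256-bit result.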
+(define_insn "vec_concatv4di"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(vec_concat:V4DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (match_operand:V2DI 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "vec_concatv8si"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_concat:V8SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (match_operand:V4SI 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "vec_concatv16hi"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_concat:V16HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (match_operand:V8HI 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "vec_concatv32qi"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_concat:V32QI
+	  (match_operand:V16QI 1 "register_operand" "0")
+	  (match_operand:V16QI 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "vec_concatv4df"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(vec_concat:V4DF
+	  (match_operand:V2DF 1 "register_operand" "0")
+	  (match_operand:V2DF 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "vec_concatv8sf"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_concat:V8SF
+	  (match_operand:V4SF 1 "register_operand" "0")
+	  (match_operand:V4SF 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+;; xvperm.w
+(define_insn "lasx_xvperm_<lasxfmt_f_wd>"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+	(unspec:LASX_W
+	  [(match_operand:LASX_W 1 "nonimmediate_operand" "f")
+	   (match_operand:V8SI 2 "register_operand" "f")]
+	  UNSPEC_LASX_XVPERM_W))]
+  "ISA_HAS_LASX"
+  "xvperm.w\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+;; xvpermi.d
+(define_insn "lasx_xvpermi_d_<LASX:mode>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	  (unspec:LASX
+	    [(match_operand:LASX 1 "register_operand" "f")
+	     (match_operand:SI     2 "const_uimm8_operand")]
+	    UNSPEC_LASX_XVPERMI_D))]
+  "ISA_HAS_LASX"
+  "xvpermi.d\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
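+;; The four 2-bit selectors are packed into one immediate; e.g. a vec_select
+;; parallel of [1 0 3 2] gives 1<<0 | 0<<2 | 3<<4 | 2<<6 = 0xb1.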
+(define_insn "lasx_xvpermi_d_<mode>_1"
+  [(set (match_operand:LASX_D 0 "register_operand" "=f")
+	(vec_select:LASX_D
+	 (match_operand:LASX_D 1 "register_operand" "f")
+	 (parallel [(match_operand 2 "const_0_to_3_operand")
+		    (match_operand 3 "const_0_to_3_operand")
+		    (match_operand 4 "const_0_to_3_operand")
+		    (match_operand 5 "const_0_to_3_operand")])))]
+  "ISA_HAS_LASX"
+{
+  int mask = 0;
+  mask |= INTVAL (operands[2]) << 0;
+  mask |= INTVAL (operands[3]) << 2;
+  mask |= INTVAL (operands[4]) << 4;
+  mask |= INTVAL (operands[5]) << 6;
+  operands[2] = GEN_INT (mask);
+  return "xvpermi.d\t%u0,%u1,%2";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+;; xvpermi.q
+(define_insn "lasx_xvpermi_q_<LASX:mode>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(unspec:LASX
+	  [(match_operand:LASX 1 "register_operand" "0")
+	   (match_operand:LASX 2 "register_operand" "f")
+	   (match_operand     3 "const_uimm8_operand")]
+	  UNSPEC_LASX_XVPERMI_Q))]
+  "ISA_HAS_LASX"
+  "xvpermi.q\t%u0,%u2,%3"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpickve2gr_d<u>"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(any_extend:DI
+	  (vec_select:DI
+	    (match_operand:V4DI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_0_to_3_operand" "")]))))]
+  "ISA_HAS_LASX"
+  "xvpickve2gr.d<u>\t%0,%u1,%2"
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "V4DI")])
+
+(define_expand "vec_set<mode>"
+  [(match_operand:ILASX_DW 0 "register_operand")
+   (match_operand:<UNITMODE> 1 "reg_or_0_operand")
+   (match_operand 2 "const_<indeximm256>_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsgr2vr_<lasxfmt_f_wd> (operands[0], operands[1],
+                      operands[0], index));
+  DONE;
+})
+
+(define_expand "vec_set<mode>"
+  [(match_operand:FLASX 0 "register_operand")
+   (match_operand:<UNITMODE> 1 "reg_or_0_operand")
+   (match_operand 2 "const_<indeximm256>_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsve0_<lasxfmt_f>_scalar (operands[0], operands[1],
+                      operands[0], index));
+  DONE;
+})
+
+(define_expand "vec_extract<mode><unitmode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LASX 1 "register_operand")
+   (match_operand 2 "const_<indeximm256>_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vector_extract (operands[0], operands[1],
+      INTVAL (operands[2]));
+  DONE;
+})
+
+(define_expand "vec_perm<mode>"
+ [(match_operand:LASX 0 "register_operand")
+  (match_operand:LASX 1 "register_operand")
+  (match_operand:LASX 2 "register_operand")
+  (match_operand:<VIMODE256> 3 "register_operand")]
+  "ISA_HAS_LASX"
+{
+   loongarch_expand_vec_perm_1 (operands);
+   DONE;
+})
+
+;; FIXME: 256??
+(define_expand "vcondu<LASX:mode><ILASX:mode>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand:LASX 1 "reg_or_m1_operand")
+   (match_operand:LASX 2 "reg_or_0_operand")
+   (match_operator 3 ""
+    [(match_operand:ILASX 4 "register_operand")
+     (match_operand:ILASX 5 "register_operand")])]
+  "ISA_HAS_LASX
+   && (GET_MODE_NUNITS (<LASX:MODE>mode)
+       == GET_MODE_NUNITS (<ILASX:MODE>mode))"
+{
+  loongarch_expand_vec_cond_expr (<LASX:MODE>mode, <LASX:VIMODE256>mode,
+				  operands);
+  DONE;
+})
+
+;; FIXME: 256??
+(define_expand "vcond<LASX:mode><LASX_2:mode>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand:LASX 1 "reg_or_m1_operand")
+   (match_operand:LASX 2 "reg_or_0_operand")
+   (match_operator 3 ""
+     [(match_operand:LASX_2 4 "register_operand")
+      (match_operand:LASX_2 5 "register_operand")])]
+  "ISA_HAS_LASX
+   && (GET_MODE_NUNITS (<LASX:MODE>mode)
+       == GET_MODE_NUNITS (<LASX_2:MODE>mode))"
+{
+  loongarch_expand_vec_cond_expr (<LASX:MODE>mode, <LASX:VIMODE256>mode,
+				  operands);
+  DONE;
+})
+
+;; Same as the vcond_ expanders above.
+(define_expand "vcond_mask_<ILASX:mode><ILASX:mode>"
+  [(match_operand:ILASX 0 "register_operand")
+   (match_operand:ILASX 1 "reg_or_m1_operand")
+   (match_operand:ILASX 2 "reg_or_0_operand")
+   (match_operand:ILASX 3 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_cond_mask_expr (<ILASX:MODE>mode,
+				      <ILASX:VIMODE256>mode, operands);
+  DONE;
+})
+
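+;; xvrepli broadcasts a signed 10-bit immediate; for V32QI the value is first
+;; truncated to the 8-bit element width before building the constant vector.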
+(define_expand "lasx_xvrepli<mode>"
+  [(match_operand:ILASX 0 "register_operand")
+   (match_operand 1 "const_imm10_operand")]
+  "ISA_HAS_LASX"
+{
+  if (<MODE>mode == V32QImode)
+    operands[1] = GEN_INT (trunc_int_for_mode (INTVAL (operands[1]),
+					       <UNITMODE>mode));
+  emit_move_insn (operands[0],
+		  loongarch_gen_const_int_vector (<MODE>mode,
+						  INTVAL (operands[1])));
+  DONE;
+})
+
+(define_expand "mov<mode>"
+  [(set (match_operand:LASX 0)
+	(match_operand:LASX 1))]
+  "ISA_HAS_LASX"
+{
+  if (loongarch_legitimize_move (<MODE>mode, operands[0], operands[1]))
+    DONE;
+})
+
+
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:LASX 0)
+	(match_operand:LASX 1))]
+  "ISA_HAS_LASX"
+{
+  if (loongarch_legitimize_move (<MODE>mode, operands[0], operands[1]))
+    DONE;
+})
+
+;; 256-bit LASX modes can only exist in LASX registers or memory.
+(define_insn "mov<mode>_lasx"
+  [(set (match_operand:LASX 0 "nonimmediate_operand" "=f,f,R,*r,*f")
+	(match_operand:LASX 1 "move_operand" "fYGYI,R,f,*f,*r"))]
+  "ISA_HAS_LASX"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "type" "simd_move,simd_load,simd_store,simd_copy,simd_insert")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "8,4,4,4,4")])
+
+
+(define_split
+  [(set (match_operand:LASX 0 "nonimmediate_operand")
+	(match_operand:LASX 1 "move_operand"))]
+  "reload_completed && ISA_HAS_LASX
+   && loongarch_split_move_insn_p (operands[0], operands[1])"
+  [(const_int 0)]
+{
+  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  DONE;
+})
+
+;; Offset load
+(define_expand "lasx_mxld_<lasxfmt_f>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq10<lasxfmt>_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (operands[0], gen_rtx_MEM (<MODE>mode, addr));
+  DONE;
+})
+
+;; Offset store
+(define_expand "lasx_mxst_<lasxfmt_f>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq10<lasxfmt>_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (gen_rtx_MEM (<MODE>mode, addr), operands[0]);
+  DONE;
+})
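+;; Both offset expanders above just form MEM[operands[1] + operands[2]] and
+;; emit an ordinary vector move, which mov<mode>/mov<mode>_lasx then handles;
+;; e.g. (illustrative) an offset operand of 32 becomes an access to
+;; MEM[base + 32].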
+
+;; LASX
+(define_insn "add<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f,f")
+	(plus:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_ximm5_operand" "f,Unv5,Uuv5")))]
+  "ISA_HAS_LASX"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "xvadd.<lasxfmt>\t%u0,%u1,%u2";
+    case 1:
+      {
+	HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0));
+
+	operands[2] = GEN_INT (-val);
+	return "xvsubi.<lasxfmt_u>\t%u0,%u1,%d2";
+      }
+    case 2:
+      return "xvaddi.<lasxfmt_u>\t%u0,%u1,%E2";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "alu_type" "simd_add")
+   (set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(minus:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvsub.<lasxfmt>\t%u0,%u1,%u2
+   xvsubi.<lasxfmt_u>\t%u0,%u1,%E2"
+  [(set_attr "alu_type" "simd_add")
+   (set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(mult:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		    (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvmul.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvmadd_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(plus:ILASX (mult:ILASX (match_operand:ILASX 2 "register_operand" "f")
+				(match_operand:ILASX 3 "register_operand" "f"))
+		    (match_operand:ILASX 1 "register_operand" "0")))]
+  "ISA_HAS_LASX"
+  "xvmadd.<lasxfmt>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+
+
+(define_insn "lasx_xvmsub_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(minus:ILASX (match_operand:ILASX 1 "register_operand" "0")
+		     (mult:ILASX (match_operand:ILASX 2 "register_operand" "f")
+				 (match_operand:ILASX 3 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvmsub.<lasxfmt>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "div<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(div:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		   (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_lsx_output_division ("xvdiv.<lasxfmt>\t%u0,%u1,%u2",
+					operands);
+}
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "udiv<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(udiv:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		    (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_lsx_output_division ("xvdiv.<lasxfmt_u>\t%u0,%u1,%u2",
+					operands);
+}
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mod<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(mod:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		   (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_lsx_output_division ("xvmod.<lasxfmt>\t%u0,%u1,%u2",
+					operands);
+}
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umod<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(umod:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		    (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_lsx_output_division ("xvmod.<lasxfmt_u>\t%u0,%u1,%u2",
+					operands);
+}
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "xor<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f,f")
+	(xor:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  "ISA_HAS_LASX"
+  "@
+   xvxor.v\t%u0,%u1,%u2
+   xvbitrevi.%v0\t%u0,%u1,%V2
+   xvxori.b\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "ior<mode>3"
+  [(set (match_operand:LASX 0 "register_operand" "=f,f,f")
+	(ior:LASX
+	  (match_operand:LASX 1 "register_operand" "f,f,f")
+	  (match_operand:LASX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  "ISA_HAS_LASX"
+  "@
+   xvor.v\t%u0,%u1,%u2
+   xvbitseti.%v0\t%u0,%u1,%V2
+   xvori.b\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "and<mode>3"
+  [(set (match_operand:LASX 0 "register_operand" "=f,f,f")
+	(and:LASX
+	  (match_operand:LASX 1 "register_operand" "f,f,f")
+	  (match_operand:LASX 2 "reg_or_vector_same_val_operand" "f,YZ,Urv8")))]
+  "ISA_HAS_LASX"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "xvand.v\t%u0,%u1,%u2";
+    case 1:
+      {
+	rtx elt0 = CONST_VECTOR_ELT (operands[2], 0);
+	unsigned HOST_WIDE_INT val = ~UINTVAL (elt0);
+	operands[2] = loongarch_gen_const_int_vector (<MODE>mode, val & (-val));
+	return "xvbitclri.%v0\t%u0,%u1,%V2";
+      }
+    case 2:
+      return "xvandi.b\t%u0,%u1,%B2";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "one_cmpl<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(not:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvnor.v\t%u0,%u1,%u1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V32QI")])
+
+;; LASX
+(define_insn "vlshr<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(lshiftrt:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LASX"
+  "@
+   xvsrl.<lasxfmt>\t%u0,%u1,%u2
+   xvsrli.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; LASX ">>"
+(define_insn "vashr<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(ashiftrt:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LASX"
+  "@
+   xvsra.<lasxfmt>\t%u0,%u1,%u2
+   xvsrai.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; LASX "<<"
+(define_insn "vashl<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(ashift:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LASX"
+  "@
+   xvsll.<lasxfmt>\t%u0,%u1,%u2
+   xvslli.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "add<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(plus:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfadd.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(minus:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		     (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfsub.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(mult:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmul.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fmul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "div<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(div:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		   (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfdiv.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fma<mode>4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(fma:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		   (match_operand:FLASX 2 "register_operand" "f")
+		   (match_operand:FLASX 3 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmadd.<flasxfmt>\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fnma<mode>4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(fma:FLASX (neg:FLASX (match_operand:FLASX 1 "register_operand" "f"))
+		   (match_operand:FLASX 2 "register_operand" "f")
+		   (match_operand:FLASX 3 "register_operand" "0")))]
+  "ISA_HAS_LASX"
+  "xvfnmsub.<flasxfmt>\t%u0,%u1,%u2,%u0"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(sqrt:FLASX (match_operand:FLASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfsqrt.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvadda_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(plus:ILASX (abs:ILASX (match_operand:ILASX 1 "register_operand" "f"))
+		    (abs:ILASX (match_operand:ILASX 2 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvadda.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "ssadd<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(ss_plus:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvsadd.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "usadd<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(us_plus:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvsadd.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvabsd_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVABSD_S))]
+  "ISA_HAS_LASX"
+  "xvabsd.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvabsd_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVABSD_U))]
+  "ISA_HAS_LASX"
+  "xvabsd.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvavg_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVAVG_S))]
+  "ISA_HAS_LASX"
+  "xvavg.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvavg_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVAVG_U))]
+  "ISA_HAS_LASX"
+  "xvavg.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvavgr_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVAVGR_S))]
+  "ISA_HAS_LASX"
+  "xvavgr.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvavgr_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVAVGR_U))]
+  "ISA_HAS_LASX"
+  "xvavgr.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitclr_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVBITCLR))]
+  "ISA_HAS_LASX"
+  "xvbitclr.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitclri_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVBITCLRI))]
+  "ISA_HAS_LASX"
+  "xvbitclri.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitrev_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVBITREV))]
+  "ISA_HAS_LASX"
+  "xvbitrev.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitrevi_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		     UNSPEC_LASX_XVBITREVI))]
+  "ISA_HAS_LASX"
+  "xvbitrevi.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitsel_<lasxfmt_f>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(ior:LASX (and:LASX (not:LASX
+			      (match_operand:LASX 3 "register_operand" "f"))
+			      (match_operand:LASX 1 "register_operand" "f"))
+		  (and:LASX (match_dup 3)
+			    (match_operand:LASX 2 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvbitsel.v\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_bitmov")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitseli_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(ior:V32QI (and:V32QI (not:V32QI
+				(match_operand:V32QI 1 "register_operand" "0"))
+			      (match_operand:V32QI 2 "register_operand" "f"))
+		   (and:V32QI (match_dup 1)
+			      (match_operand:V32QI 3 "const_vector_same_val_operand" "Urv8"))))]
+  "ISA_HAS_LASX"
+  "xvbitseli.b\t%u0,%u2,%B3"
+  [(set_attr "type" "simd_bitmov")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvbitset_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVBITSET))]
+  "ISA_HAS_LASX"
+  "xvbitset.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitseti_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVBITSETI))]
+  "ISA_HAS_LASX"
+  "xvbitseti.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvs<ICC:icc>_<ILASX:lasxfmt><cmpi_1>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(ICC:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_<ICC:cmpi>imm5_operand" "f,U<ICC:cmpi>v5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvs<ICC:icc>.<ILASX:lasxfmt><cmpi_1>\t%u0,%u1,%u2
+   xvs<ICC:icci>.<ILASX:lasxfmt><cmpi_1>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "vec_cmp<mode><mode256_i>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand")
+	(match_operator 1 ""
+	  [(match_operand:LASX 2 "register_operand")
+	   (match_operand:LASX 3 "register_operand")]))]
+  "ISA_HAS_LASX"
+{
+  bool ok = loongarch_expand_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpu<ILASX:mode><mode256_i>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand")
+	(match_operator 1 ""
+	  [(match_operand:ILASX 2 "register_operand")
+	   (match_operand:ILASX 3 "register_operand")]))]
+  "ISA_HAS_LASX"
+{
+  bool ok = loongarch_expand_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
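+;; Both comparison expanders produce an integer mask in <VIMODE256>
+;; (all-ones elements where the comparison holds, zero elsewhere), which is
+;; the form consumed by vcond_mask above; e.g. (illustrative) comparing two
+;; V8SF inputs yields a V8SI mask.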
+
+(define_insn "lasx_xvfclass_<flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")]
+			    UNSPEC_LASX_XVFCLASS))]
+  "ISA_HAS_LASX"
+  "xvfclass.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fclass")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfcmp_caf_<flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")
+			     (match_operand:FLASX 2 "register_operand" "f")]
+			    UNSPEC_LASX_XVFCMP_CAF))]
+  "ISA_HAS_LASX"
+  "xvfcmp.caf.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfcmp_cune_<FLASX:flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")
+			     (match_operand:FLASX 2 "register_operand" "f")]
+			    UNSPEC_LASX_XVFCMP_CUNE))]
+  "ISA_HAS_LASX"
+  "xvfcmp.cune.<FLASX:flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+
+
+(define_int_iterator FSC256_UNS [UNSPEC_LASX_XVFCMP_SAF UNSPEC_LASX_XVFCMP_SUN
+				 UNSPEC_LASX_XVFCMP_SOR UNSPEC_LASX_XVFCMP_SEQ
+				 UNSPEC_LASX_XVFCMP_SNE UNSPEC_LASX_XVFCMP_SUEQ
+				 UNSPEC_LASX_XVFCMP_SUNE UNSPEC_LASX_XVFCMP_SULE
+				 UNSPEC_LASX_XVFCMP_SULT UNSPEC_LASX_XVFCMP_SLE
+				 UNSPEC_LASX_XVFCMP_SLT])
+
+(define_int_attr fsc256
+  [(UNSPEC_LASX_XVFCMP_SAF  "saf")
+   (UNSPEC_LASX_XVFCMP_SUN  "sun")
+   (UNSPEC_LASX_XVFCMP_SOR  "sor")
+   (UNSPEC_LASX_XVFCMP_SEQ  "seq")
+   (UNSPEC_LASX_XVFCMP_SNE  "sne")
+   (UNSPEC_LASX_XVFCMP_SUEQ "sueq")
+   (UNSPEC_LASX_XVFCMP_SUNE "sune")
+   (UNSPEC_LASX_XVFCMP_SULE "sule")
+   (UNSPEC_LASX_XVFCMP_SULT "sult")
+   (UNSPEC_LASX_XVFCMP_SLE  "sle")
+   (UNSPEC_LASX_XVFCMP_SLT  "slt")])
+
+(define_insn "lasx_xvfcmp_<vfcond:fcc>_<FLASX:flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(vfcond:<VIMODE256> (match_operand:FLASX 1 "register_operand" "f")
+			    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfcmp.<vfcond:fcc>.<FLASX:flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "lasx_xvfcmp_<fsc256>_<FLASX:flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")
+			     (match_operand:FLASX 2 "register_operand" "f")]
+			    FSC256_UNS))]
+  "ISA_HAS_LASX"
+  "xvfcmp.<fsc256>.<FLASX:flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_mode_attr fint256
+  [(V8SF "v8si")
+   (V4DF "v4di")])
+
+(define_mode_attr FINTCNV256
+  [(V8SF "I2S")
+   (V4DF "I2D")])
+
+(define_mode_attr FINTCNV256_2
+  [(V8SF "S2I")
+   (V4DF "D2I")])
+
+(define_insn "float<fint256><FLASX:mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(float:FLASX (match_operand:<VIMODE256> 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvffint.<flasxfmt>.<ilasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "floatuns<fint256><FLASX:mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unsigned_float:FLASX
+	  (match_operand:<VIMODE256> 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvffint.<flasxfmt>.<ilasxfmt_u>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256>")
+   (set_attr "mode" "<MODE>")])
+
+(define_mode_attr FFQ256
+  [(V4SF "V16HI")
+   (V2DF "V8SI")])
+
+(define_insn "lasx_xvreplgr2vr_<lasxfmt_f>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(vec_duplicate:ILASX
+	  (match_operand:<UNITMODE> 1 "reg_or_0_operand" "r,J")))]
+  "ISA_HAS_LASX"
+{
+  if (which_alternative == 1)
+    return "xvldi.b\t%u0,0" ;
+
+  if (!TARGET_64BIT && (<MODE>mode == V2DImode || <MODE>mode == V2DFmode))
+    return "#";
+  else
+    return "xvreplgr2vr.<lasxfmt>\t%u0,%z1";
+}
+  [(set_attr "type" "simd_fill")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "8")])
+
+(define_insn "logb<mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVFLOGB))]
+  "ISA_HAS_LASX"
+  "xvflogb.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_flog2")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(smax:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmax.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfmaxa_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(if_then_else:FLASX
+	   (gt (abs:FLASX (match_operand:FLASX 1 "register_operand" "f"))
+	       (abs:FLASX (match_operand:FLASX 2 "register_operand" "f")))
+	   (match_dup 1)
+	   (match_dup 2)))]
+  "ISA_HAS_LASX"
+  "xvfmaxa.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(smin:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmin.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfmina_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(if_then_else:FLASX
+	   (lt (abs:FLASX (match_operand:FLASX 1 "register_operand" "f"))
+	       (abs:FLASX (match_operand:FLASX 2 "register_operand" "f")))
+	   (match_dup 1)
+	   (match_dup 2)))]
+  "ISA_HAS_LASX"
+  "xvfmina.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrecip_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVFRECIP))]
+  "ISA_HAS_LASX"
+  "xvfrecip.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrint_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVFRINT))]
+  "ISA_HAS_LASX"
+  "xvfrint.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrsqrt_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVFRSQRT))]
+  "ISA_HAS_LASX"
+  "xvfrsqrt.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvftint_s_<ilasxfmt>_<flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")]
+			    UNSPEC_LASX_XVFTINT_S))]
+  "ISA_HAS_LASX"
+  "xvftint.<ilasxfmt>.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvftint_u_<ilasxfmt_u>_<flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")]
+			    UNSPEC_LASX_XVFTINT_U))]
+  "ISA_HAS_LASX"
+  "xvftint.<ilasxfmt_u>.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256_2>")
+   (set_attr "mode" "<MODE>")])
+
+
+
+(define_insn "fix_trunc<FLASX:mode><mode256_i>2"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(fix:<VIMODE256> (match_operand:FLASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvftintrz.<ilasxfmt>.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256_2>")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "fixuns_trunc<FLASX:mode><mode256_i>2"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unsigned_fix:<VIMODE256> (match_operand:FLASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvftintrz.<ilasxfmt_u>.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvh<optab>w_h<u>_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addsub:V16HI
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))))]
+  "ISA_HAS_LASX"
+  "xvh<optab>w.h<u>.b<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvh<optab>w_w<u>_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addsub:V8SI
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LASX"
+  "xvh<optab>w.w<u>.h<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvh<optab>w_d<u>_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addsub:V4DI
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LASX"
+  "xvh<optab>w.d<u>.w<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvpackev_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0)  (const_int 32)
+		     (const_int 2)  (const_int 34)
+		     (const_int 4)  (const_int 36)
+		     (const_int 6)  (const_int 38)
+		     (const_int 8)  (const_int 40)
+		     (const_int 10)  (const_int 42)
+		     (const_int 12)  (const_int 44)
+		     (const_int 14)  (const_int 46)
+		     (const_int 16)  (const_int 48)
+		     (const_int 18)  (const_int 50)
+		     (const_int 20)  (const_int 52)
+		     (const_int 22)  (const_int 54)
+		     (const_int 24)  (const_int 56)
+		     (const_int 26)  (const_int 58)
+		     (const_int 28)  (const_int 60)
+		     (const_int 30)  (const_int 62)])))]
+  "ISA_HAS_LASX"
+  "xvpackev.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+
+(define_insn "lasx_xvpackev_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0)  (const_int 16)
+		     (const_int 2)  (const_int 18)
+		     (const_int 4)  (const_int 20)
+		     (const_int 6)  (const_int 22)
+		     (const_int 8)  (const_int 24)
+		     (const_int 10) (const_int 26)
+		     (const_int 12) (const_int 28)
+		     (const_int 14) (const_int 30)])))]
+  "ISA_HAS_LASX"
+  "xvpackev.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvpackev_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 2) (const_int 10)
+		     (const_int 4) (const_int 12)
+		     (const_int 6) (const_int 14)])))]
+  "ISA_HAS_LASX"
+  "xvpackev.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvpackev_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 2) (const_int 10)
+		     (const_int 4) (const_int 12)
+		     (const_int 6) (const_int 14)])))]
+  "ISA_HAS_LASX"
+  "xvpackev.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvilvh_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 8) (const_int 40)
+		     (const_int 9) (const_int 41)
+		     (const_int 10) (const_int 42)
+		     (const_int 11) (const_int 43)
+		     (const_int 12) (const_int 44)
+		     (const_int 13) (const_int 45)
+		     (const_int 14) (const_int 46)
+		     (const_int 15) (const_int 47)
+		     (const_int 24) (const_int 56)
+		     (const_int 25) (const_int 57)
+		     (const_int 26) (const_int 58)
+		     (const_int 27) (const_int 59)
+		     (const_int 28) (const_int 60)
+		     (const_int 29) (const_int 61)
+		     (const_int 30) (const_int 62)
+		     (const_int 31) (const_int 63)])))]
+  "ISA_HAS_LASX"
+  "xvilvh.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvilvh_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 4) (const_int 20)
+		     (const_int 5) (const_int 21)
+		     (const_int 6) (const_int 22)
+		     (const_int 7) (const_int 23)
+		     (const_int 12) (const_int 28)
+		     (const_int 13) (const_int 29)
+		     (const_int 14) (const_int 30)
+		     (const_int 15) (const_int 31)])))]
+  "ISA_HAS_LASX"
+  "xvilvh.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_mode_attr xvilvh_suffix
+  [(V8SI "") (V8SF "_f")
+   (V4DI "") (V4DF "_f")])
+
+(define_insn "lasx_xvilvh_w<xvilvh_suffix>"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+	(vec_select:LASX_W
+	  (vec_concat:<VEMODE256>
+	    (match_operand:LASX_W 1 "register_operand" "f")
+	    (match_operand:LASX_W 2 "register_operand" "f"))
+	  (parallel [(const_int 2) (const_int 10)
+		     (const_int 3) (const_int 11)
+		     (const_int 6) (const_int 14)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvilvh.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvilvh_d<xvilvh_suffix>"
+  [(set (match_operand:LASX_D 0 "register_operand" "=f")
+	(vec_select:LASX_D
+	  (vec_concat:<VEMODE256>
+	    (match_operand:LASX_D 1 "register_operand" "f")
+	    (match_operand:LASX_D 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 5)
+		     (const_int 3) (const_int 7)])))]
+  "ISA_HAS_LASX"
+  "xvilvh.d\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpackod_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 1)  (const_int 33)
+		     (const_int 3)  (const_int 35)
+		     (const_int 5)  (const_int 37)
+		     (const_int 7)  (const_int 39)
+		     (const_int 9)  (const_int 41)
+		     (const_int 11)  (const_int 43)
+		     (const_int 13)  (const_int 45)
+		     (const_int 15)  (const_int 47)
+		     (const_int 17)  (const_int 49)
+		     (const_int 19)  (const_int 51)
+		     (const_int 21)  (const_int 53)
+		     (const_int 23)  (const_int 55)
+		     (const_int 25)  (const_int 57)
+		     (const_int 27)  (const_int 59)
+		     (const_int 29)  (const_int 61)
+		     (const_int 31)  (const_int 63)])))]
+  "ISA_HAS_LASX"
+  "xvpackod.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+
+(define_insn "lasx_xvpackod_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 17)
+		     (const_int 3) (const_int 19)
+		     (const_int 5) (const_int 21)
+		     (const_int 7) (const_int 23)
+		     (const_int 9) (const_int 25)
+		     (const_int 11) (const_int 27)
+		     (const_int 13) (const_int 29)
+		     (const_int 15) (const_int 31)])))]
+  "ISA_HAS_LASX"
+  "xvpackod.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+
+(define_insn "lasx_xvpackod_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 9)
+		     (const_int 3) (const_int 11)
+		     (const_int 5) (const_int 13)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvpackod.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+
+(define_insn "lasx_xvpackod_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 9)
+		     (const_int 3) (const_int 11)
+		     (const_int 5) (const_int 13)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvpackod.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvilvl_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 32)
+		     (const_int 1) (const_int 33)
+		     (const_int 2) (const_int 34)
+		     (const_int 3) (const_int 35)
+		     (const_int 4) (const_int 36)
+		     (const_int 5) (const_int 37)
+		     (const_int 6) (const_int 38)
+		     (const_int 7) (const_int 39)
+		     (const_int 16) (const_int 48)
+		     (const_int 17) (const_int 49)
+		     (const_int 18) (const_int 50)
+		     (const_int 19) (const_int 51)
+		     (const_int 20) (const_int 52)
+		     (const_int 21) (const_int 53)
+		     (const_int 22) (const_int 54)
+		     (const_int 23) (const_int 55)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvilvl_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 16)
+		     (const_int 1) (const_int 17)
+		     (const_int 2) (const_int 18)
+		     (const_int 3) (const_int 19)
+		     (const_int 8) (const_int 24)
+		     (const_int 9) (const_int 25)
+		     (const_int 10) (const_int 26)
+		     (const_int 11) (const_int 27)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvilvl_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 1) (const_int 9)
+		     (const_int 4) (const_int 12)
+		     (const_int 5) (const_int 13)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvilvl_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 1) (const_int 9)
+		     (const_int 4) (const_int 12)
+		     (const_int 5) (const_int 13)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvilvl_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(vec_select:V4DI
+	  (vec_concat:V8DI
+	    (match_operand:V4DI 1 "register_operand" "f")
+	    (match_operand:V4DI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 2) (const_int 6)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.d\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvilvl_d_f"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(vec_select:V4DF
+	  (vec_concat:V8DF
+	    (match_operand:V4DF 1 "register_operand" "f")
+	    (match_operand:V4DF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 2) (const_int 6)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.d\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(smax:ILASX (match_operand:ILASX 1 "register_operand" "f,f")
+		    (match_operand:ILASX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvmax.<lasxfmt>\t%u0,%u1,%u2
+   xvmaxi.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umax<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(umax:ILASX (match_operand:ILASX 1 "register_operand" "f,f")
+		    (match_operand:ILASX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvmax.<lasxfmt_u>\t%u0,%u1,%u2
+   xvmaxi.<lasxfmt_u>\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(smin:ILASX (match_operand:ILASX 1 "register_operand" "f,f")
+		    (match_operand:ILASX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvmin.<lasxfmt>\t%u0,%u1,%u2
+   xvmini.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umin<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(umin:ILASX (match_operand:ILASX 1 "register_operand" "f,f")
+		    (match_operand:ILASX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvmin.<lasxfmt_u>\t%u0,%u1,%u2
+   xvmini.<lasxfmt_u>\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvclo_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(clz:ILASX (not:ILASX (match_operand:ILASX 1 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvclo.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "clz<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(clz:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvclz.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvnor_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(and:ILASX (not:ILASX (match_operand:ILASX 1 "register_operand" "f,f"))
+		   (not:ILASX (match_operand:ILASX 2 "reg_or_vector_same_val_operand" "f,Urv8"))))]
+  "ISA_HAS_LASX"
+  "@
+   xvnor.v\t%u0,%u1,%u2
+   xvnori.b\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpickev_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)
+		     (const_int 4) (const_int 6)
+		     (const_int 8) (const_int 10)
+		     (const_int 12) (const_int 14)
+		     (const_int 32) (const_int 34)
+		     (const_int 36) (const_int 38)
+		     (const_int 40) (const_int 42)
+		     (const_int 44) (const_int 46)
+		     (const_int 16) (const_int 18)
+		     (const_int 20) (const_int 22)
+		     (const_int 24) (const_int 26)
+		     (const_int 28) (const_int 30)
+		     (const_int 48) (const_int 50)
+		     (const_int 52) (const_int 54)
+		     (const_int 56) (const_int 58)
+		     (const_int 60) (const_int 62)])))]
+  "ISA_HAS_LASX"
+  "xvpickev.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvpickev_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)
+		     (const_int 4) (const_int 6)
+		     (const_int 16) (const_int 18)
+		     (const_int 20) (const_int 22)
+		     (const_int 8) (const_int 10)
+		     (const_int 12) (const_int 14)
+		     (const_int 24) (const_int 26)
+		     (const_int 28) (const_int 30)])))]
+  "ISA_HAS_LASX"
+  "xvpickev.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvpickev_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)
+		     (const_int 8) (const_int 10)
+		     (const_int 4) (const_int 6)
+		     (const_int 12) (const_int 14)])))]
+  "ISA_HAS_LASX"
+  "xvpickev.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvpickev_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)
+		     (const_int 8) (const_int 10)
+		     (const_int 4) (const_int 6)
+		     (const_int 12) (const_int 14)])))]
+  "ISA_HAS_LASX"
+  "xvpickev.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvpickod_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)
+		     (const_int 5) (const_int 7)
+		     (const_int 9) (const_int 11)
+		     (const_int 13) (const_int 15)
+		     (const_int 33) (const_int 35)
+		     (const_int 37) (const_int 39)
+		     (const_int 41) (const_int 43)
+		     (const_int 45) (const_int 47)
+		     (const_int 17) (const_int 19)
+		     (const_int 21) (const_int 23)
+		     (const_int 25) (const_int 27)
+		     (const_int 29) (const_int 31)
+		     (const_int 49) (const_int 51)
+		     (const_int 53) (const_int 55)
+		     (const_int 57) (const_int 59)
+		     (const_int 61) (const_int 63)])))]
+  "ISA_HAS_LASX"
+  "xvpickod.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvpickod_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)
+		     (const_int 5) (const_int 7)
+		     (const_int 17) (const_int 19)
+		     (const_int 21) (const_int 23)
+		     (const_int 9) (const_int 11)
+		     (const_int 13) (const_int 15)
+		     (const_int 25) (const_int 27)
+		     (const_int 29) (const_int 31)])))]
+  "ISA_HAS_LASX"
+  "xvpickod.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvpickod_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)
+		     (const_int 9) (const_int 11)
+		     (const_int 5) (const_int 7)
+		     (const_int 13) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvpickod.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvpickod_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)
+		     (const_int 9) (const_int 11)
+		     (const_int 5) (const_int 7)
+		     (const_int 13) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvpickod.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "popcount<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(popcount:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvpcnt.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_pcnt")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "lasx_xvsat_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm256>_operand" "")]
+		     UNSPEC_LASX_XVSAT_S))]
+  "ISA_HAS_LASX"
+  "xvsat.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_sat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsat_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVSAT_U))]
+  "ISA_HAS_LASX"
+  "xvsat.<lasxfmt_u>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_sat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf4i_<lasxfmt_f>"
+  [(set (match_operand:LASX_WHB_W 0 "register_operand" "=f")
+	(unspec:LASX_WHB_W [(match_operand:LASX_WHB_W 1 "register_operand" "f")
+			    (match_operand 2 "const_uimm8_operand")]
+			   UNSPEC_LASX_XVSHUF4I))]
+  "ISA_HAS_LASX"
+  "xvshuf4i.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf4i_<lasxfmt_f>_1"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+	(vec_select:LASX_W
+	  (match_operand:LASX_W 1 "nonimmediate_operand" "f")
+	  (parallel [(match_operand 2 "const_0_to_3_operand")
+		     (match_operand 3 "const_0_to_3_operand")
+		     (match_operand 4 "const_0_to_3_operand")
+		     (match_operand 5 "const_0_to_3_operand")
+		     (match_operand 6 "const_4_to_7_operand")
+		     (match_operand 7 "const_4_to_7_operand")
+		     (match_operand 8 "const_4_to_7_operand")
+		     (match_operand 9 "const_4_to_7_operand")])))]
+  "ISA_HAS_LASX
+   && INTVAL (operands[2]) + 4 == INTVAL (operands[6])
+   && INTVAL (operands[3]) + 4 == INTVAL (operands[7])
+   && INTVAL (operands[4]) + 4 == INTVAL (operands[8])
+   && INTVAL (operands[5]) + 4 == INTVAL (operands[9])"
+{
+  int mask = 0;
+  mask |= INTVAL (operands[2]) << 0;
+  mask |= INTVAL (operands[3]) << 2;
+  mask |= INTVAL (operands[4]) << 4;
+  mask |= INTVAL (operands[5]) << 6;
+  operands[2] = GEN_INT (mask);
+
+  return "xvshuf4i.w\t%u0,%u1,%2";
+}
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrar_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVSRAR))]
+  "ISA_HAS_LASX"
+  "xvsrar.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrari_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVSRARI))]
+  "ISA_HAS_LASX"
+  "xvsrari.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrlr_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVSRLR))]
+  "ISA_HAS_LASX"
+  "xvsrlr.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrlri_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVSRLRI))]
+  "ISA_HAS_LASX"
+  "xvsrlri.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssub_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(ss_minus:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvssub.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssub_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(us_minus:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvssub.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf_<lasxfmt_f>"
+  [(set (match_operand:LASX_DWH 0 "register_operand" "=f")
+	(unspec:LASX_DWH [(match_operand:LASX_DWH 1 "register_operand" "0")
+			  (match_operand:LASX_DWH 2 "register_operand" "f")
+			  (match_operand:LASX_DWH 3 "register_operand" "f")]
+			UNSPEC_LASX_XVSHUF))]
+  "ISA_HAS_LASX"
+  "xvshuf.<lasxfmt>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:V32QI 1 "register_operand" "f")
+		       (match_operand:V32QI 2 "register_operand" "f")
+		       (match_operand:V32QI 3 "register_operand" "f")]
+		      UNSPEC_LASX_XVSHUF_B))]
+  "ISA_HAS_LASX"
+  "xvshuf.b\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvreplve0_<lasxfmt_f>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(vec_duplicate:LASX
+	  (vec_select:<UNITMODE>
+	    (match_operand:LASX 1 "register_operand" "f")
+	    (parallel [(const_int 0)]))))]
+  "ISA_HAS_LASX"
+  "xvreplve0.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvrepl128vei_b_internal"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_duplicate:V32QI
+	  (vec_select:V32QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_uimm4_operand" "")
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_operand 3 "const_16_to_31_operand" "")
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)]))))]
+  "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 16)"
+  "xvrepl128vei.b\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvrepl128vei_h_internal"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_duplicate:V16HI
+	  (vec_select:V16HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_uimm3_operand" "")
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2)
+		       (match_operand 3 "const_8_to_15_operand" "")
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3)]))))]
+  "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 8)"
+  "xvrepl128vei.h\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvrepl128vei_w_internal"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_duplicate:V8SI
+	  (vec_select:V8SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_0_to_3_operand" "")
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_operand 3 "const_4_to_7_operand" "")
+		       (match_dup 3) (match_dup 3) (match_dup 3)]))))]
+  "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 4)"
+  "xvrepl128vei.w\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvrepl128vei_d_internal"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(vec_duplicate:V4DI
+	  (vec_select:V4DI
+	    (match_operand:V4DI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_0_or_1_operand" "")
+		       (match_dup 2)
+		       (match_operand 3 "const_2_or_3_operand" "")
+		       (match_dup 3)]))))]
+  "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 2)"
+  "xvrepl128vei.d\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvrepl128vei_<lasxfmt_f>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(unspec:LASX [(match_operand:LASX 1 "register_operand" "f")
+		      (match_operand 2 "const_<indeximm_lo>_operand" "")]
+		     UNSPEC_LASX_XVREPL128VEI))]
+  "ISA_HAS_LASX"
+  "xvrepl128vei.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvreplve0_<lasxfmt_f>_scalar"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+    (vec_duplicate:FLASX
+      (match_operand:<UNITMODE> 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvreplve0.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvreplve0_q"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:V32QI 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVREPLVE0_Q))]
+  "ISA_HAS_LASX"
+  "xvreplve0.q\t%u0,%u1"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvfcvt_h_s"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(unspec:V16HI [(match_operand:V8SF 1 "register_operand" "f")
+		       (match_operand:V8SF 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVFCVT))]
+  "ISA_HAS_LASX"
+  "xvfcvt.h.s\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvfcvt_s_d"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFCVT))]
+  "ISA_HAS_LASX"
+  "xvfcvt.s.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "vec_pack_trunc_v4df"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_concat:V8SF
+	  (float_truncate:V4SF (match_operand:V4DF 1 "register_operand" "f"))
+	  (float_truncate:V4SF (match_operand:V4DF 2 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvfcvt.s.d\t%u0,%u2,%u1\n\txvpermi.d\t%u0,%u0,0xd8"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SF")
+   (set_attr "length" "8")])
+
+;; Define for builtin function.
+(define_insn "lasx_xvfcvth_s_h"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V16HI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFCVTH))]
+  "ISA_HAS_LASX"
+  "xvfcvth.s.h\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SF")])
+
+;; Define for builtin function.
+(define_insn "lasx_xvfcvth_d_s"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	 (vec_select:V4SF
+	  (match_operand:V8SF 1 "register_operand" "f")
+	  (parallel [(const_int 2) (const_int 3)
+		      (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "xvfcvth.d.s\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DF")
+   (set_attr "length" "12")])
+
+;; Define for gen insn.
+(define_insn "lasx_xvfcvth_d_insn"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	(vec_select:V4SF
+	  (match_operand:V8SF 1 "register_operand" "f")
+	  (parallel [(const_int 4) (const_int 5)
+		     (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "xvpermi.d\t%u0,%u1,0xfa\n\txvfcvtl.d.s\t%u0,%u0"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DF")
+   (set_attr "length" "12")])
+
+;; Define for builtin function.
+(define_insn "lasx_xvfcvtl_s_h"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V16HI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFCVTL))]
+  "ISA_HAS_LASX"
+  "xvfcvtl.s.h\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SF")])
+
+;; Define for builtin function.
+(define_insn "lasx_xvfcvtl_d_s"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	(vec_select:V4SF
+	  (match_operand:V8SF 1 "register_operand" "f")
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 4) (const_int 5)]))))]
+  "ISA_HAS_LASX"
+  "xvfcvtl.d.s\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DF")
+   (set_attr "length" "8")])
+
+;; Define for gen insn.
+(define_insn "lasx_xvfcvtl_d_insn"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	(vec_select:V4SF
+	  (match_operand:V8SF 1 "register_operand" "f")
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LASX"
+  "xvpermi.d\t%u0,%u1,0x50\n\txvfcvtl.d.s\t%u0,%u0"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DF")
+   (set_attr "length" "8")])
+
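+;; Vector branch support: EQ/NE tests against zero are emitted as xvset*/bcnez sequences (xbz/xbnz).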
+(define_code_attr lasxbr
+  [(eq "xbz")
+   (ne "xbnz")])
+
+(define_code_attr lasxeq_v
+  [(eq "eqz")
+   (ne "nez")])
+
+(define_code_attr lasxne_v
+  [(eq "nez")
+   (ne "eqz")])
+
+(define_code_attr lasxeq
+  [(eq "anyeqz")
+   (ne "allnez")])
+
+(define_code_attr lasxne
+  [(eq "allnez")
+   (ne "anyeqz")])
+
+(define_insn "lasx_<lasxbr>_<lasxfmt_f>"
+  [(set (pc)
+	(if_then_else
+	  (equality_op
+	    (unspec:SI [(match_operand:LASX 1 "register_operand" "f")]
+		       UNSPEC_LASX_BRANCH)
+	    (match_operand:SI 2 "const_0_operand"))
+	  (label_ref (match_operand 0))
+	  (pc)))
+   (clobber (match_scratch:FCC 3 "=z"))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					 "xvset<lasxeq>.<lasxfmt>\t%Z3%u1\n\tbcnez\t%Z3%0",
+					 "xvset<lasxne>.<lasxfmt>\t%Z3%u1\n\tbcnez\t%Z3%0");
+}
+  [(set_attr "type" "simd_branch")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_<lasxbr>_v_<lasxfmt_f>"
+  [(set (pc)
+	(if_then_else
+	  (equality_op
+	    (unspec:SI [(match_operand:LASX 1 "register_operand" "f")]
+		       UNSPEC_LASX_BRANCH_V)
+	    (match_operand:SI 2 "const_0_operand"))
+	  (label_ref (match_operand 0))
+	  (pc)))
+   (clobber (match_scratch:FCC 3 "=z"))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					 "xvset<lasxeq_v>.v\t%Z3%u1\n\tbcnez\t%Z3%0",
+					 "xvset<lasxne_v>.v\t%Z3%u1\n\tbcnez\t%Z3%0");
+}
+  [(set_attr "type" "simd_branch")
+   (set_attr "mode" "<MODE>")])
+
+;; loongson-asx: vext2xv patterns sign/zero-extend the low elements of the source vector to wider elements.
+(define_insn "lasx_vext2xv_h<u>_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(any_extend:V16HI
+	  (vec_select:V16QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)
+		       (const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)
+		       (const_int 8) (const_int 9)
+		       (const_int 10) (const_int 11)
+		       (const_int 12) (const_int 13)
+		       (const_int 14) (const_int 15)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.h<u>.b<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_vext2xv_w<u>_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(any_extend:V8SI
+	  (vec_select:V8HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)
+		       (const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.w<u>.h<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_vext2xv_d<u>_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(any_extend:V4DI
+	  (vec_select:V4SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.d<u>.w<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_vext2xv_w<u>_b<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(any_extend:V8SI
+	  (vec_select:V8QI
+	   (match_operand:V32QI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)
+		       (const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.w<u>.b<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_vext2xv_d<u>_h<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(any_extend:V4DI
+	  (vec_select:V4HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.d<u>.h<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_vext2xv_d<u>_b<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(any_extend:V4DI
+	  (vec_select:V4QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.d<u>.b<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DI")])
+
+;; Extend loongson-sx to loongson-asx.
+(define_insn "xvandn<mode>3"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(and:LASX (not:LASX (match_operand:LASX 1 "register_operand" "f"))
+			    (match_operand:LASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvandn.v\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "abs<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(abs:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvsigncov.<lasxfmt>\t%u0,%u1,%u1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "neg<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(neg:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvneg.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvmuh_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVMUH_S))]
+  "ISA_HAS_LASX"
+  "xvmuh.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvmuh_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVMUH_U))]
+  "ISA_HAS_LASX"
+  "xvmuh.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
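+;; xvsllwil: widen the low elements to double width and shift left by the immediate.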
+(define_insn "lasx_xvsllwil_s_<dlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VDMODE256> 0 "register_operand" "=f")
+	(unspec:<VDMODE256> [(match_operand:ILASX_WHB 1 "register_operand" "f")
+			     (match_operand 2 "const_<bitimm256>_operand" "")]
+			    UNSPEC_LASX_XVSLLWIL_S))]
+  "ISA_HAS_LASX"
+  "xvsllwil.<dlasxfmt>.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsllwil_u_<dlasxfmt_u>_<lasxfmt_u>"
+  [(set (match_operand:<VDMODE256> 0 "register_operand" "=f")
+	(unspec:<VDMODE256> [(match_operand:ILASX_WHB 1 "register_operand" "f")
+			     (match_operand 2 "const_<bitimm256>_operand" "")]
+			    UNSPEC_LASX_XVSLLWIL_U))]
+  "ISA_HAS_LASX"
+  "xvsllwil.<dlasxfmt_u>.<lasxfmt_u>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
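+;; Narrowing right shifts: shift (arithmetic/logical, optionally rounding or saturating) and truncate to half-width elements.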
+(define_insn "lasx_xvsran_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSRAN))]
+  "ISA_HAS_LASX"
+  "xvsran.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssran_s_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRAN_S))]
+  "ISA_HAS_LASX"
+  "xvssran.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssran_u_<hlasxfmt_u>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRAN_U))]
+  "ISA_HAS_LASX"
+  "xvssran.<hlasxfmt_u>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrarn_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSRARN))]
+  "ISA_HAS_LASX"
+  "xvsrarn.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrarn_s_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRARN_S))]
+  "ISA_HAS_LASX"
+  "xvssrarn.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrarn_u_<hlasxfmt_u>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRARN_U))]
+  "ISA_HAS_LASX"
+  "xvssrarn.<hlasxfmt_u>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrln_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSRLN))]
+  "ISA_HAS_LASX"
+  "xvsrln.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrln_u_<hlasxfmt_u>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRLN_U))]
+  "ISA_HAS_LASX"
+  "xvssrln.<hlasxfmt_u>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrlrn_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSRLRN))]
+  "ISA_HAS_LASX"
+  "xvsrlrn.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlrn_u_<hlasxfmt_u>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRLRN_U))]
+  "ISA_HAS_LASX"
+  "xvssrlrn.<hlasxfmt_u>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrstpi_<lasxfmt>"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+	(unspec:ILASX_HB [(match_operand:ILASX_HB 1 "register_operand" "0")
+			  (match_operand:ILASX_HB 2 "register_operand" "f")
+			  (match_operand 3 "const_uimm5_operand" "")]
+			 UNSPEC_LASX_XVFRSTPI))]
+  "ISA_HAS_LASX"
+  "xvfrstpi.<lasxfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrstp_<lasxfmt>"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+	(unspec:ILASX_HB [(match_operand:ILASX_HB 1 "register_operand" "0")
+			  (match_operand:ILASX_HB 2 "register_operand" "f")
+			  (match_operand:ILASX_HB 3 "register_operand" "f")]
+			 UNSPEC_LASX_XVFRSTP))]
+  "ISA_HAS_LASX"
+  "xvfrstp.<lasxfmt>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf4i_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand")]
+		     UNSPEC_LASX_XVSHUF4I))]
+  "ISA_HAS_LASX"
+  "xvshuf4i.d\t%u0,%u2,%3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvbsrl_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_uimm5_operand" "")]
+		      UNSPEC_LASX_XVBSRL_V))]
+  "ISA_HAS_LASX"
+  "xvbsrl.v\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbsll_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_uimm5_operand" "")]
+		      UNSPEC_LASX_XVBSLL_V))]
+  "ISA_HAS_LASX"
+  "xvbsll.v\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvextrins_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVEXTRINS))]
+  "ISA_HAS_LASX"
+  "xvextrins.<lasxfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvmskltz_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVMSKLTZ))]
+  "ISA_HAS_LASX"
+  "xvmskltz.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsigncov_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVSIGNCOV))]
+  "ISA_HAS_LASX"
+  "xvsigncov.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
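+;; copysign: combine the magnitude of operand 1 with the sign bit of operand 2 via a sign-bit mask.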
+(define_expand "copysign<mode>3"
+  [(set (match_dup 4)
+	(and:FLASX
+	  (not:FLASX (match_dup 3))
+	  (match_operand:FLASX 1 "register_operand")))
+   (set (match_dup 5)
+	(and:FLASX (match_dup 3)
+		   (match_operand:FLASX 2 "register_operand")))
+   (set (match_operand:FLASX 0 "register_operand")
+	(ior:FLASX (match_dup 4) (match_dup 5)))]
+  "ISA_HAS_LASX"
+{
+  operands[3] = loongarch_build_signbit_mask (<MODE>mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (<MODE>mode);
+  operands[5] = gen_reg_rtx (<MODE>mode);
+})
+
+
+(define_insn "absv4df2"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(abs:V4DF (match_operand:V4DF 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvbitclri.d\t%u0,%u1,63"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "absv8sf2"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(abs:V8SF (match_operand:V8SF 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvbitclri.w\t%u0,%u1,31"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "negv4df2"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(neg:V4DF (match_operand:V4DF 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvbitrevi.d\t%u0,%u1,63"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "negv8sf2"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(neg:V8SF (match_operand:V8SF 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvbitrevi.w\t%u0,%u1,31"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V8SF")])
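+;; Floating-point fused multiply-add family: xvfmadd, xvfmsub, xvfnmsub and xvfnmadd.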
+
+(define_insn "xvfmadd<mode>4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(fma:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		   (match_operand:FLASX 2 "register_operand" "f")
+		   (match_operand:FLASX 3 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmadd.<flasxfmt>\t%u0,%u1,$u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fms<mode>4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(fma:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		   (match_operand:FLASX 2 "register_operand" "f")
+		   (neg:FLASX (match_operand:FLASX 3 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvfmsub.<flasxfmt>\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "xvfnmsub<mode>4_nmsub4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(neg:FLASX
+	  (fma:FLASX
+	    (match_operand:FLASX 1 "register_operand" "f")
+	    (match_operand:FLASX 2 "register_operand" "f")
+	    (neg:FLASX (match_operand:FLASX 3 "register_operand" "f")))))]
+  "ISA_HAS_LASX"
+  "xvfnmsub.<flasxfmt>\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "xvfnmadd<mode>4_nmadd4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(neg:FLASX
+	  (fma:FLASX
+	    (match_operand:FLASX 1 "register_operand" "f")
+	    (match_operand:FLASX 2 "register_operand" "f")
+	    (match_operand:FLASX 3 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvfnmadd.<flasxfmt>\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
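+;; Float/integer vector conversions, including forms with explicit rounding modes (rne/rz/rp/rm).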
+(define_insn "lasx_xvftintrne_w_s"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNE_W_S))]
+  "ISA_HAS_LASX"
+  "xvftintrne.w.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrne_l_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNE_L_D))]
+  "ISA_HAS_LASX"
+  "xvftintrne.l.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrp_w_s"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRP_W_S))]
+  "ISA_HAS_LASX"
+  "xvftintrp.w.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrp_l_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRP_L_D))]
+  "ISA_HAS_LASX"
+  "xvftintrp.l.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrm_w_s"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRM_W_S))]
+  "ISA_HAS_LASX"
+  "xvftintrm.w.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrm_l_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRM_L_D))]
+  "ISA_HAS_LASX"
+  "xvftintrm.l.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftint_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINT_W_D))]
+  "ISA_HAS_LASX"
+  "xvftint.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvffint_s_l"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFFINT_S_L))]
+  "ISA_HAS_LASX"
+  "xvffint.s.l\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvftintrz_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRZ_W_D))]
+  "ISA_HAS_LASX"
+  "xvftintrz.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrp_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRP_W_D))]
+  "ISA_HAS_LASX"
+  "xvftintrp.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrm_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRM_W_D))]
+  "ISA_HAS_LASX"
+  "xvftintrm.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrne_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNE_W_D))]
+  "ISA_HAS_LASX"
+  "xvftintrne.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
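+;; High/low-half conversions: convert one half of the source vector into double-width elements.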
+(define_insn "lasx_xvftinth_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftinth.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintl_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTL_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintl.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvffinth_d_w"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V8SI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFFINTH_D_W))]
+  "ISA_HAS_LASX"
+  "xvffinth.d.w\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvffintl_d_w"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V8SI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFFINTL_D_W))]
+  "ISA_HAS_LASX"
+  "xvffintl.d.w\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvftintrzh_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRZH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrzh.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrzl_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRZL_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrzl.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lasx_xvftintrph_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRPH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrph.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lasx_xvftintrpl_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRPL_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrpl.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrmh_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRMH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrmh.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrml_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRML_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrml.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrneh_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNEH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrneh.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrnel_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNEL_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrnel.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
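+;; xvfrint with explicit rounding modes: round each element to an integral value, keeping the floating-point format.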
+(define_insn "lasx_xvfrintrne_s"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRNE_S))]
+  "ISA_HAS_LASX"
+  "xvfrintrne.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrne_d"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRNE_D))]
+  "ISA_HAS_LASX"
+  "xvfrintrne.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvfrintrz_s"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRZ_S))]
+  "ISA_HAS_LASX"
+  "xvfrintrz.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrz_d"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRZ_D))]
+  "ISA_HAS_LASX"
+  "xvfrintrz.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvfrintrp_s"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRP_S))]
+  "ISA_HAS_LASX"
+  "xvfrintrp.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrp_d"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRP_D))]
+  "ISA_HAS_LASX"
+  "xvfrintrp.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvfrintrm_s"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRM_S))]
+  "ISA_HAS_LASX"
+  "xvfrintrm.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrm_d"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRM_D))]
+  "ISA_HAS_LASX"
+  "xvfrintrm.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+;; Vector versions of the floating-point frint patterns.
+;; Expands to btrunc, ceil, floor, rint.
+(define_insn "<FRINT256_S:frint256_pattern_s>v8sf2"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+			 FRINT256_S))]
+  "ISA_HAS_LASX"
+  "xvfrint<FRINT256_S:frint256_suffix>.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "<FRINT256_D:frint256_pattern_d>v4df2"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+			 FRINT256_D))]
+  "ISA_HAS_LASX"
+  "xvfrint<FRINT256_D:frint256_suffix>.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+;; Expands to round.
+(define_insn "round<mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+			 UNSPEC_LASX_XVFRINT))]
+  "ISA_HAS_LASX"
+  "xvfrint.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; Offset load and broadcast
+(define_expand "lasx_xvldrepl_<lasxfmt_f>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand 2 "aq12<lasxfmt>_operand")
+   (match_operand 1 "pmode_register_operand")]
+  "ISA_HAS_LASX"
+{
+  emit_insn (gen_lasx_xvldrepl_<lasxfmt_f>_insn
+	     (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
+(define_insn "lasx_xvldrepl_<lasxfmt_f>_insn"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(vec_duplicate:LASX
+	  (mem:<UNITMODE> (plus:DI (match_operand:DI 1 "register_operand" "r")
+				   (match_operand 2 "aq12<lasxfmt>_operand")))))]
+  "ISA_HAS_LASX"
+{
+  return "xvldrepl.<lasxfmt>\t%u0,%1,%2";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;; Offset is "0"
+(define_insn "lasx_xvldrepl_<lasxfmt_f>_insn_0"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+    (vec_duplicate:LASX
+      (mem:<UNITMODE> (match_operand:DI 1 "register_operand" "r"))))]
+  "ISA_HAS_LASX"
+{
+    return "xvldrepl.<lasxfmt>\t%u0,%1,0";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;;XVADDWEV.H.B   XVSUBWEV.H.B   XVMULWEV.H.B
+;;XVADDWEV.H.BU  XVSUBWEV.H.BU  XVMULWEV.H.BU
+(define_insn "lasx_xv<optab>wev_h_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addsubmul:V16HI
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.h.b<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+;;XVADDWEV.W.H   XVSUBWEV.W.H   XVMULWEV.W.H
+;;XVADDWEV.W.HU  XVSUBWEV.W.HU  XVMULWEV.W.HU
+(define_insn "lasx_xv<optab>wev_w_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addsubmul:V8SI
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.w.h<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+;;XVADDWEV.D.W   XVSUBWEV.D.W   XVMULWEV.D.W
+;;XVADDWEV.D.WU  XVSUBWEV.D.WU  XVMULWEV.D.WU
+(define_insn "lasx_xv<optab>wev_d_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addsubmul:V4DI
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.d.w<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWEV.Q.D
+;;TODO2
+(define_insn "lasx_xvaddwev_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWEV))]
+  "ISA_HAS_LASX"
+  "xvaddwev.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUBWEV.Q.D
+;;TODO2
+(define_insn "lasx_xvsubwev_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUBWEV))]
+  "ISA_HAS_LASX"
+  "xvsubwev.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWEV.Q.D
+;;TODO2
+(define_insn "lasx_xvmulwev_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWEV))]
+  "ISA_HAS_LASX"
+  "xvmulwev.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+
+;;XVADDWOD.H.B   XVSUBWOD.H.B   XVMULWOD.H.B
+;;XVADDWOD.H.BU  XVSUBWOD.H.BU  XVMULWOD.H.BU
+(define_insn "lasx_xv<optab>wod_h_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addsubmul:V16HI
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.h.b<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+;;XVADDWOD.W.H   XVSUBWOD.W.H   XVMULWOD.W.H
+;;XVADDWOD.W.HU  XVSUBWOD.W.HU  XVMULWOD.W.HU
+(define_insn "lasx_xv<optab>wod_w_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addsubmul:V8SI
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.w.h<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+
+;;XVADDWOD.D.W   XVSUBWOD.D.W   XVMULWOD.D.W
+;;XVADDWOD.D.WU  XVSUBWOD.D.WU  XVMULWOD.D.WU
+(define_insn "lasx_xv<optab>wod_d_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addsubmul:V4DI
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.d.w<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWOD.Q.D
+;;TODO2
+(define_insn "lasx_xvaddwod_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWOD))]
+  "ISA_HAS_LASX"
+  "xvaddwod.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUBWOD.Q.D
+;;TODO2
+(define_insn "lasx_xvsubwod_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUBWOD))]
+  "ISA_HAS_LASX"
+  "xvsubwod.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWOD.Q.D
+;;TODO2
+(define_insn "lasx_xvmulwod_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWOD))]
+  "ISA_HAS_LASX"
+  "xvmulwod.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWEV.Q.DU
+;;TODO2
+(define_insn "lasx_xvaddwev_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWEV2))]
+  "ISA_HAS_LASX"
+  "xvaddwev.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUBWEV.Q.DU
+;;TODO2
+(define_insn "lasx_xvsubwev_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUBWEV2))]
+  "ISA_HAS_LASX"
+  "xvsubwev.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWEV.Q.DU
+;;TODO2
+(define_insn "lasx_xvmulwev_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWEV2))]
+  "ISA_HAS_LASX"
+  "xvmulwev.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWOD.Q.DU
+;;TODO2
+(define_insn "lasx_xvaddwod_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWOD2))]
+  "ISA_HAS_LASX"
+  "xvaddwod.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUBWOD.Q.DU
+;;TODO2
+(define_insn "lasx_xvsubwod_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUBWOD2))]
+  "ISA_HAS_LASX"
+  "xvsubwod.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWOD.Q.DU
+;;TODO2
+(define_insn "lasx_xvmulwod_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWOD2))]
+  "ISA_HAS_LASX"
+  "xvmulwod.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWEV.H.BU.B   XVMULWEV.H.BU.B
+(define_insn "lasx_xv<optab>wev_h_bu_b"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addmul:V16HI
+	  (zero_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))
+	  (sign_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.h.bu.b\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+;;XVADDWEV.W.HU.H   XVMULWEV.W.HU.H
+(define_insn "lasx_xv<optab>wev_w_hu_h"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addmul:V8SI
+	  (zero_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))
+	  (sign_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.w.hu.h\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+;;XVADDWEV.D.WU.W   XVMULWEV.D.WU.W
+(define_insn "lasx_xv<optab>wev_d_wu_w"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addmul:V4DI
+	  (zero_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))
+	  (sign_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.d.wu.w\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWOD.H.BU.B   XVMULWOD.H.BU.B
+(define_insn "lasx_xv<optab>wod_h_bu_b"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addmul:V16HI
+	  (zero_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))
+	  (sign_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.h.bu.b\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+;;XVADDWOD.W.HU.H   XVMULWOD.W.HU.H
+(define_insn "lasx_xv<optab>wod_w_hu_h"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addmul:V8SI
+	  (zero_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (sign_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.w.hu.h\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+;;XVADDWOD.D.WU.W   XVMULWOD.D.WU.W
+(define_insn "lasx_xv<optab>wod_d_wu_w"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addmul:V4DI
+	  (zero_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (sign_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.d.wu.w\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.H.B   XVMADDWEV.H.BU
+(define_insn "lasx_xvmaddwev_h_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(plus:V16HI
+	  (match_operand:V16HI 1 "register_operand" "0")
+	  (mult:V16HI
+	    (any_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)
+			   (const_int 16) (const_int 18)
+			   (const_int 20) (const_int 22)
+			   (const_int 24) (const_int 26)
+			   (const_int 28) (const_int 30)])))
+	    (any_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)
+			   (const_int 16) (const_int 18)
+			   (const_int 20) (const_int 22)
+			   (const_int 24) (const_int 26)
+			   (const_int 28) (const_int 30)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.h.b<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V16HI")])
+
+;;XVMADDWEV.W.H   XVMADDWEV.W.HU
+(define_insn "lasx_xvmaddwev_w_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(plus:V8SI
+	  (match_operand:V8SI 1 "register_operand" "0")
+	  (mult:V8SI
+	    (any_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)])))
+	    (any_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.w.h<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8SI")])
+
+;;XVMADDWEV.D.W   XVMADDWEV.D.WU
+(define_insn "lasx_xvmaddwev_d_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(plus:V4DI
+	  (match_operand:V4DI 1 "register_operand" "0")
+	  (mult:V4DI
+	    (any_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)])))
+	    (any_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.d.w<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.Q.D
+;;TODO2
+(define_insn "lasx_xvmaddwev_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWEV))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.q.d\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.H.B   XVMADDWOD.H.BU
+(define_insn "lasx_xvmaddwod_h_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(plus:V16HI
+	  (match_operand:V16HI 1 "register_operand" "0")
+	  (mult:V16HI
+	    (any_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)
+			   (const_int 17) (const_int 19)
+			   (const_int 21) (const_int 23)
+			   (const_int 25) (const_int 27)
+			   (const_int 29) (const_int 31)])))
+	    (any_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)
+			   (const_int 17) (const_int 19)
+			   (const_int 21) (const_int 23)
+			   (const_int 25) (const_int 27)
+			   (const_int 29) (const_int 31)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.h.b<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V16HI")])
+
+;;XVMADDWOD.W.H   XVMADDWOD.W.HU
+(define_insn "lasx_xvmaddwod_w_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(plus:V8SI
+	  (match_operand:V8SI 1 "register_operand" "0")
+	  (mult:V8SI
+	    (any_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)])))
+	    (any_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.w.h<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8SI")])
+
+;;XVMADDWOD.D.W   XVMADDWOD.D.WU
+(define_insn "lasx_xvmaddwod_d_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(plus:V4DI
+	  (match_operand:V4DI 1 "register_operand" "0")
+	  (mult:V4DI
+	    (any_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)])))
+	    (any_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.d.w<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.Q.D
+;;TODO2
+(define_insn "lasx_xvmaddwod_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWOD))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.q.d\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.Q.DU
+;;TODO2
+(define_insn "lasx_xvmaddwev_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWEV2))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.q.du\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.Q.DU
+;;TODO2
+(define_insn "lasx_xvmaddwod_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWOD2))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.q.du\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.H.BU.B
+(define_insn "lasx_xvmaddwev_h_bu_b"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(plus:V16HI
+	  (match_operand:V16HI 1 "register_operand" "0")
+	  (mult:V16HI
+	    (zero_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)
+			   (const_int 16) (const_int 18)
+			   (const_int 20) (const_int 22)
+			   (const_int 24) (const_int 26)
+			   (const_int 28) (const_int 30)])))
+	    (sign_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)
+			   (const_int 16) (const_int 18)
+			   (const_int 20) (const_int 22)
+			   (const_int 24) (const_int 26)
+			   (const_int 28) (const_int 30)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.h.bu.b\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V16HI")])
+
+;;XVMADDWEV.W.HU.H
+(define_insn "lasx_xvmaddwev_w_hu_h"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(plus:V8SI
+	  (match_operand:V8SI 1 "register_operand" "0")
+	  (mult:V8SI
+	    (zero_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)])))
+	    (sign_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.w.hu.h\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8SI")])
+
+;;XVMADDWEV.D.WU.W
+(define_insn "lasx_xvmaddwev_d_wu_w"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(plus:V4DI
+	  (match_operand:V4DI 1 "register_operand" "0")
+	  (mult:V4DI
+	    (zero_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)])))
+	    (sign_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.d.wu.w\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.Q.DU.D
+;;TODO2
+(define_insn "lasx_xvmaddwev_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWEV3))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.q.du.d\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.H.BU.B
+(define_insn "lasx_xvmaddwod_h_bu_b"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(plus:V16HI
+	  (match_operand:V16HI 1 "register_operand" "0")
+	  (mult:V16HI
+	    (zero_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)
+			   (const_int 17) (const_int 19)
+			   (const_int 21) (const_int 23)
+			   (const_int 25) (const_int 27)
+			   (const_int 29) (const_int 31)])))
+	    (sign_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)
+			   (const_int 17) (const_int 19)
+			   (const_int 21) (const_int 23)
+			   (const_int 25) (const_int 27)
+			   (const_int 29) (const_int 31)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.h.bu.b\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V16HI")])
+
+;;XVMADDWOD.W.HU.H
+(define_insn "lasx_xvmaddwod_w_hu_h"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(plus:V8SI
+	  (match_operand:V8SI 1 "register_operand" "0")
+	  (mult:V8SI
+	    (zero_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)])))
+	    (sign_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.w.hu.h\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8SI")])
+
+;;XVMADDWOD.D.WU.W
+(define_insn "lasx_xvmaddwod_d_wu_w"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(plus:V4DI
+	  (match_operand:V4DI 1 "register_operand" "0")
+	  (mult:V4DI
+	    (zero_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)])))
+	    (sign_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.d.wu.w\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.Q.DU.D
+;;TODO2
+(define_insn "lasx_xvmaddwod_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWOD3))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.q.du.d\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
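For illustration (not part of the patch itself), the even/odd widening multiply-accumulate patterns above describe, per element, acc += (widened a) * (widened b).  A C loop of that shape, with illustrative names, is:

  /* Each 16-bit product is computed in 32-bit precision and added to a
     32-bit accumulator; with -mlasx the vectorizer may cover the even
     and odd lanes of each 256-bit vector with xvmaddwev.w.h and
     xvmaddwod.w.h (or their unsigned forms).  */
  void
  macc (int *restrict acc, const short *restrict a,
        const short *restrict b, int n)
  {
    for (int i = 0; i < n; i++)
      acc[i] += a[i] * b[i];
  }

Whether these exact instructions are selected depends on how the rest of the backend combines the widening multiply with the addition.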
+;;XVHADDW.Q.D
+;;TODO2
+(define_insn "lasx_xvhaddw_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVHADDW_Q_D))]
+  "ISA_HAS_LASX"
+  "xvhaddw.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVHSUBW.Q.D
+;;TODO2
+(define_insn "lasx_xvhsubw_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVHSUBW_Q_D))]
+  "ISA_HAS_LASX"
+  "xvhsubw.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVHADDW.QU.DU
+;;TODO2
+(define_insn "lasx_xvhaddw_qu_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVHADDW_QU_DU))]
+  "ISA_HAS_LASX"
+  "xvhaddw.qu.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVHSUBW.QU.DU
+;;TODO2
+(define_insn "lasx_xvhsubw_qu_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVHSUBW_QU_DU))]
+  "ISA_HAS_LASX"
+  "xvhsubw.qu.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVROTR.B   XVROTR.H   XVROTR.W   XVROTR.D
+;;TODO-478
+(define_insn "lasx_xvrotr_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVROTR))]
+  "ISA_HAS_LASX"
+  "xvrotr.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+;;XVADD.Q
+;;TODO2
+(define_insn "lasx_xvadd_q"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADD_Q))]
+  "ISA_HAS_LASX"
+  "xvadd.q\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUB.Q
+;;TODO2
+(define_insn "lasx_xvsub_q"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUB_Q))]
+  "ISA_HAS_LASX"
+  "xvsub.q\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSSRLN.B.H   XVSSRLN.H.W   XVSSRLN.W.D
+(define_insn "lasx_xvssrln_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRLN))]
+  "ISA_HAS_LASX"
+  "xvssrln.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+;;XVREPLVE.B   XVREPLVE.H   XVREPLVE.W   XVREPLVE.D
+(define_insn "lasx_xvreplve_<lasxfmt_f>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(unspec:LASX [(match_operand:LASX 1 "register_operand" "f")
+		      (match_operand:SI 2 "register_operand" "r")]
+		     UNSPEC_LASX_XVREPLVE))]
+  "ISA_HAS_LASX"
+  "xvreplve.<lasxfmt>\t%u0,%u1,%z2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+;;XVADDWEV.Q.DU.D
+(define_insn "lasx_xvaddwev_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWEV3))]
+  "ISA_HAS_LASX"
+  "xvaddwev.q.du.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWOD.Q.DU.D
+(define_insn "lasx_xvaddwod_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWOD3))]
+  "ISA_HAS_LASX"
+  "xvaddwod.q.du.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWEV.Q.DU.D
+(define_insn "lasx_xvmulwev_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWEV3))]
+  "ISA_HAS_LASX"
+  "xvmulwev.q.du.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWOD.Q.DU.D
+(define_insn "lasx_xvmulwod_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWOD3))]
+  "ISA_HAS_LASX"
+  "xvmulwod.q.du.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvpickve2gr_w<u>"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(any_extend:SI
+	  (vec_select:SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_0_to_7_operand" "")]))))]
+  "ISA_HAS_LASX"
+  "xvpickve2gr.w<u>\t%0,%u1,%2"
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "V8SI")])
+
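As a usage sketch (illustrative code, not part of the patch), reading one 32-bit lane of a 256-bit vector into a general register is the operation lasx_xvpickve2gr_w<u> describes; a sign-extended read may map to xvpickve2gr.w and a zero-extended one to xvpickve2gr.wu:

  typedef int v8si __attribute__ ((vector_size (32)));

  long long
  lane3_signed (v8si v)
  {
    return v[3];                     /* sign-extend lane 3  */
  }

  unsigned long long
  lane3_unsigned (v8si v)
  {
    return (unsigned int) v[3];      /* zero-extend lane 3  */
  }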
+
+(define_insn "lasx_xvmskgez_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:V32QI 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVMSKGEZ))]
+  "ISA_HAS_LASX"
+  "xvmskgez.b\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvmsknz_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:V32QI 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVMSKNZ))]
+  "ISA_HAS_LASX"
+  "xvmsknz.b\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvexth_h<u>_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(any_extend:V16HI
+	  (vec_select:V16QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	      (parallel [(const_int 16) (const_int 17)
+			 (const_int 18) (const_int 19)
+			 (const_int 20) (const_int 21)
+			 (const_int 22) (const_int 23)
+			 (const_int 24) (const_int 25)
+			 (const_int 26) (const_int 27)
+			 (const_int 28) (const_int 29)
+			 (const_int 30) (const_int 31)]))))]
+  "ISA_HAS_LASX"
+  "xvexth.h<u>.b<u>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvexth_w<u>_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(any_extend:V8SI
+	  (vec_select:V8HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (parallel [(const_int 8) (const_int 9)
+		       (const_int 10) (const_int 11)
+		       (const_int 12) (const_int 13)
+		       (const_int 14) (const_int 15)]))))]
+  "ISA_HAS_LASX"
+  "xvexth.w<u>.h<u>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvexth_d<u>_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(any_extend:V4DI
+	  (vec_select:V4SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (parallel [(const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "xvexth.d<u>.w<u>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvexth_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVEXTH_Q_D))]
+  "ISA_HAS_LASX"
+  "xvexth.q.d\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvexth_qu_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVEXTH_QU_DU))]
+  "ISA_HAS_LASX"
+  "xvexth.qu.du\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvrotri_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(rotatert:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")))]
+  "ISA_HAS_LASX"
+  "xvrotri.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvextl_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVEXTL_Q_D))]
+  "ISA_HAS_LASX"
+  "xvextl.q.d\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvsrlni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSRLNI))]
+  "ISA_HAS_LASX"
+  "xvsrlni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrlrni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSRLRNI))]
+  "ISA_HAS_LASX"
+  "xvsrlrni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRLNI))]
+  "ISA_HAS_LASX"
+  "xvssrlni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlni_<lasxfmt_u>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRLNI2))]
+  "ISA_HAS_LASX"
+  "xvssrlni.<lasxfmt_u>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlrni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRLRNI))]
+  "ISA_HAS_LASX"
+  "xvssrlrni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlrni_<lasxfmt_u>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRLRNI2))]
+  "ISA_HAS_LASX"
+  "xvssrlrni.<lasxfmt_u>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrani_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSRANI))]
+  "ISA_HAS_LASX"
+  "xvsrani.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrarni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSRARNI))]
+  "ISA_HAS_LASX"
+  "xvsrarni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrani_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRANI))]
+  "ISA_HAS_LASX"
+  "xvssrani.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrani_<lasxfmt_u>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRANI2))]
+  "ISA_HAS_LASX"
+  "xvssrani.<lasxfmt_u>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrarni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRARNI))]
+  "ISA_HAS_LASX"
+  "xvssrarni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrarni_<lasxfmt_u>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRARNI2))]
+  "ISA_HAS_LASX"
+  "xvssrarni.<lasxfmt_u>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_mode_attr VDOUBLEMODEW256
+  [(V8SI "V16SI")
+   (V8SF "V16SF")])
+
+(define_insn "lasx_xvpermi_<lasxfmt_f_wd>"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+    (unspec:LASX_W [(match_operand:LASX_W 1 "register_operand" "0")
+               (match_operand:LASX_W 2 "register_operand" "f")
+                   (match_operand 3 "const_uimm8_operand" "")]
+             UNSPEC_LASX_XVPERMI))]
+  "ISA_HAS_LASX"
+  "xvpermi.w\t%u0,%u2,%3"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpermi_<lasxfmt_f_wd>_1"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+     (vec_select:LASX_W
+       (vec_concat:<VDOUBLEMODEW256>
+         (match_operand:LASX_W 1 "register_operand" "f")
+         (match_operand:LASX_W 2 "register_operand" "0"))
+       (parallel [(match_operand 3  "const_0_to_3_operand")
+              (match_operand 4  "const_0_to_3_operand"  )
+              (match_operand 5  "const_8_to_11_operand" )
+              (match_operand 6  "const_8_to_11_operand" )
+              (match_operand 7  "const_4_to_7_operand"  )
+              (match_operand 8  "const_4_to_7_operand"  )
+              (match_operand 9  "const_12_to_15_operand")
+              (match_operand 10 "const_12_to_15_operand")])))]
+  "ISA_HAS_LASX
+  && INTVAL (operands[3]) + 4 == INTVAL (operands[7])
+  && INTVAL (operands[4]) + 4 == INTVAL (operands[8])
+  && INTVAL (operands[5]) + 4 == INTVAL (operands[9])
+  && INTVAL (operands[6]) + 4 == INTVAL (operands[10])"
+{
+  int mask = 0;
+  mask |= INTVAL (operands[3]) << 0;
+  mask |= INTVAL (operands[4]) << 2;
+  mask |= (INTVAL (operands[5]) - 8) << 4;
+  mask |= (INTVAL (operands[6]) - 8) << 6;
+  operands[3] = GEN_INT (mask);
+
+  return "xvpermi.w\t%u0,%u1,%3";
+}
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
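A worked example of the mask computation above, with selector values chosen purely for illustration: if operands 3..6 are 1, 0, 9, 8 (so the insn condition forces operands 7..10 to be 5, 4, 13, 12), then

  mask = 1 | (0 << 2) | ((9 - 8) << 4) | ((8 - 8) << 6) = 0x11

and the insn is emitted as xvpermi.w with immediate 17.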
+(define_expand "lasx_xvld"
+  [(match_operand:V32QI 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12b_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (operands[0], gen_rtx_MEM (V32QImode, addr));
+  DONE;
+})
+
+(define_expand "lasx_xvst"
+  [(match_operand:V32QI 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12b_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (gen_rtx_MEM (V32QImode, addr), operands[0]);
+  DONE;
+})
+
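A minimal usage sketch for the two expanders above, assuming the __lasx_xvld/__lasx_xvst intrinsic names and the __m256i type that lasxintrin.h is expected to provide (the function name is illustrative):

  #include <lasxintrin.h>

  /* Copy 32 bytes with one 256-bit load and one 256-bit store; both
     intrinsics take a small signed constant byte offset, 0 here.  */
  void
  copy32 (char *dst, char *src)
  {
    __m256i v = __lasx_xvld (src, 0);
    __lasx_xvst (v, dst, 0);
  }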
+(define_expand "lasx_xvstelm_<lasxfmt_f>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand 3 "const_<indeximm256>_operand")
+   (match_operand 2 "aq8<lasxfmt>_operand")
+   (match_operand 1 "pmode_register_operand")]
+  "ISA_HAS_LASX"
+{
+  emit_insn (gen_lasx_xvstelm_<lasxfmt_f>_insn
+	     (operands[1], operands[2], operands[0], operands[3]));
+  DONE;
+})
+
+(define_insn "lasx_xvstelm_<lasxfmt_f>_insn"
+  [(set (mem:<UNITMODE> (plus:DI (match_operand:DI 0 "register_operand" "r")
+				 (match_operand 1 "aq8<lasxfmt>_operand")))
+	(vec_select:<UNITMODE>
+	  (match_operand:LASX 2 "register_operand" "f")
+	  (parallel [(match_operand 3 "const_<indeximm256>_operand" "")])))]
+  "ISA_HAS_LASX"
+{
+  return "xvstelm.<lasxfmt>\t%u2,%0,%1,%3";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;; Offset is "0"
+(define_insn "lasx_xvstelm_<lasxfmt_f>_insn_0"
+  [(set (mem:<UNITMODE> (match_operand:DI 0 "register_operand" "r"))
+    (vec_select:<UNITMODE>
+      (match_operand:LASX_WD 1 "register_operand" "f")
+      (parallel [(match_operand:SI 2 "const_<indeximm256>_operand")])))]
+  "ISA_HAS_LASX"
+{
+    return "xvstelm.<lasxfmt>\t%u1,%0,0,%2";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+(define_insn "lasx_xvinsve0_<lasxfmt_f>"
+  [(set (match_operand:LASX_WD 0 "register_operand" "=f")
+	(unspec:LASX_WD [(match_operand:LASX_WD 1 "register_operand" "0")
+			 (match_operand:LASX_WD 2 "register_operand" "f")
+			 (match_operand 3 "const_<indeximm256>_operand" "")]
+			UNSPEC_LASX_XVINSVE0))]
+  "ISA_HAS_LASX"
+  "xvinsve0.<lasxfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvinsve0_<lasxfmt_f>_scalar"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+    (vec_merge:FLASX
+      (vec_duplicate:FLASX
+        (match_operand:<UNITMODE> 1 "register_operand" "f"))
+      (match_operand:FLASX 2 "register_operand" "0")
+      (match_operand 3 "const_<bitmask256>_operand" "")))]
+  "ISA_HAS_LASX"
+  "xvinsve0.<lasxfmt>\t%u0,%u1,%y3"
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpickve_<lasxfmt_f>"
+  [(set (match_operand:LASX_WD 0 "register_operand" "=f")
+	(unspec:LASX_WD [(match_operand:LASX_WD 1 "register_operand" "f")
+			 (match_operand 2 "const_<indeximm256>_operand" "")]
+			UNSPEC_LASX_XVPICKVE))]
+  "ISA_HAS_LASX"
+  "xvpickve.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpickve_<lasxfmt_f>_scalar"
+  [(set (match_operand:<UNITMODE> 0 "register_operand" "=f")
+	(vec_select:<UNITMODE>
+	 (match_operand:FLASX 1 "register_operand" "f")
+	 (parallel [(match_operand 2 "const_<indeximm256>_operand" "")])))]
+  "ISA_HAS_LASX"
+  "xvpickve.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlrn_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRLRN))]
+  "ISA_HAS_LASX"
+  "xvssrlrn.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "xvorn<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(ior:ILASX (not:ILASX (match_operand:ILASX 2 "register_operand" "f"))
+		   (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvorn.v\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvextl_qu_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVEXTL_QU_DU))]
+  "ISA_HAS_LASX"
+  "xvextl.qu.du\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvldi"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand 1 "const_imm13_operand")]
+		    UNSPEC_LASX_XVLDI))]
+  "ISA_HAS_LASX"
+{
+  HOST_WIDE_INT val = INTVAL (operands[1]);
+  if (val < 0)
+    {
+      HOST_WIDE_INT modeVal = (val & 0xf00) >> 8;
+      if (modeVal < 13)
+	return  "xvldi\t%u0,%1";
+      else
+	{
+	  sorry ("imm13 only supports 0000 ~ 1100 in bits '12 ~ 9' when bit '13' is 1");
+	  return "#";
+	}
+    }
+  else
+    return "xvldi\t%u0,%1";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "V4DI")])
+
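Two illustrative immediates for the check above: operands[1] = -1 has all thirteen bits set, so (val & 0xf00) >> 8 is 15 and the insn is rejected with sorry (); operands[1] = -4096 sets only the top bit of the 13-bit field, (val & 0xf00) >> 8 is 0, and a plain xvldi is emitted.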
+(define_insn "lasx_xvldx"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:DI 1 "register_operand" "r")
+		       (match_operand:DI 2 "reg_or_0_operand" "rJ")]
+		      UNSPEC_LASX_XVLDX))]
+  "ISA_HAS_LASX"
+{
+  return "xvldx\t%u0,%1,%z2";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvstx"
+  [(set (mem:V32QI (plus:DI (match_operand:DI 1 "register_operand" "r")
+			    (match_operand:DI 2 "reg_or_0_operand" "rJ")))
+	(unspec:V32QI [(match_operand:V32QI 0 "register_operand" "f")]
+		      UNSPEC_LASX_XVSTX))]
+  "ISA_HAS_LASX"
+{
+  return "xvstx\t%u0,%1,%z2";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "DI")])
+
+(define_insn "vec_widen_<su>mult_even_v8si"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+    (mult:V4DI
+      (any_extend:V4DI
+        (vec_select:V4SI
+          (match_operand:V8SI 1 "register_operand" "%f")
+          (parallel [(const_int 0) (const_int 2)
+                         (const_int 4) (const_int 6)])))
+      (any_extend:V4DI
+        (vec_select:V4SI
+          (match_operand:V8SI 2 "register_operand" "f")
+          (parallel [(const_int 0) (const_int 2)
+             (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LASX"
+  "xvmulwev.d.w<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
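For reference (illustrative code, not part of the patch), the standard vec_widen_<su>mult_even name covers the even half of a per-element widening multiply such as:

  /* 32-bit inputs, 64-bit products.  The vectorizer combines this even
     pattern with an odd counterpart when one is available.  */
  void
  widen_mul (long long *restrict d, const int *restrict a,
             const int *restrict b, int n)
  {
    for (int i = 0; i < n; i++)
      d[i] = (long long) a[i] * b[i];
  }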
+;; Vector reduction operation
+(define_expand "reduc_plus_scal_v4di"
+  [(match_operand:DI 0 "register_operand")
+   (match_operand:V4DI 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (V4DImode);
+  rtx tmp1 = gen_reg_rtx (V4DImode);
+  rtx vec_res = gen_reg_rtx (V4DImode);
+  emit_insn (gen_lasx_xvhaddw_q_d (tmp, operands[1], operands[1]));
+  emit_insn (gen_lasx_xvpermi_d_v4di (tmp1, tmp, GEN_INT (2)));
+  emit_insn (gen_addv4di3 (vec_res, tmp, tmp1));
+  emit_insn (gen_vec_extractv4didi (operands[0], vec_res, const0_rtx));
+  DONE;
+})
+
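A scalar sketch of what reduc_plus_scal_v4di computes (function name illustrative): the expander forms the partial sums of each 128-bit half with xvhaddw.q.d, moves the upper half's partial sum down to lane 0 with xvpermi.d 2, adds the two vectors and extracts lane 0, which is equivalent modulo 2^64 to:

  long long
  sum4 (const long long v[4])
  {
    return v[0] + v[1] + v[2] + v[3];
  }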
+(define_expand "reduc_plus_scal_v8si"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:V8SI 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (V4DImode);
+  rtx tmp1 = gen_reg_rtx (V4DImode);
+  rtx vec_res = gen_reg_rtx (V4DImode);
+  emit_insn (gen_lasx_xvhaddw_d_w (tmp, operands[1], operands[1]));
+  emit_insn (gen_lasx_xvhaddw_q_d (tmp1, tmp, tmp));
+  emit_insn (gen_lasx_xvpermi_d_v4di (tmp, tmp1, GEN_INT (2)));
+  emit_insn (gen_addv4di3 (vec_res, tmp, tmp1));
+  emit_insn (gen_vec_extractv8sisi (operands[0], gen_lowpart (V8SImode,vec_res),
+				    const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_plus_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:FLASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_add<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_<optab>_scal_<mode>"
+  [(any_bitwise:<UNITMODE>
+     (match_operand:<UNITMODE> 0 "register_operand")
+     (match_operand:ILASX 1 "register_operand"))]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_<optab><mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smax_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_smax<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smin_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_smin<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umax_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_umax<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umin_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_umin<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
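The remaining reduc_*_scal_<mode> expanders all follow the same shape: reduce in vector registers via loongarch_expand_vector_reduc, then extract lane 0.  A scalar loop of the kind they serve, with illustrative names (whether the vectorizer actually uses them depends on the usual cost checks):

  int
  max_reduce (const int *a, int n)
  {
    int m = a[0];
    for (int i = 1; i < n; i++)
      if (a[i] > m)
        m = a[i];
    return m;
  }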
diff --git a/gcc/config/loongarch/loongarch-modes.def b/gcc/config/loongarch/loongarch-modes.def
index 6f57b60525d..68a829316f4 100644
--- a/gcc/config/loongarch/loongarch-modes.def
+++ b/gcc/config/loongarch/loongarch-modes.def
@@ -33,6 +33,7 @@ VECTOR_MODES (FLOAT, 8);      /*       V4HF V2SF */
 VECTOR_MODES (INT, 16);	      /* V16QI V8HI V4SI V2DI */
 VECTOR_MODES (FLOAT, 16);     /*	    V4SF V2DF */
 
+/* For LoongArch LASX 256-bit vectors.  */
 VECTOR_MODES (INT, 32);	      /* V32QI V16HI V8SI V4DI */
 VECTOR_MODES (FLOAT, 32);     /*	     V8SF V4DF */
 
diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
index fc33527cdcf..f4430d0d418 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -89,6 +89,8 @@ extern bool loongarch_split_move_insn_p (rtx, rtx);
 extern void loongarch_split_move_insn (rtx, rtx, rtx);
 extern void loongarch_split_128bit_move (rtx, rtx);
 extern bool loongarch_split_128bit_move_p (rtx, rtx);
+extern void loongarch_split_256bit_move (rtx, rtx);
+extern bool loongarch_split_256bit_move_p (rtx, rtx);
 extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx (*)(rtx, rtx, rtx));
 extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
 extern void loongarch_split_lsx_fill_d (rtx, rtx);
@@ -174,9 +176,11 @@ union loongarch_gen_fn_ptrs
 extern void loongarch_expand_atomic_qihi (union loongarch_gen_fn_ptrs,
 					  rtx, rtx, rtx, rtx, rtx);
 
+extern void loongarch_expand_vector_group_init (rtx, rtx);
 extern void loongarch_expand_vector_init (rtx, rtx);
 extern void loongarch_expand_vec_unpack (rtx op[2], bool, bool);
 extern void loongarch_expand_vec_perm (rtx, rtx, rtx, rtx);
+extern void loongarch_expand_vec_perm_1 (rtx[]);
 extern void loongarch_expand_vector_extract (rtx, rtx, int);
 extern void loongarch_expand_vector_reduc (rtx (*)(rtx, rtx, rtx), rtx, rtx);
 
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index dd42816ac27..28e02916ccb 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1928,7 +1928,7 @@ loongarch_symbol_insns (enum loongarch_symbol_type type, machine_mode mode)
 {
   /* LSX LD.* and ST.* cannot support loading symbols via an immediate
      operand.  */
-  if (LSX_SUPPORTED_MODE_P (mode))
+  if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode))
     return 0;
 
   switch (type)
@@ -2061,6 +2061,11 @@ loongarch_valid_offset_p (rtx x, machine_mode mode)
 					loongarch_ldst_scaled_shift (mode)))
     return false;
 
+  /* LASX XVLD.B and XVST.B support 10-bit signed offsets without shift.  */
+  if (LASX_SUPPORTED_MODE_P (mode)
+      && !loongarch_signed_immediate_p (INTVAL (x), 10, 0))
+    return false;
+
   return true;
 }
 
@@ -2274,7 +2279,9 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p)
 {
   struct loongarch_address_info addr;
   int factor;
-  bool lsx_p = !might_split_p && LSX_SUPPORTED_MODE_P (mode);
+  bool lsx_p = (!might_split_p
+		&& (LSX_SUPPORTED_MODE_P (mode)
+		    || LASX_SUPPORTED_MODE_P (mode)));
 
   if (!loongarch_classify_address (&addr, x, mode, false))
     return 0;
@@ -2420,7 +2427,8 @@ loongarch_const_insns (rtx x)
       return loongarch_integer_cost (INTVAL (x));
 
     case CONST_VECTOR:
-      if (LSX_SUPPORTED_MODE_P (GET_MODE (x))
+      if ((LSX_SUPPORTED_MODE_P (GET_MODE (x))
+	   || LASX_SUPPORTED_MODE_P (GET_MODE (x)))
 	  && loongarch_const_vector_same_int_p (x, GET_MODE (x), -512, 511))
 	return 1;
       /* Fall through.  */
@@ -3259,10 +3267,11 @@ loongarch_legitimize_move (machine_mode mode, rtx dest, rtx src)
 
   /* Both src and dest are non-registers;  one special case is supported where
      the source is (const_int 0) and the store can source the zero register.
-     LSX is never able to source the zero register directly in
+     LSX and LASX are never able to source the zero register directly in
      memory operations.  */
   if (!register_operand (dest, mode) && !register_operand (src, mode)
-      && (!const_0_operand (src, mode) || LSX_SUPPORTED_MODE_P (mode)))
+      && (!const_0_operand (src, mode)
+	  || LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode)))
     {
       loongarch_emit_move (dest, force_reg (mode, src));
       return true;
@@ -3844,6 +3853,7 @@ loongarch_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 				      int misalign ATTRIBUTE_UNUSED)
 {
   unsigned elements;
+  machine_mode mode = vectype != NULL ? TYPE_MODE (vectype) : DImode;
 
   switch (type_of_cost)
     {
@@ -3860,7 +3870,8 @@ loongarch_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 	return 1;
 
       case vec_perm:
-	return 1;
+	return LASX_SUPPORTED_MODE_P (mode)
+	  && !LSX_SUPPORTED_MODE_P (mode) ? 2 : 1;
 
       case unaligned_load:
       case vector_gather_load:
@@ -3941,6 +3952,10 @@ loongarch_split_move_p (rtx dest, rtx src)
   if (LSX_SUPPORTED_MODE_P (GET_MODE (dest)))
     return loongarch_split_128bit_move_p (dest, src);
 
+  /* Check if LASX moves need splitting.  */
+  if (LASX_SUPPORTED_MODE_P (GET_MODE (dest)))
+    return loongarch_split_256bit_move_p (dest, src);
+
   /* Otherwise split all multiword moves.  */
   return size > UNITS_PER_WORD;
 }
@@ -3956,6 +3971,8 @@ loongarch_split_move (rtx dest, rtx src, rtx insn_)
   gcc_checking_assert (loongarch_split_move_p (dest, src));
   if (LSX_SUPPORTED_MODE_P (GET_MODE (dest)))
     loongarch_split_128bit_move (dest, src);
+  else if (LASX_SUPPORTED_MODE_P (GET_MODE (dest)))
+    loongarch_split_256bit_move (dest, src);
   else if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src))
     {
       if (!TARGET_64BIT && GET_MODE (dest) == DImode)
@@ -4121,7 +4138,7 @@ const char *
 loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
 {
   int index = exact_log2 (GET_MODE_SIZE (mode));
-  if (!IN_RANGE (index, 2, 4))
+  if (!IN_RANGE (index, 2, 5))
     return NULL;
 
   struct loongarch_address_info info;
@@ -4130,17 +4147,19 @@ loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
       || !loongarch_legitimate_address_p (mode, x, false))
     return NULL;
 
-  const char *const insn[][3] =
+  const char *const insn[][4] =
     {
 	{
 	  "fstx.s\t%1,%0",
 	  "fstx.d\t%1,%0",
-	  "vstx\t%w1,%0"
+	  "vstx\t%w1,%0",
+	  "xvstx\t%u1,%0"
 	},
 	{
 	  "fldx.s\t%0,%1",
 	  "fldx.d\t%0,%1",
-	  "vldx\t%w0,%1"
+	  "vldx\t%w0,%1",
+	  "xvldx\t%u0,%1"
 	}
     };
 
@@ -4174,6 +4193,34 @@ loongarch_split_128bit_move_p (rtx dest, rtx src)
   return true;
 }
 
+/* Return true if a 256-bit move from SRC to DEST should be split.  */
+
+bool
+loongarch_split_256bit_move_p (rtx dest, rtx src)
+{
+  /* LASX-to-LASX moves can be done in a single instruction.  */
+  if (FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
+    return false;
+
+  /* Check for LASX loads and stores.  */
+  if (FP_REG_RTX_P (dest) && MEM_P (src))
+    return false;
+  if (FP_REG_RTX_P (src) && MEM_P (dest))
+    return false;
+
+  /* Check for LASX set to an immediate const vector with valid replicated
+     element.  */
+  if (FP_REG_RTX_P (dest)
+      && loongarch_const_vector_same_int_p (src, GET_MODE (src), -512, 511))
+    return false;
+
+  /* Check for LASX load zero immediate.  */
+  if (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))
+    return false;
+
+  return true;
+}
+
 /* Split a 128-bit move from SRC to DEST.  */
 
 void
@@ -4265,6 +4312,97 @@ loongarch_split_128bit_move (rtx dest, rtx src)
     }
 }
 
+/* Split a 256-bit move from SRC to DEST.  */
+
+void
+loongarch_split_256bit_move (rtx dest, rtx src)
+{
+  int byte, index;
+  rtx low_dest, low_src, d, s;
+
+  if (FP_REG_RTX_P (dest))
+    {
+      gcc_assert (!MEM_P (src));
+
+      rtx new_dest = dest;
+      if (!TARGET_64BIT)
+	{
+	  if (GET_MODE (dest) != V8SImode)
+	    new_dest = simplify_gen_subreg (V8SImode, dest, GET_MODE (dest), 0);
+	}
+      else
+	{
+	  if (GET_MODE (dest) != V4DImode)
+	    new_dest = simplify_gen_subreg (V4DImode, dest, GET_MODE (dest), 0);
+	}
+
+      for (byte = 0, index = 0; byte < GET_MODE_SIZE (GET_MODE (dest));
+	   byte += UNITS_PER_WORD, index++)
+	{
+	  s = loongarch_subword_at_byte (src, byte);
+	  if (!TARGET_64BIT)
+	    emit_insn (gen_lasx_xvinsgr2vr_w (new_dest, s, new_dest,
+					      GEN_INT (1 << index)));
+	  else
+	    emit_insn (gen_lasx_xvinsgr2vr_d (new_dest, s, new_dest,
+					      GEN_INT (1 << index)));
+	}
+    }
+  else if (FP_REG_RTX_P (src))
+    {
+      gcc_assert (!MEM_P (dest));
+
+      rtx new_src = src;
+      if (!TARGET_64BIT)
+	{
+	  if (GET_MODE (src) != V8SImode)
+	    new_src = simplify_gen_subreg (V8SImode, src, GET_MODE (src), 0);
+	}
+      else
+	{
+	  if (GET_MODE (src) != V4DImode)
+	    new_src = simplify_gen_subreg (V4DImode, src, GET_MODE (src), 0);
+	}
+
+      for (byte = 0, index = 0; byte < GET_MODE_SIZE (GET_MODE (src));
+	   byte += UNITS_PER_WORD, index++)
+	{
+	  d = loongarch_subword_at_byte (dest, byte);
+	  if (!TARGET_64BIT)
+	    emit_insn (gen_lsx_vpickve2gr_w (d, new_src, GEN_INT (index)));
+	  else
+	    emit_insn (gen_lsx_vpickve2gr_d (d, new_src, GEN_INT (index)));
+	}
+    }
+  else
+    {
+      low_dest = loongarch_subword_at_byte (dest, 0);
+      low_src = loongarch_subword_at_byte (src, 0);
+      gcc_assert (REG_P (low_dest) && REG_P (low_src));
+      /* Make sure the source register is not written before reading.  */
+      if (REGNO (low_dest) <= REGNO (low_src))
+	{
+	  for (byte = 0; byte < GET_MODE_SIZE (GET_MODE (dest));
+	       byte += UNITS_PER_WORD)
+	    {
+	      d = loongarch_subword_at_byte (dest, byte);
+	      s = loongarch_subword_at_byte (src, byte);
+	      loongarch_emit_move (d, s);
+	    }
+	}
+      else
+	{
+	  for (byte = GET_MODE_SIZE (GET_MODE (dest)) - UNITS_PER_WORD; byte >= 0;
+	       byte -= UNITS_PER_WORD)
+	    {
+	      d = loongarch_subword_at_byte (dest, byte);
+	      s = loongarch_subword_at_byte (src, byte);
+	      loongarch_emit_move (d, s);
+	    }
+	}
+    }
+}
+
 
 /* Split a COPY_S.D with operands DEST, SRC and INDEX.  GEN is a function
    used to generate subregs.  */
@@ -4352,11 +4490,12 @@ loongarch_output_move (rtx dest, rtx src)
   machine_mode mode = GET_MODE (dest);
   bool dbl_p = (GET_MODE_SIZE (mode) == 8);
   bool lsx_p = LSX_SUPPORTED_MODE_P (mode);
+  bool lasx_p = LASX_SUPPORTED_MODE_P (mode);
 
   if (loongarch_split_move_p (dest, src))
     return "#";
 
-  if ((lsx_p)
+  if ((lsx_p || lasx_p)
       && dest_code == REG && FP_REG_P (REGNO (dest))
       && src_code == CONST_VECTOR
       && CONST_INT_P (CONST_VECTOR_ELT (src, 0)))
@@ -4366,6 +4505,8 @@ loongarch_output_move (rtx dest, rtx src)
 	{
 	case 16:
 	  return "vrepli.%v0\t%w0,%E1";
+	case 32:
+	  return "xvrepli.%v0\t%u0,%E1";
 	default: gcc_unreachable ();
 	}
     }
@@ -4380,13 +4521,15 @@ loongarch_output_move (rtx dest, rtx src)
 
 	  if (FP_REG_P (REGNO (dest)))
 	    {
-	      if (lsx_p)
+	      if (lsx_p || lasx_p)
 		{
 		  gcc_assert (src == CONST0_RTX (GET_MODE (src)));
 		  switch (GET_MODE_SIZE (mode))
 		    {
 		    case 16:
 		      return "vrepli.b\t%w0,0";
+		    case 32:
+		      return "xvrepli.b\t%u0,0";
 		    default:
 		      gcc_unreachable ();
 		    }
@@ -4519,12 +4662,14 @@ loongarch_output_move (rtx dest, rtx src)
     {
       if (dest_code == REG && FP_REG_P (REGNO (dest)))
 	{
-	  if (lsx_p)
+	  if (lsx_p || lasx_p)
 	    {
 	      switch (GET_MODE_SIZE (mode))
 		{
 		case 16:
 		  return "vori.b\t%w0,%w1,0";
+		case 32:
+		  return "xvori.b\t%u0,%u1,0";
 		default:
 		  gcc_unreachable ();
 		}
@@ -4542,12 +4687,14 @@ loongarch_output_move (rtx dest, rtx src)
 	  if (insn)
 	    return insn;
 
-	  if (lsx_p)
+	  if (lsx_p || lasx_p)
 	    {
 	      switch (GET_MODE_SIZE (mode))
 		{
 		case 16:
 		  return "vst\t%w1,%0";
+		case 32:
+		  return "xvst\t%u1,%0";
 		default:
 		  gcc_unreachable ();
 		}
@@ -4568,12 +4715,14 @@ loongarch_output_move (rtx dest, rtx src)
 	  if (insn)
 	    return insn;
 
-	  if (lsx_p)
+	  if (lsx_p || lasx_p)
 	    {
 	      switch (GET_MODE_SIZE (mode))
 		{
 		case 16:
 		  return "vld\t%w0,%1";
+		case 32:
+		  return "xvld\t%u0,%1";
 		default:
 		  gcc_unreachable ();
 		}
@@ -5591,18 +5740,27 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part,
    'T'	Print 'f' for (eq:CC ...), 't' for (ne:CC ...),
 	      'z' for (eq:?I ...), 'n' for (ne:?I ...).
    't'	Like 'T', but with the EQ/NE cases reversed
-   'V'	Print exact log2 of CONST_INT OP element 0 of a replicated
-	  CONST_VECTOR in decimal.
+   'u'	Print a LASX register.
+   'w'	Print a LSX register.
    'v'	Print the insn size suffix b, h, w or d for vector modes V16QI, V8HI,
 	  V4SI, V2SI, and w, d for vector modes V4SF, V2DF respectively.
+   'V'	Print exact log2 of CONST_INT OP element 0 of a replicated
+	  CONST_VECTOR in decimal.
    'W'	Print the inverse of the FPU branch condition for comparison OP.
-   'w'	Print a LSX register.
    'X'	Print CONST_INT OP in hexadecimal format.
    'x'	Print the low 16 bits of CONST_INT OP in hexadecimal format.
    'Y'	Print loongarch_fp_conditions[INTVAL (OP)]
    'y'	Print exact log2 of CONST_INT OP in decimal.
    'Z'	Print OP and a comma for 8CC, otherwise print nothing.
-   'z'	Print $r0 if OP is zero, otherwise print OP normally.  */
+   'z'	Print $0 if OP is zero, otherwise print OP normally.  */
 
 static void
 loongarch_print_operand (FILE *file, rtx op, int letter)
@@ -5744,46 +5902,11 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
 	output_operand_lossage ("invalid use of '%%%c'", letter);
       break;
 
-    case 'v':
-      switch (GET_MODE (op))
-	{
-	case E_V16QImode:
-	case E_V32QImode:
-	  fprintf (file, "b");
-	  break;
-	case E_V8HImode:
-	case E_V16HImode:
-	  fprintf (file, "h");
-	  break;
-	case E_V4SImode:
-	case E_V4SFmode:
-	case E_V8SImode:
-	case E_V8SFmode:
-	  fprintf (file, "w");
-	  break;
-	case E_V2DImode:
-	case E_V2DFmode:
-	case E_V4DImode:
-	case E_V4DFmode:
-	  fprintf (file, "d");
-	  break;
-	default:
-	  output_operand_lossage ("invalid use of '%%%c'", letter);
-	}
-      break;
-
     case 'W':
       loongarch_print_float_branch_condition (file, reverse_condition (code),
 					      letter);
       break;
 
-    case 'w':
-      if (code == REG && LSX_REG_P (REGNO (op)))
-	fprintf (file, "$vr%s", &reg_names[REGNO (op)][2]);
-      else
-	output_operand_lossage ("invalid use of '%%%c'", letter);
-      break;
-
     case 'x':
       if (CONST_INT_P (op))
 	fprintf (file, HOST_WIDE_INT_PRINT_HEX, INTVAL (op) & 0xffff);
@@ -5825,6 +5948,48 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
       fputc (',', file);
       break;
 
+    case 'w':
+      if (code == REG && LSX_REG_P (REGNO (op)))
+	fprintf (file, "$vr%s", &reg_names[REGNO (op)][2]);
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'u':
+      if (code == REG && LASX_REG_P (REGNO (op)))
+	fprintf (file, "$xr%s", &reg_names[REGNO (op)][2]);
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'v':
+      switch (GET_MODE (op))
+	{
+	case E_V16QImode:
+	case E_V32QImode:
+	  fprintf (file, "b");
+	  break;
+	case E_V8HImode:
+	case E_V16HImode:
+	  fprintf (file, "h");
+	  break;
+	case E_V4SImode:
+	case E_V4SFmode:
+	case E_V8SImode:
+	case E_V8SFmode:
+	  fprintf (file, "w");
+	  break;
+	case E_V2DImode:
+	case E_V2DFmode:
+	case E_V4DImode:
+	case E_V4DFmode:
+	  fprintf (file, "d");
+	  break;
+	default:
+	  output_operand_lossage ("invalid use of '%%%c'", letter);
+	}
+      break;
+
     default:
       switch (code)
 	{
@@ -6155,13 +6320,18 @@ loongarch_hard_regno_mode_ok_uncached (unsigned int regno, machine_mode mode)
   size = GET_MODE_SIZE (mode);
   mclass = GET_MODE_CLASS (mode);
 
-  if (GP_REG_P (regno) && !LSX_SUPPORTED_MODE_P (mode))
+  if (GP_REG_P (regno) && !LSX_SUPPORTED_MODE_P (mode)
+      && !LASX_SUPPORTED_MODE_P (mode))
     return ((regno - GP_REG_FIRST) & 1) == 0 || size <= UNITS_PER_WORD;
 
   /* For LSX, allow TImode and 128-bit vector modes in all FPR.  */
   if (FP_REG_P (regno) && LSX_SUPPORTED_MODE_P (mode))
     return true;
 
+  /* For LASX, allow 256-bit vector modes in all FPR.  */
+  if (FP_REG_P (regno) && LASX_SUPPORTED_MODE_P (mode))
+    return true;
+
   if (FP_REG_P (regno))
     {
       if (mclass == MODE_FLOAT
@@ -6214,6 +6384,9 @@ loongarch_hard_regno_nregs (unsigned int regno, machine_mode mode)
       if (LSX_SUPPORTED_MODE_P (mode))
 	return 1;
 
+      if (LASX_SUPPORTED_MODE_P (mode))
+	return 1;
+
       return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
     }
 
@@ -6243,7 +6416,10 @@ loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode)
     {
       if (loongarch_hard_regno_mode_ok (FP_REG_FIRST, mode))
 	{
-	  if (LSX_SUPPORTED_MODE_P (mode))
+	  /* A 256-bit LASX mode fits in a single LASX register.  */
+	  if (LASX_SUPPORTED_MODE_P (mode))
+	    size = MIN (size, UNITS_PER_LASX_REG);
+	  else if (LSX_SUPPORTED_MODE_P (mode))
 	    size = MIN (size, UNITS_PER_LSX_REG);
 	  else
 	    size = MIN (size, UNITS_PER_FPREG);
@@ -6261,6 +6437,10 @@ static bool
 loongarch_can_change_mode_class (machine_mode from, machine_mode to,
 				 reg_class_t rclass)
 {
+  /* Allow conversions between different LASX vector modes.  */
+  if (LASX_SUPPORTED_MODE_P (from) && LASX_SUPPORTED_MODE_P (to))
+    return true;
+
   /* Allow conversions between different LSX vector modes.  */
   if (LSX_SUPPORTED_MODE_P (from) && LSX_SUPPORTED_MODE_P (to))
     return true;
@@ -6284,7 +6464,8 @@ loongarch_mode_ok_for_mov_fmt_p (machine_mode mode)
       return TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT;
 
     default:
-      return LSX_SUPPORTED_MODE_P (mode);
+      return ISA_HAS_LASX ? LASX_SUPPORTED_MODE_P (mode)
+	: LSX_SUPPORTED_MODE_P (mode);
     }
 }
 
@@ -6486,7 +6667,8 @@ loongarch_valid_pointer_mode (scalar_int_mode mode)
 static bool
 loongarch_vector_mode_supported_p (machine_mode mode)
 {
-  return LSX_SUPPORTED_MODE_P (mode);
+  return ISA_HAS_LASX ? LASX_SUPPORTED_MODE_P (mode)
+    : LSX_SUPPORTED_MODE_P (mode);
 }
 
 /* Implement TARGET_SCALAR_MODE_SUPPORTED_P.  */
@@ -6512,19 +6694,19 @@ loongarch_preferred_simd_mode (scalar_mode mode)
   switch (mode)
     {
     case E_QImode:
-      return E_V16QImode;
+      return ISA_HAS_LASX ? E_V32QImode : E_V16QImode;
     case E_HImode:
-      return E_V8HImode;
+      return ISA_HAS_LASX ? E_V16HImode : E_V8HImode;
     case E_SImode:
-      return E_V4SImode;
+      return ISA_HAS_LASX ? E_V8SImode : E_V4SImode;
     case E_DImode:
-      return E_V2DImode;
+      return ISA_HAS_LASX ? E_V4DImode : E_V2DImode;
 
     case E_SFmode:
-      return E_V4SFmode;
+      return ISA_HAS_LASX ? E_V8SFmode : E_V4SFmode;
 
     case E_DFmode:
-      return E_V2DFmode;
+      return ISA_HAS_LASX ? E_V4DFmode : E_V2DFmode;
 
     default:
       break;
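As a consequence of the hunk above (example code illustrative, not part of the patch), with -mlasx the preferred vector mode for SImode elements becomes V8SI, so a simple element-wise loop like this one is expected to be vectorized with 256-bit xv* instructions rather than 128-bit v* ones:

  void
  vadd (int *restrict c, const int *restrict a,
        const int *restrict b, int n)
  {
    for (int i = 0; i < n; i++)
      c[i] = a[i] + b[i];
  }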
@@ -6535,7 +6717,12 @@ loongarch_preferred_simd_mode (scalar_mode mode)
 static unsigned int
 loongarch_autovectorize_vector_modes (vector_modes *modes, bool)
 {
-  if (ISA_HAS_LSX)
+  if (ISA_HAS_LASX)
+    {
+      modes->safe_push (V32QImode);
+      modes->safe_push (V16QImode);
+    }
+  else if (ISA_HAS_LSX)
     {
       modes->safe_push (V16QImode);
     }
@@ -6715,11 +6902,18 @@ const char *
 loongarch_lsx_output_division (const char *division, rtx *operands)
 {
   const char *s;
+  machine_mode mode = GET_MODE (*operands);
 
   s = division;
   if (TARGET_CHECK_ZERO_DIV)
     {
-      if (ISA_HAS_LSX)
+      if (ISA_HAS_LASX && GET_MODE_SIZE (mode) == 32)
+	{
+	  output_asm_insn ("xvsetallnez.%v0\t$fcc7,%u2",operands);
+	  output_asm_insn (s, operands);
+	  output_asm_insn ("bcnez\t$fcc7,1f", operands);
+	}
+      else if (ISA_HAS_LSX)
 	{
 	  output_asm_insn ("vsetallnez.%v0\t$fcc7,%w2",operands);
 	  output_asm_insn (s, operands);
@@ -7558,7 +7752,7 @@ loongarch_expand_lsx_shuffle (struct expand_vec_perm_d *d)
   rtx_insn *insn;
   unsigned i;
 
-  if (!ISA_HAS_LSX)
+  if (!ISA_HAS_LSX && !ISA_HAS_LASX)
     return false;
 
   for (i = 0; i < d->nelt; i++)
@@ -7582,40 +7776,484 @@ loongarch_expand_lsx_shuffle (struct expand_vec_perm_d *d)
   return true;
 }
 
-void
-loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
+/* Try to simplify a two-vector permutation into two intra-lane interleave
+   insns and a cross-lane shuffle, for 32-byte vectors.  */
+
+static bool
+loongarch_expand_vec_perm_interleave (struct expand_vec_perm_d *d)
 {
-  machine_mode vmode = GET_MODE (target);
+  unsigned i, nelt;
+  rtx t1, t2, t3;
+  rtx (*gen_high) (rtx, rtx, rtx);
+  rtx (*gen_low) (rtx, rtx, rtx);
+  machine_mode mode = GET_MODE (d->target);
 
-  switch (vmode)
+  if (d->one_vector_p)
+    return false;
+  if (!ISA_HAS_LASX || GET_MODE_SIZE (d->vmode) != 32)
+    return false;
+
+  nelt = d->nelt;
+  if (d->perm[0] != 0 && d->perm[0] != nelt / 2)
+    return false;
+  for (i = 0; i < nelt; i += 2)
+    if (d->perm[i] != d->perm[0] + i / 2
+	|| d->perm[i + 1] != d->perm[0] + i / 2 + nelt)
+      return false;
+
+  if (d->testing_p)
+    return true;
+
+  switch (d->vmode)
     {
-    case E_V16QImode:
-      emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel));
+    case E_V32QImode:
+      gen_high = gen_lasx_xvilvh_b;
+      gen_low = gen_lasx_xvilvl_b;
       break;
-    case E_V2DFmode:
-      emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0));
+    case E_V16HImode:
+      gen_high = gen_lasx_xvilvh_h;
+      gen_low = gen_lasx_xvilvl_h;
       break;
-    case E_V2DImode:
-      emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0));
+    case E_V8SImode:
+      gen_high = gen_lasx_xvilvh_w;
+      gen_low = gen_lasx_xvilvl_w;
       break;
-    case E_V4SFmode:
-      emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0));
+    case E_V4DImode:
+      gen_high = gen_lasx_xvilvh_d;
+      gen_low = gen_lasx_xvilvl_d;
       break;
-    case E_V4SImode:
-      emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0));
+    case E_V8SFmode:
+      gen_high = gen_lasx_xvilvh_w_f;
+      gen_low = gen_lasx_xvilvl_w_f;
       break;
-    case E_V8HImode:
-      emit_insn (gen_lsx_vshuf_h (target, sel, op1, op0));
+    case E_V4DFmode:
+      gen_high = gen_lasx_xvilvh_d_f;
+      gen_low = gen_lasx_xvilvl_d_f;
       break;
     default:
-      break;
+      gcc_unreachable ();
+    }
+
+  t1 = gen_reg_rtx (mode);
+  t2 = gen_reg_rtx (mode);
+  emit_insn (gen_high (t1, d->op0, d->op1));
+  emit_insn (gen_low (t2, d->op0, d->op1));
+  if (mode == V4DFmode || mode == V8SFmode)
+    {
+      t3 = gen_reg_rtx (V4DFmode);
+      if (d->perm[0])
+	emit_insn (gen_lasx_xvpermi_q_v4df (t3, gen_lowpart (V4DFmode, t1),
+					    gen_lowpart (V4DFmode, t2),
+					    GEN_INT (0x31)));
+      else
+	emit_insn (gen_lasx_xvpermi_q_v4df (t3, gen_lowpart (V4DFmode, t1),
+					    gen_lowpart (V4DFmode, t2),
+					    GEN_INT (0x20)));
     }
+  else
+    {
+      t3 = gen_reg_rtx (V4DImode);
+      if (d->perm[0])
+	emit_insn (gen_lasx_xvpermi_q_v4di (t3, gen_lowpart (V4DImode, t1),
+					    gen_lowpart (V4DImode, t2),
+					    GEN_INT (0x31)));
+      else
+	emit_insn (gen_lasx_xvpermi_q_v4di (t3, gen_lowpart (V4DImode, t1),
+					    gen_lowpart (V4DImode, t2),
+					    GEN_INT (0x20)));
+    }
+  emit_move_insn (d->target, gen_lowpart (mode, t3));
+  return true;
 }
 
+/* Implement extract-even and extract-odd permutations.  */
+
 static bool
-loongarch_try_expand_lsx_vshuf_const (struct expand_vec_perm_d *d)
+loongarch_expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
 {
-  int i;
+  rtx t1;
+  machine_mode mode = GET_MODE (d->target);
+
+  if (d->testing_p)
+    return true;
+
+  t1 = gen_reg_rtx (mode);
+
+  switch (d->vmode)
+    {
+    case E_V4DFmode:
+      /* Shuffle the lanes around into { 0 4 2 6 } and { 1 5 3 7 }.  */
+      if (odd)
+	emit_insn (gen_lasx_xvilvh_d_f (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvilvl_d_f (t1, d->op0, d->op1));
+
+      /* Shuffle within the 256-bit lanes to produce the result required.
+	 { 0 2 4 6 } | { 1 3 5 7 }.  */
+      emit_insn (gen_lasx_xvpermi_d_v4df (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V4DImode:
+      if (odd)
+	emit_insn (gen_lasx_xvilvh_d (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvilvl_d (t1, d->op0, d->op1));
+
+      emit_insn (gen_lasx_xvpermi_d_v4di (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V8SFmode:
+      /* Shuffle the lanes around into:
+	 { 0 2 8 a 4 6 c e } | { 1 3 9 b 5 7 d f }.  */
+      if (odd)
+	emit_insn (gen_lasx_xvpickod_w_f (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvpickev_w_f (t1, d->op0, d->op1));
+
+      /* Shuffle within the 256-bit lanes to produce the result required.
+	 { 0 2 4 6 8 a c e } | { 1 3 5 7 9 b d f }.  */
+      emit_insn (gen_lasx_xvpermi_d_v8sf (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V8SImode:
+      if (odd)
+	emit_insn (gen_lasx_xvpickod_w (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvpickev_w (t1, d->op0, d->op1));
+
+      emit_insn (gen_lasx_xvpermi_d_v8si (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V16HImode:
+      if (odd)
+	emit_insn (gen_lasx_xvpickod_h (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvpickev_h (t1, d->op0, d->op1));
+
+      emit_insn (gen_lasx_xvpermi_d_v16hi (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V32QImode:
+      if (odd)
+	emit_insn (gen_lasx_xvpickod_b (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvpickev_b (t1, d->op0, d->op1));
+
+      emit_insn (gen_lasx_xvpermi_d_v32qi (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return true;
+}
+
+/* Pattern match extract-even and extract-odd permutations.  */
+
+static bool
+loongarch_expand_vec_perm_even_odd (struct expand_vec_perm_d *d)
+{
+  unsigned i, odd, nelt = d->nelt;
+  if (!ISA_HAS_LASX)
+    return false;
+
+  odd = d->perm[0];
+  if (odd != 0 && odd != 1)
+    return false;
+
+  for (i = 1; i < nelt; ++i)
+    if (d->perm[i] != 2 * i + odd)
+      return false;
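+
+  /* For example, for V8SImode (nelt == 8) this matches
+     { 0, 2, 4, 6, 8, 10, 12, 14 } when odd == 0 and
+     { 1, 3, 5, 7, 9, 11, 13, 15 } when odd == 1.  */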
+
+  return loongarch_expand_vec_perm_even_odd_1 (d, odd);
+}
+
+/* Expand a variable vector permutation for LASX.  */
+
+void
+loongarch_expand_vec_perm_1 (rtx operands[])
+{
+  rtx target = operands[0];
+  rtx op0 = operands[1];
+  rtx op1 = operands[2];
+  rtx mask = operands[3];
+
+  bool one_operand_shuffle = rtx_equal_p (op0, op1);
+  rtx t1 = NULL;
+  rtx t2 = NULL;
+  rtx t3, t4, t5, t6, vt = NULL;
+  rtx vec[32] = {NULL};
+  machine_mode mode = GET_MODE (op0);
+  machine_mode maskmode = GET_MODE (mask);
+  int w, i;
+
+  /* Number of elements in the vector.  */
+  w = GET_MODE_NUNITS (mode);
+
+  rtx round_data[MAX_VECT_LEN];
+  rtx round_reg, round_data_rtx;
+
+  if (mode != E_V32QImode)
+    {
+      for (int i = 0; i < w; i += 1)
+	{
+	  round_data[i] = GEN_INT (0x1f);
+	}
+
+      if (mode == E_V4DFmode)
+	{
+	  round_data_rtx = gen_rtx_CONST_VECTOR (E_V4DImode,
+						 gen_rtvec_v (w, round_data));
+	  round_reg = gen_reg_rtx (E_V4DImode);
+	}
+      else if (mode == E_V8SFmode)
+	{
+	  round_data_rtx = gen_rtx_CONST_VECTOR (E_V8SImode,
+						 gen_rtvec_v (w, round_data));
+	  round_reg = gen_reg_rtx (E_V8SImode);
+	}
+      else
+	{
+	  round_data_rtx = gen_rtx_CONST_VECTOR (mode,
+						 gen_rtvec_v (w, round_data));
+	  round_reg = gen_reg_rtx (mode);
+	}
+
+      emit_move_insn (round_reg, round_data_rtx);
+      switch (mode)
+	{
+	case E_V32QImode:
+	  emit_insn (gen_andv32qi3 (mask, mask, round_reg));
+	  break;
+	case E_V16HImode:
+	  emit_insn (gen_andv16hi3 (mask, mask, round_reg));
+	  break;
+	case E_V8SImode:
+	case E_V8SFmode:
+	  emit_insn (gen_andv8si3 (mask, mask, round_reg));
+	  break;
+	case E_V4DImode:
+	case E_V4DFmode:
+	  emit_insn (gen_andv4di3 (mask, mask, round_reg));
+	  break;
+	default:
+	  gcc_unreachable ();
+	  break;
+	}
+    }
+
+  if (mode == V4DImode || mode == V4DFmode)
+    {
+      maskmode = mode = V8SImode;
+      w = 8;
+      t1 = gen_reg_rtx (maskmode);
+
+      /* Replicate the low bits of the V4DImode mask into V8SImode:
+	 mask = { A B C D }
+	 t1 = { A A B B C C D D }.  */
+      for (i = 0; i < w / 2; ++i)
+	vec[i*2 + 1] = vec[i*2] = GEN_INT (i * 2);
+      vt = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (w, vec));
+      vt = force_reg (maskmode, vt);
+      mask = gen_lowpart (maskmode, mask);
+      emit_insn (gen_lasx_xvperm_w (t1, mask, vt));
+
+      /* Multiply the shuffle indices by two.  */
+      t1 = expand_simple_binop (maskmode, PLUS, t1, t1, t1, 1,
+				OPTAB_DIRECT);
+
+      /* Add one to the odd shuffle indices:
+	 t1 = { A*2, A*2+1, B*2, B*2+1, ... }.  */
+      for (i = 0; i < w / 2; ++i)
+	{
+	  vec[i * 2] = const0_rtx;
+	  vec[i * 2 + 1] = const1_rtx;
+	}
+      vt = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (w, vec));
+      vt = validize_mem (force_const_mem (maskmode, vt));
+      t1 = expand_simple_binop (maskmode, PLUS, t1, vt, t1, 1,
+				OPTAB_DIRECT);
+
+      /* Continue as if V8SImode (resp.  V32QImode) was used initially.  */
+      operands[3] = mask = t1;
+      target = gen_reg_rtx (mode);
+      op0 = gen_lowpart (mode, op0);
+      op1 = gen_lowpart (mode, op1);
+    }
+
+  switch (mode)
+    {
+    case E_V8SImode:
+      if (one_operand_shuffle)
+	{
+	  emit_insn (gen_lasx_xvperm_w (target, op0, mask));
+	  if (target != operands[0])
+	    emit_move_insn (operands[0],
+			    gen_lowpart (GET_MODE (operands[0]), target));
+	}
+      else
+	{
+	  t1 = gen_reg_rtx (V8SImode);
+	  t2 = gen_reg_rtx (V8SImode);
+	  emit_insn (gen_lasx_xvperm_w (t1, op0, mask));
+	  emit_insn (gen_lasx_xvperm_w (t2, op1, mask));
+	  goto merge_two;
+	}
+      return;
+
+    case E_V8SFmode:
+      mask = gen_lowpart (V8SImode, mask);
+      if (one_operand_shuffle)
+	emit_insn (gen_lasx_xvperm_w_f (target, op0, mask));
+      else
+	{
+	  t1 = gen_reg_rtx (V8SFmode);
+	  t2 = gen_reg_rtx (V8SFmode);
+	  emit_insn (gen_lasx_xvperm_w_f (t1, op0, mask));
+	  emit_insn (gen_lasx_xvperm_w_f (t2, op1, mask));
+	  goto merge_two;
+	}
+      return;
+
+    case E_V16HImode:
+      if (one_operand_shuffle)
+	{
+	  t1 = gen_reg_rtx (V16HImode);
+	  t2 = gen_reg_rtx (V16HImode);
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t1, op0, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t2, op0, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_h (target, mask, t2, t1));
+	}
+      else
+	{
+	  t1 = gen_reg_rtx (V16HImode);
+	  t2 = gen_reg_rtx (V16HImode);
+	  t3 = gen_reg_rtx (V16HImode);
+	  t4 = gen_reg_rtx (V16HImode);
+	  t5 = gen_reg_rtx (V16HImode);
+	  t6 = gen_reg_rtx (V16HImode);
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t3, op0, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t4, op0, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_h (t1, mask, t4, t3));
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t5, op1, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t6, op1, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_h (t2, mask, t6, t5));
+	  goto merge_two;
+	}
+      return;
+
+    case E_V32QImode:
+      if (one_operand_shuffle)
+	{
+	  t1 = gen_reg_rtx (V32QImode);
+	  t2 = gen_reg_rtx (V32QImode);
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t1, op0, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t2, op0, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_b (target, t2, t1, mask));
+	}
+      else
+	{
+	  t1 = gen_reg_rtx (V32QImode);
+	  t2 = gen_reg_rtx (V32QImode);
+	  t3 = gen_reg_rtx (V32QImode);
+	  t4 = gen_reg_rtx (V32QImode);
+	  t5 = gen_reg_rtx (V32QImode);
+	  t6 = gen_reg_rtx (V32QImode);
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t3, op0, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t4, op0, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_b (t1, t4, t3, mask));
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t5, op1, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t6, op1, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_b (t2, t6, t5, mask));
+	  goto merge_two;
+	}
+      return;
+
+    default:
+      gcc_assert (GET_MODE_SIZE (mode) == 32);
+      break;
+    }
+
+merge_two:
+  /* Then merge them together.  The key is whether any given control
+     element contained a bit set that indicates the second word.  */
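+  /* For example, with V8SImode (w == 8) a control value of 11 has the 0x8
+     bit set, so that lane is taken from the op1 permutation in t2, while a
+     control value of 3 selects the corresponding lane of t1.  */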
+  rtx xops[6];
+  mask = operands[3];
+  vt = GEN_INT (w);
+  vt = gen_const_vec_duplicate (maskmode, vt);
+  vt = force_reg (maskmode, vt);
+  mask = expand_simple_binop (maskmode, AND, mask, vt,
+			      NULL_RTX, 0, OPTAB_DIRECT);
+  if (GET_MODE (target) != mode)
+    target = gen_reg_rtx (mode);
+  xops[0] = target;
+  xops[1] = gen_lowpart (mode, t2);
+  xops[2] = gen_lowpart (mode, t1);
+  xops[3] = gen_rtx_EQ (maskmode, mask, vt);
+  xops[4] = mask;
+  xops[5] = vt;
+
+  loongarch_expand_vec_cond_expr (mode, maskmode, xops);
+  if (target != operands[0])
+    emit_move_insn (operands[0],
+		    gen_lowpart (GET_MODE (operands[0]), target));
+}
+
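+/* Expand a variable LSX vector permutation of OP0/OP1 with SEL on TARGET.
+   Each element of SEL is first masked with 0x1f, to avoid vshuf's undefined
+   behavior on out-of-range indices.  */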
+void
+loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
+{
+  machine_mode vmode = GET_MODE (target);
+  auto nelt = GET_MODE_NUNITS (vmode);
+  auto round_reg = gen_reg_rtx (vmode);
+  rtx round_data[MAX_VECT_LEN];
+
+  for (int i = 0; i < nelt; i += 1)
+    {
+      round_data[i] = GEN_INT (0x1f);
+    }
+
+  rtx round_data_rtx = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, round_data));
+  emit_move_insn (round_reg, round_data_rtx);
+
+  switch (vmode)
+    {
+    case E_V16QImode:
+      emit_insn (gen_andv16qi3 (sel, sel, round_reg));
+      emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel));
+      break;
+    case E_V2DFmode:
+      emit_insn (gen_andv2di3 (sel, sel, round_reg));
+      emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0));
+      break;
+    case E_V2DImode:
+      emit_insn (gen_andv2di3 (sel, sel, round_reg));
+      emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0));
+      break;
+    case E_V4SFmode:
+      emit_insn (gen_andv4si3 (sel, sel, round_reg));
+      emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0));
+      break;
+    case E_V4SImode:
+      emit_insn (gen_andv4si3 (sel, sel, round_reg));
+      emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0));
+      break;
+    case E_V8HImode:
+      emit_insn (gen_andv8hi3 (sel, sel, round_reg));
+      emit_insn (gen_lsx_vshuf_h (target, sel, op1, op0));
+      break;
+    default:
+      break;
+    }
+}
+
+static bool
+loongarch_try_expand_lsx_vshuf_const (struct expand_vec_perm_d *d)
+{
+  int i;
   rtx target, op0, op1, sel, tmp;
   rtx rperm[MAX_VECT_LEN];
 
@@ -7716,25 +8354,1302 @@ loongarch_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
 	return true;
     }
 
-  if (loongarch_expand_lsx_shuffle (d))
-    return true;
-  return false;
-}
-
-/* Implementation of constant vector permuatation.  This function identifies
- * recognized pattern of permuation selector argument, and use one or more
- * instruction(s) to finish the permutation job correctly.  For unsupported
- * patterns, it will return false.  */
-
-static bool
-loongarch_expand_vec_perm_const_2 (struct expand_vec_perm_d *d)
-{
-  /* Although we have the LSX vec_perm<mode> template, there's still some
-     128bit vector permuatation operations send to vectorize_vec_perm_const.
-     In this case, we just simpliy wrap them by single vshuf.* instruction,
-     because LSX vshuf.* instruction just have the same behavior that GCC
-     expects.  */
-  return loongarch_try_expand_lsx_vshuf_const (d);
+  if (loongarch_expand_lsx_shuffle (d))
+    return true;
+  if (loongarch_expand_vec_perm_even_odd (d))
+    return true;
+  if (loongarch_expand_vec_perm_interleave (d))
+    return true;
+  return false;
+}
+
+/* Following are the assist function for const vector permutation support.  */
+static bool
+loongarch_is_quad_duplicate (struct expand_vec_perm_d *d)
+{
+  if (d->perm[0] >= d->nelt / 2)
+    return false;
+
+  bool result = true;
+  unsigned char lhs = d->perm[0];
+  unsigned char rhs = d->perm[d->nelt / 2];
+
+  if ((rhs - lhs) != d->nelt / 2)
+    return false;
+
+  for (int i = 1; i < d->nelt; i += 1)
+    {
+      if ((i < d->nelt / 2) && (d->perm[i] != lhs))
+	{
+	  result = false;
+	  break;
+	}
+      if ((i > d->nelt / 2) && (d->perm[i] != rhs))
+	{
+	  result = false;
+	  break;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_double_duplicate (struct expand_vec_perm_d *d)
+{
+  if (!d->one_vector_p)
+    return false;
+
+  if (d->nelt < 8)
+    return false;
+
+  bool result = true;
+  unsigned char buf = d->perm[0];
+
+  for (int i = 1; i < d->nelt; i += 2)
+    {
+      if (d->perm[i] != buf)
+	{
+	  result = false;
+	  break;
+	}
+      if (d->perm[i - 1] != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += d->nelt / 4;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_odd_extraction (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = 1;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 2;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_even_extraction (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = 0;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 2;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_extraction_permutation (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = d->perm[0];
+
+  if (buf != 0 && buf != d->nelt)
+    return false;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_center_extraction (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned buf = d->nelt / 2;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_reversing_permutation (struct expand_vec_perm_d *d)
+{
+  if (!d->one_vector_p)
+    return false;
+
+  bool result = true;
+  unsigned char buf = d->nelt - 1;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (d->perm[i] != buf)
+	{
+	  result = false;
+	  break;
+	}
+
+      buf -= 1;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_di_misalign_extract (struct expand_vec_perm_d *d)
+{
+  if (d->nelt != 4 && d->nelt != 8)
+    return false;
+
+  bool result = true;
+  unsigned char buf;
+
+  if (d->nelt == 4)
+    {
+      buf = 1;
+      for (int i = 0; i < d->nelt; i += 1)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+
+	  buf += 1;
+	}
+    }
+  else if (d->nelt == 8)
+    {
+      buf = 2;
+      for (int i = 0; i < d->nelt; i += 1)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+
+	  buf += 1;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_si_misalign_extract (struct expand_vec_perm_d *d)
+{
+  if (d->vmode != E_V8SImode && d->vmode != E_V8SFmode)
+    return false;
+  bool result = true;
+  unsigned char buf = 1;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_lasx_lowpart_interleave (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = 0;
+
+  for (int i = 0; i < d->nelt; i += 2)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  if (result)
+    {
+      buf = d->nelt;
+      for (int i = 1; i < d->nelt; i += 2)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+	  buf += 1;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_lasx_lowpart_interleave_2 (struct expand_vec_perm_d *d)
+{
+  if (d->vmode != E_V32QImode)
+    return false;
+  bool result = true;
+  unsigned char buf = 0;
+
+#define COMPARE_SELECTOR(INIT, BEGIN, END) \
+  buf = INIT; \
+  for (int i = BEGIN; i < END && result; i += 1) \
+    { \
+      if (buf != d->perm[i]) \
+	{ \
+	  result = false; \
+	  break; \
+	} \
+      buf += 1; \
+    }
+
+  COMPARE_SELECTOR (0, 0, 8);
+  COMPARE_SELECTOR (32, 8, 16);
+  COMPARE_SELECTOR (8, 16, 24);
+  COMPARE_SELECTOR (40, 24, 32);
+
+#undef COMPARE_SELECTOR
+  return result;
+}
+
+static bool
+loongarch_is_lasx_lowpart_extract (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = 0;
+
+  for (int i = 0; i < d->nelt / 2; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  if (result)
+    {
+      buf = d->nelt;
+      for (int i = d->nelt / 2; i < d->nelt; i += 1)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+	  buf += 1;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_lasx_highpart_interleave (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = d->nelt / 2;
+
+  for (int i = 0; i < d->nelt; i += 2)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  if (result)
+    {
+      buf = d->nelt + d->nelt / 2;
+      for (int i = 1; i < d->nelt; i += 2)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+	  buf += 1;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_lasx_highpart_interleave_2 (struct expand_vec_perm_d *d)
+{
+  if (d->vmode != E_V32QImode)
+    return false;
+
+  bool result = true;
+  unsigned char buf = 0;
+
+#define COMPARE_SELECTOR(INIT, BEGIN, END) \
+  buf = INIT; \
+  for (int i = BEGIN; i < END && result; i += 1) \
+    { \
+      if (buf != d->perm[i]) \
+	{ \
+	  result = false; \
+	  break; \
+	} \
+      buf += 1; \
+    }
+
+  COMPARE_SELECTOR (16, 0, 8);
+  COMPARE_SELECTOR (48, 8, 16);
+  COMPARE_SELECTOR (24, 16, 24);
+  COMPARE_SELECTOR (56, 24, 32);
+
+#undef COMPARE_SELECTOR
+  return result;
+}
+
+static bool
+loongarch_is_elem_duplicate (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = d->perm[0];
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+    }
+
+  return result;
+}
+
+inline bool
+loongarch_is_op_reverse_perm (struct expand_vec_perm_d *d)
+{
+  return (d->vmode == E_V4DFmode)
+    && d->perm[0] == 2 && d->perm[1] == 3
+    && d->perm[2] == 0 && d->perm[3] == 1;
+}
+
+static bool
+loongarch_is_single_op_perm (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (d->perm[i] >= d->nelt)
+	{
+	  result = false;
+	  break;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_divisible_perm (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+
+  for (int i = 0; i < d->nelt / 2; i += 1)
+    {
+      if (d->perm[i] >= d->nelt)
+	{
+	  result = false;
+	  break;
+	}
+    }
+
+  if (result)
+    {
+      for (int i = d->nelt / 2; i < d->nelt; i += 1)
+	{
+	  if (d->perm[i] < d->nelt)
+	    {
+	      result = false;
+	      break;
+	    }
+	}
+    }
+
+  return result;
+}
+
+inline bool
+loongarch_is_triple_stride_extract (struct expand_vec_perm_d *d)
+{
+  return (d->vmode == E_V4DImode || d->vmode == E_V4DFmode)
+    && d->perm[0] == 1 && d->perm[1] == 4
+    && d->perm[2] == 7 && d->perm[3] == 0;
+}
+
+/* In LASX, some permutation insns do not have the behavior that GCC expects
+ * when the compiler wants to emit a vector permutation.
+ *
+ * 1. What GCC provides via vectorize_vec_perm_const ()'s parameters:
+ * When GCC wants to perform a vector permutation, it provides two op
+ * registers, one target register, and a selector.
+ * In the const vector permutation case, GCC provides the selector as a char
+ * array that contains the original values; in variable vector permutation
+ * (performed via the vec_perm<mode> insn template), it provides a vector
+ * register.
+ * We assume that nelt is the number of elements inside a single vector of
+ * the current 256bit vector mode.
+ *
+ * 2. What GCC expects to perform:
+ * The two op registers (op0, op1) "combine" into a 512bit temp vector
+ * storage that has 2*nelt elements inside it; the low 256bit is op0, the
+ * high 256bit is op1, and the elements are indexed as below:
+ *		  0 ~ nelt - 1		nelt ~ 2 * nelt - 1
+ *	  |-------------------------|-------------------------|
+ *		Low 256bit (op0)	High 256bit (op1)
+ * For example, the second element in op1 (V8SImode) will be indexed with 9.
+ * The selector is a vector with the same mode and number of elements as
+ * op0, op1 and target; it looks like this:
+ *	      0 ~ nelt - 1
+ *	  |-------------------------|
+ *	      256bit (selector)
+ * It describes which element from the 512bit temp vector storage goes into
+ * each element slot of the target.
+ * GCC expects that every element in the selector can be ANY index of the
+ * 512bit vector storage (the selector can pick literally any element from
+ * op0 and op1 and put it into any place of the target register).  This is
+ * also what the LSX 128bit vshuf.* instructions do, so we can easily handle
+ * a 128bit vector permutation with a single instruction.
+ *
+ * 3. What the LASX permutation instruction does:
+ * In short, it just executes two independent 128bit vector permutations, and
+ * that is the reason we need to do the jobs below.  We will explain it.
+ * op0, op1, target, and selector are each separated into a high 128bit and a
+ * low 128bit part, and the permutation works as described below:
+ *
+ *  a) op0's low 128bit and op1's low 128bit "combine" into a 256bit temp
+ * vector storage (TVS1); elements are indexed as below:
+ *	    0 ~ nelt / 2 - 1	  nelt / 2 ~ nelt - 1
+ *	|---------------------|---------------------| TVS1
+ *	    op0's low 128bit      op1's low 128bit
+ *    op0's high 128bit and op1's high 128bit are "combined" into TVS2 in the
+ *    same way.
+ *	    0 ~ nelt / 2 - 1	  nelt / 2 ~ nelt - 1
+ *	|---------------------|---------------------| TVS2
+ *	    op0's high 128bit	op1's high 128bit
+ *  b) The selector's low 128bit describes which elements from TVS1 go into
+ *  the target vector's low 128bit.  No TVS2 elements are allowed.
+ *  c) The selector's high 128bit describes which elements from TVS2 go into
+ *  the target vector's high 128bit.  No TVS1 elements are allowed.
+ *
+ * As we can see, if we want to handle vector permutation correctly, we can
+ * achieve it in one of three ways:
+ *  a) Modify the selector's elements, to make sure that every element refers
+ *  to the correct value that will be put into the target vector.
+ *  b) Generate extra instructions before/after the permutation instruction
+ *  to adjust the op vectors or the target vector, so that the target
+ *  vector's value is what GCC expects.
+ *  c) Use other instructions to process the ops and put the correct result
+ *  into the target.
+ */
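+
+/* A worked example for E_V8SImode (nelt == 8): GCC numbers the elements of
+ * the virtual 512bit input as op0 = { 0 .. 7 } and op1 = { 8 .. 15 }, while
+ * the hardware sees TVS1 = { op0's 0 .. 3, op1's 8 .. 11 } and
+ * TVS2 = { op0's 4 .. 7, op1's 12 .. 15 }.  The low half of the selector can
+ * only pick from TVS1 and the high half only from TVS2, so a selector such
+ * as { 0, 9, 2, 11, 4, 13, 6, 15 } can be handled by a single xvshuf.w after
+ * index remapping, while a selector whose low half asks for element 4
+ * (op0's high 128bit) cannot, and needs one of the strategies above.  */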
+
+/* Implementation of constant vector permutation.  This function identifies
+ * recognized patterns of the permutation selector argument, and uses one or
+ * more instruction(s) to finish the permutation job correctly.  For
+ * unsupported patterns, it will return false.  */
+
+static bool
+loongarch_expand_vec_perm_const_2 (struct expand_vec_perm_d *d)
+{
+  /* Although we have the LSX vec_perm<mode> template, there are still some
+     128bit vector permutation operations sent to vectorize_vec_perm_const.
+     In this case, we just simply wrap them with a single vshuf.* instruction,
+     because the LSX vshuf.* instructions have exactly the behavior that GCC
+     expects.  */
+  if (GET_MODE_SIZE (d->vmode) == 16)
+    return loongarch_try_expand_lsx_vshuf_const (d);
+  else
+    return false;
+
+  bool ok = false, reverse_hi_lo = false, extract_ev_od = false,
+       use_alt_op = false;
+  unsigned char idx;
+  int i;
+  rtx target, op0, op1, sel, tmp;
+  rtx op0_alt = NULL_RTX, op1_alt = NULL_RTX;
+  rtx rperm[MAX_VECT_LEN];
+  unsigned int remapped[MAX_VECT_LEN];
+
+  /* Try to figure out whether this is a recognized permutation selector
+     pattern; if yes, we will reassign some elements of the selector argument
+     with new values, and in some cases we will generate assist insns to
+     complete the permutation.  (In some cases we even use other insns to
+     implement the permutation instead of xvshuf!)
+     Make sure to check that d->testing_p is false every time you want to
+     emit a new insn, otherwise you will run straight into an ICE.  */
+  if (loongarch_is_quad_duplicate (d))
+    {
+      /* Selector example: E_V8SImode, { 0, 0, 0, 0, 4, 4, 4, 4 }
+	 Copy the first element of the original selector to every element of
+	 the new selector.  */
+      idx = d->perm[0];
+      for (i = 0; i < d->nelt; i += 1)
+	{
+	  remapped[i] = idx;
+	}
+      /* Selector after: { 0, 0, 0, 0, 0, 0, 0, 0 }.  */
+    }
+  else if (loongarch_is_double_duplicate (d))
+    {
+      /* Selector example: E_V8SImode, { 1, 1, 3, 3, 5, 5, 7, 7 }
+	 one_vector_p == true.  */
+      for (i = 0; i < d->nelt / 2; i += 1)
+	{
+	  idx = d->perm[i];
+	  remapped[i] = idx;
+	  remapped[i + d->nelt / 2] = idx;
+	}
+      /* Selector after: { 1, 1, 3, 3, 1, 1, 3, 3 }.  */
+    }
+  else if (loongarch_is_odd_extraction (d)
+	   || loongarch_is_even_extraction (d))
+    {
+      /* Odd extraction selector sample: E_V4DImode, { 1, 3, 5, 7 }
+	 Selector after: { 1, 3, 1, 3 }.
+	 Even extraction selector sample: E_V4DImode, { 0, 2, 4, 6 }
+	 Selector after: { 0, 2, 0, 2 }.  */
+      for (i = 0; i < d->nelt / 2; i += 1)
+	{
+	  idx = d->perm[i];
+	  remapped[i] = idx;
+	  remapped[i + d->nelt / 2] = idx;
+	}
+      /* Additional insn is required for correct result.  See codes below.  */
+      extract_ev_od = true;
+    }
+  else if (loongarch_is_extraction_permutation (d))
+    {
+      /* Selector sample: E_V8SImode, { 0, 1, 2, 3, 4, 5, 6, 7 }.  */
+      if (d->perm[0] == 0)
+	{
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      remapped[i] = i;
+	      remapped[i + d->nelt / 2] = i;
+	    }
+	}
+      else
+	{
+	  /* { 8, 9, 10, 11, 12, 13, 14, 15 }.  */
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      idx = i + d->nelt / 2;
+	      remapped[i] = idx;
+	      remapped[i + d->nelt / 2] = idx;
+	    }
+	}
+      /* Selector after: { 0, 1, 2, 3, 0, 1, 2, 3 }
+	 { 8, 9, 10, 11, 8, 9, 10, 11 }  */
+    }
+  else if (loongarch_is_center_extraction (d))
+    {
+      /* sample: E_V4DImode, { 2, 3, 4, 5 }
+	 In this case, we can just copy the high 128bit of op0 and the low
+	 128bit of op1 to the target register using the xvpermi.q insn.  */
+      if (!d->testing_p)
+	{
+	  emit_move_insn (d->target, d->op1);
+	  switch (d->vmode)
+	    {
+	      case E_V4DImode:
+		emit_insn (gen_lasx_xvpermi_q_v4di (d->target, d->target,
+						    d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V4DFmode:
+		emit_insn (gen_lasx_xvpermi_q_v4df (d->target, d->target,
+						    d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V8SImode:
+		emit_insn (gen_lasx_xvpermi_q_v8si (d->target, d->target,
+						    d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V8SFmode:
+		emit_insn (gen_lasx_xvpermi_q_v8sf (d->target, d->target,
+						    d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V16HImode:
+		emit_insn (gen_lasx_xvpermi_q_v16hi (d->target, d->target,
+						     d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V32QImode:
+		emit_insn (gen_lasx_xvpermi_q_v32qi (d->target, d->target,
+						     d->op0, GEN_INT (0x21)));
+		break;
+	      default:
+		break;
+	    }
+	}
+      ok = true;
+      /* Finish the function directly.  */
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_reversing_permutation (d))
+    {
+      /* Selector sample: E_V8SImode, { 7, 6, 5, 4, 3, 2, 1, 0 }
+	 one_vector_p == true  */
+      idx = d->nelt / 2 - 1;
+      for (i = 0; i < d->nelt / 2; i += 1)
+	{
+	  remapped[i] = idx;
+	  remapped[i + d->nelt / 2] = idx;
+	  idx -= 1;
+	}
+      /* Selector after: { 3, 2, 1, 0, 3, 2, 1, 0 }
+	 Additional insn will be generated to swap hi and lo 128bit of target
+	 register.  */
+      reverse_hi_lo = true;
+    }
+  else if (loongarch_is_di_misalign_extract (d)
+	   || loongarch_is_si_misalign_extract (d))
+    {
+      /* Selector Sample:
+	 DI misalign: E_V4DImode, { 1, 2, 3, 4 }
+	 SI misalign: E_V8SImode, { 1, 2, 3, 4, 5, 6, 7, 8 }  */
+      if (!d->testing_p)
+	{
+	  /* Copy the original op0/op1 values to new temp registers.
+	     In some cases, an operand register may be used in multiple
+	     places, so we need a new register instead of modifying the
+	     original one, to avoid crashes or wrong values after execution.  */
+	  use_alt_op = true;
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+
+	  /* Adjust op1 for selecting correct value in high 128bit of target
+	     register.
+	     op1: E_V4DImode, { 4, 5, 6, 7 } -> { 2, 3, 4, 5 }.  */
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x21)));
+
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      remapped[i] = d->perm[i];
+	      remapped[i + d->nelt / 2] = d->perm[i];
+	    }
+	  /* Selector after:
+	     DI misalign: { 1, 2, 1, 2 }
+	     SI misalign: { 1, 2, 3, 4, 1, 2, 3, 4 }  */
+	}
+    }
+  else if (loongarch_is_lasx_lowpart_interleave (d))
+    {
+      /* Elements from op0's low 128bit and op1's low 128bit are inserted
+	 into the target register alternately.
+	 sample: E_V4DImode, { 0, 4, 1, 5 }  */
+      if (!d->testing_p)
+	{
+	  /* Prepare temp registers instead of modifying the original ops.  */
+	  use_alt_op = true;
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+	  emit_move_insn (op0_alt, d->op0);
+
+	  /* Generate subreg for fitting into insn gen function.  */
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+
+	  /* Adjust op value in temp register.
+	     op0 = {0,1,2,3}, op1 = {4,5,0,1}  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x02)));
+	  /* op0 = {0,1,4,5}, op1 = {4,5,0,1}  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0, conv_op0,
+					      conv_op1, GEN_INT (0x01)));
+
+	  /* Remap indices in selector based on the location of index inside
+	     selector, and vector element numbers in current vector mode.  */
+
+	  /* Filling low 128bit of new selector.  */
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      /* value in odd-indexed slot of low 128bit part of selector
+		 vector.  */
+	      remapped[i] = i % 2 != 0 ? d->perm[i] - d->nelt / 2 : d->perm[i];
+	    }
+	  /* Then filling the high 128bit.  */
+	  for (i = d->nelt / 2; i < d->nelt; i += 1)
+	    {
+	      /* value in even-indexed slot of high 128bit part of
+		 selector vector.  */
+	      remapped[i] = i % 2 == 0
+		? d->perm[i] + (d->nelt / 2) * 3 : d->perm[i];
+	    }
+	}
+    }
+  else if (loongarch_is_lasx_lowpart_interleave_2 (d))
+    {
+      /* Special lowpart interleave case in V32QI vector mode.  It does the
+	 same thing as the branch above.
+	 Selector sample: E_V32QImode,
+	 {0, 1, 2, 3, 4, 5, 6, 7, 32, 33, 34, 35, 36, 37, 38, 39, 8,
+	 9, 10, 11, 12, 13, 14, 15, 40, 41, 42, 43, 44, 45, 46, 47}  */
+      if (!d->testing_p)
+	{
+	  /* The solution for this case is very simple: convert the ops into
+	     V4DImode, and do the same thing as the previous branch.  */
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+	  emit_move_insn (op0_alt, d->op0);
+
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  rtx conv_target = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x02)));
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0, conv_op0,
+					      conv_op1, GEN_INT (0x01)));
+	  remapped[0] = 0;
+	  remapped[1] = 4;
+	  remapped[2] = 1;
+	  remapped[3] = 5;
+
+	  for (i = 0; i < d->nelt; i += 1)
+	    {
+	      rperm[i] = GEN_INT (remapped[i]);
+	    }
+
+	  sel = gen_rtx_CONST_VECTOR (E_V4DImode, gen_rtvec_v (4, rperm));
+	  sel = force_reg (E_V4DImode, sel);
+	  emit_insn (gen_lasx_xvshuf_d (conv_target, sel,
+					conv_op1, conv_op0));
+	}
+
+      ok = true;
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_lasx_lowpart_extract (d))
+    {
+      /* Copy op0's low 128bit to target's low 128bit, and copy op1's low
+	 128bit to target's high 128bit.
+	 Selector sample: E_V4DImode, { 0, 1, 4, 5 }  */
+      if (!d->testing_p)
+	{
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, d->op1, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  rtx conv_target = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+
+	  /* We can achieve the expected result by using a single xvpermi.q
+	     insn.  */
+	  emit_move_insn (conv_target, conv_op1);
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_target, conv_target,
+					      conv_op0, GEN_INT (0x20)));
+	}
+
+      ok = true;
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_lasx_highpart_interleave (d))
+    {
+      /* Similar to lowpart interleave, elements from op0's high 128bit and
+	 op1's high 128bit are inserted into the target register alternately.
+	 Selector sample: E_V8SImode, { 4, 12, 5, 13, 6, 14, 7, 15 }  */
+      if (!d->testing_p)
+	{
+	  /* Prepare temp op register.  */
+	  use_alt_op = true;
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+	  emit_move_insn (op0_alt, d->op0);
+
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  /* Adjust the op values in the temp registers.
+	     op0 = { 0, 1, 2, 3 }, op1 = { 6, 7, 2, 3 }  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x13)));
+	  /* op0 = { 2, 3, 6, 7 }, op1 = { 6, 7, 2, 3 }  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0, conv_op0,
+					      conv_op1, GEN_INT (0x01)));
+	  /* Remap indices in selector based on the location of index inside
+	     selector, and vector element numbers in current vector mode.  */
+
+	  /* Filling low 128bit of new selector.  */
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      /* value in even-indexed slot of low 128bit part of selector
+		 vector.  */
+	      remapped[i] = i % 2 == 0 ? d->perm[i] - d->nelt / 2 : d->perm[i];
+	    }
+	  /* Then filling the high 128bit.  */
+	  for (i = d->nelt / 2; i < d->nelt; i += 1)
+	    {
+	      /* value in odd-indexed slot of high 128bit part of selector
+		 vector.  */
+	      remapped[i] = i % 2 != 0
+		? d->perm[i] - (d->nelt / 2) * 3 : d->perm[i];
+	    }
+	}
+    }
+  else if (loongarch_is_lasx_highpart_interleave_2 (d))
+    {
+      /* Special highpart interleave case in V32QI vector mode.  It does the
+	 same thing as the normal version above.
+	 Selector sample: E_V32QImode,
+	 {16, 17, 18, 19, 20, 21, 22, 23, 48, 49, 50, 51, 52, 53, 54, 55,
+	 24, 25, 26, 27, 28, 29, 30, 31, 56, 57, 58, 59, 60, 61, 62, 63}
+      */
+      if (!d->testing_p)
+	{
+	  /* Convert the ops into V4DImode and do the same as above.  */
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+	  emit_move_insn (op0_alt, d->op0);
+
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  rtx conv_target = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x13)));
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0, conv_op0,
+					      conv_op1, GEN_INT (0x01)));
+	  remapped[0] = 2;
+	  remapped[1] = 6;
+	  remapped[2] = 3;
+	  remapped[3] = 7;
+
+	  for (i = 0; i < d->nelt; i += 1)
+	    {
+	      rperm[i] = GEN_INT (remapped[i]);
+	    }
+
+	  sel = gen_rtx_CONST_VECTOR (E_V4DImode, gen_rtvec_v (4, rperm));
+	  sel = force_reg (E_V4DImode, sel);
+	  emit_insn (gen_lasx_xvshuf_d (conv_target, sel,
+					conv_op1, conv_op0));
+	}
+
+      ok = true;
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_elem_duplicate (d))
+    {
+      /* Broadcast a single element (from op0 or op1) to all slots of the
+	 target register.
+	 Selector sample: E_V8SImode, { 2, 2, 2, 2, 2, 2, 2, 2 }  */
+      if (!d->testing_p)
+	{
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, d->op1, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  rtx temp_reg = gen_reg_rtx (d->vmode);
+	  rtx conv_temp = gen_rtx_SUBREG (E_V4DImode, temp_reg, 0);
+
+	  emit_move_insn (temp_reg, d->op0);
+
+	  idx = d->perm[0];
+	  /* We will use the xvrepl128vei.* insn to achieve the result, but we
+	     need to make the high and low 128bit halves both contain the value
+	     that we need to broadcast, because xvrepl128vei does the broadcast
+	     job from each 128bit half of the source register to the
+	     corresponding half of the target register!  (A deep sigh.)  */
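+	  /* For example, to broadcast element 5 of a V8SImode op0, the
+	     branch below copies op0's high 128bit into both halves of
+	     temp_reg, adjusts idx to 1, and xvrepl128vei.w then replicates
+	     element 1 of each 128bit half, i.e. the original element 5.  */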
+	  if (idx < d->nelt / 2)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (conv_temp, conv_temp,
+						  conv_op0, GEN_INT (0x0)));
+	    }
+	  else if (idx >= d->nelt / 2 && idx < d->nelt)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (conv_temp, conv_temp,
+						  conv_op0, GEN_INT (0x11)));
+	      idx -= d->nelt / 2;
+	    }
+	  else if (idx >= d->nelt && idx < (d->nelt + d->nelt / 2))
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (conv_temp, conv_temp,
+						  conv_op1, GEN_INT (0x0)));
+	    }
+	  else if (idx >= (d->nelt + d->nelt / 2) && idx < d->nelt * 2)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (conv_temp, conv_temp,
+						  conv_op1, GEN_INT (0x11)));
+	      idx -= d->nelt / 2;
+	    }
+
+	  /* Then we can finally generate this insn.  */
+	  switch (d->vmode)
+	    {
+	    case E_V4DImode:
+	      emit_insn (gen_lasx_xvrepl128vei_d (d->target, temp_reg,
+						  GEN_INT (idx)));
+	      break;
+	    case E_V4DFmode:
+	      emit_insn (gen_lasx_xvrepl128vei_d_f (d->target, temp_reg,
+						    GEN_INT (idx)));
+	      break;
+	    case E_V8SImode:
+	      emit_insn (gen_lasx_xvrepl128vei_w (d->target, temp_reg,
+						  GEN_INT (idx)));
+	      break;
+	    case E_V8SFmode:
+	      emit_insn (gen_lasx_xvrepl128vei_w_f (d->target, temp_reg,
+						    GEN_INT (idx)));
+	      break;
+	    case E_V16HImode:
+	      emit_insn (gen_lasx_xvrepl128vei_h (d->target, temp_reg,
+						  GEN_INT (idx)));
+	      break;
+	    case E_V32QImode:
+	      emit_insn (gen_lasx_xvrepl128vei_b (d->target, temp_reg,
+						  GEN_INT (idx)));
+	      break;
+	    default:
+	      gcc_unreachable ();
+	      break;
+	    }
+
+	  /* Finish the function directly.  */
+	  ok = true;
+	  goto expand_perm_const_2_end;
+	}
+    }
+  else if (loongarch_is_op_reverse_perm (d))
+    {
+      /* Reverse the high 128bit and the low 128bit of op0.
+	 Selector sample: E_V4DFmode, { 2, 3, 0, 1 }
+	 Use xvpermi.q to do this job.  */
+      if (!d->testing_p)
+	{
+	  if (d->vmode == E_V4DImode)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (d->target, d->target, d->op0,
+						  GEN_INT (0x01)));
+	    }
+	  else if (d->vmode == E_V4DFmode)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4df (d->target, d->target, d->op0,
+						  GEN_INT (0x01)));
+	    }
+	  else
+	    {
+	      gcc_unreachable ();
+	    }
+	}
+
+      ok = true;
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_single_op_perm (d))
+    {
+      /* Permutation that only selects elements from op0.  */
+      if (!d->testing_p)
+	{
+	  /* Prepare temp registers instead of modifying the original ops.  */
+	  use_alt_op = true;
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  op1_alt = gen_reg_rtx (d->vmode);
+
+	  emit_move_insn (op0_alt, d->op0);
+	  emit_move_insn (op1_alt, d->op1);
+
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  rtx conv_op0a = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  rtx conv_op1a = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+
+	  /* Duplicate op0's low 128bit into both halves of op0_alt, then
+	     duplicate op0's high 128bit into both halves of op1_alt.  After
+	     this, the xvshuf.* insn's selector argument can access all
+	     elements we need for a correct permutation result.  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0a, conv_op0a, conv_op0,
+					      GEN_INT (0x00)));
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1a, conv_op1a, conv_op0,
+					      GEN_INT (0x11)));
+
+	  /* In this case, there's no need to remap selector's indices.  */
+	  for (i = 0; i < d->nelt; i += 1)
+	    {
+	      remapped[i] = d->perm[i];
+	    }
+	}
+    }
+  else if (loongarch_is_divisible_perm (d))
+    {
+      /* Divisible perm:
+	 Low 128bit of selector only selects elements of op0,
+	 and high 128bit of selector only selects elements of op1.  */
+
+      if (!d->testing_p)
+	{
+	  /* Prepare temp registers instead of modifying the original ops.  */
+	  use_alt_op = true;
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  op1_alt = gen_reg_rtx (d->vmode);
+
+	  emit_move_insn (op0_alt, d->op0);
+	  emit_move_insn (op1_alt, d->op1);
+
+	  rtx conv_op0a = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  rtx conv_op1a = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, d->op1, 0);
+
+	  /* Reorganize op0's hi/lo 128bit and op1's hi/lo 128bit, to make sure
+	     that selector's low 128bit can access all op0's elements, and
+	     selector's high 128bit can access all op1's elements.  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0a, conv_op0a, conv_op1,
+					      GEN_INT (0x02)));
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1a, conv_op1a, conv_op0,
+					      GEN_INT (0x31)));
+
+	  /* No need to modify indices.  */
+	  for (i = 0; i < d->nelt; i += 1)
+	    {
+	      remapped[i] = d->perm[i];
+	    }
+	}
+    }
+  else if (loongarch_is_triple_stride_extract (d))
+    {
+      /* Selector sample: E_V4DFmode, { 1, 4, 7, 0 }.  */
+      if (!d->testing_p)
+	{
+	  /* Resolve it with brute force modification.  */
+	  remapped[0] = 1;
+	  remapped[1] = 2;
+	  remapped[2] = 3;
+	  remapped[3] = 0;
+	}
+    }
+  else
+    {
+      /* When all of the detections above have failed, we try one last
+	 strategy.
+	 The for loop below checks the following rules based on each index's
+	 value, its position inside the selector vector, and the strange
+	 behavior of the xvshuf.* insn; then we take the corresponding
+	 action (replace the index with a new value, or give up the whole
+	 permutation expansion).  */
+      for (i = 0; i < d->nelt; i += 1)
+	{
+	  /* The index is already reduced modulo (2 * d->nelt).  */
+	  idx = d->perm[i];
+
+	  /* If the index is in the low 128bit of the selector vector.  */
+	  if (i < d->nelt / 2)
+	    {
+	      /* Fail case 1: the index tries to reach an element located in
+		 op0's high 128bit.  */
+	      if (idx >= d->nelt / 2 && idx < d->nelt)
+		{
+		  goto expand_perm_const_2_end;
+		}
+	      /* Fail case 2: the index tries to reach an element located in
+		 op1's high 128bit.  */
+	      if (idx >= (d->nelt + d->nelt / 2))
+		{
+		  goto expand_perm_const_2_end;
+		}
+
+	      /* Success case: the index reaches an element located in op1's
+		 low 128bit.  Apply a - (nelt / 2) offset to the original
+		 value.  */
+	      if (idx >= d->nelt && idx < (d->nelt + d->nelt / 2))
+		{
+		  idx -= d->nelt / 2;
+		}
+	    }
+	  /* If the index is in the high 128bit of the selector vector.  */
+	  else
+	    {
+	      /* Fail case 1: the index tries to reach an element located in
+		 op1's low 128bit.  */
+	      if (idx >= d->nelt && idx < (d->nelt + d->nelt / 2))
+		{
+		  goto expand_perm_const_2_end;
+		}
+	      /* Fail case 2: the index tries to reach an element located in
+		 op0's low 128bit.  */
+	      if (idx < (d->nelt / 2))
+		{
+		  goto expand_perm_const_2_end;
+		}
+	      /* Success case: the index reaches an element located in op0's
+		 high 128bit.  */
+	      if (idx >= d->nelt / 2 && idx < d->nelt)
+		{
+		  idx -= d->nelt / 2;
+		}
+	    }
+	  /* No need to process other cases not mentioned above.  */
+
+	  /* Assign the original or processed value.  */
+	  remapped[i] = idx;
+	}
+    }
+
+  ok = true;
+  /* If testing_p is true, the compiler is trying to figure out whether the
+     backend can handle this permutation, but does not want to generate an
+     actual insn.  So if it is true, exit directly.  */
+  if (d->testing_p)
+    {
+      goto expand_perm_const_2_end;
+    }
+
+  /* Convert remapped selector array to RTL array.  */
+  for (i = 0; i < d->nelt; i += 1)
+    {
+      rperm[i] = GEN_INT (remapped[i]);
+    }
+
+  /* Copy the selector vector from memory to a vector register for the later
+     insn gen function.
+     If the vector's elements are floating-point values, we cannot pass the
+     selector argument to the insn gen function directly, because of the
+     insn template definition.  As a solution, generate an integral mode
+     subreg of the target, then copy the selector vector (which is in an
+     integral mode) to this subreg.  */
+  switch (d->vmode)
+    {
+    case E_V4DFmode:
+      sel = gen_rtx_CONST_VECTOR (E_V4DImode, gen_rtvec_v (d->nelt, rperm));
+      tmp = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+      emit_move_insn (tmp, sel);
+      break;
+    case E_V8SFmode:
+      sel = gen_rtx_CONST_VECTOR (E_V8SImode, gen_rtvec_v (d->nelt, rperm));
+      tmp = gen_rtx_SUBREG (E_V8SImode, d->target, 0);
+      emit_move_insn (tmp, sel);
+      break;
+    default:
+      sel = gen_rtx_CONST_VECTOR (d->vmode, gen_rtvec_v (d->nelt, rperm));
+      emit_move_insn (d->target, sel);
+      break;
+    }
+
+  target = d->target;
+  /* If temp op registers were requested in a previous branch, then use the
+     temp registers instead of the original ones.  */
+  if (use_alt_op)
+    {
+      op0 = op0_alt != NULL_RTX ? op0_alt : d->op0;
+      op1 = op1_alt != NULL_RTX ? op1_alt : d->op1;
+    }
+  else
+    {
+      op0 = d->op0;
+      op1 = d->one_vector_p ? d->op0 : d->op1;
+    }
+
+  /* We FINALLY can generate xvshuf.* insn.  */
+  switch (d->vmode)
+    {
+    case E_V4DFmode:
+      emit_insn (gen_lasx_xvshuf_d_f (target, target, op1, op0));
+      break;
+    case E_V4DImode:
+      emit_insn (gen_lasx_xvshuf_d (target, target, op1, op0));
+      break;
+    case E_V8SFmode:
+      emit_insn (gen_lasx_xvshuf_w_f (target, target, op1, op0));
+      break;
+    case E_V8SImode:
+      emit_insn (gen_lasx_xvshuf_w (target, target, op1, op0));
+      break;
+    case E_V16HImode:
+      emit_insn (gen_lasx_xvshuf_h (target, target, op1, op0));
+      break;
+    case E_V32QImode:
+      emit_insn (gen_lasx_xvshuf_b (target, op1, op0, target));
+      break;
+    default:
+      gcc_unreachable ();
+      break;
+    }
+
+  /* Extra insn for swapping the hi/lo 128bit of target vector register.  */
+  if (reverse_hi_lo)
+    {
+      switch (d->vmode)
+	{
+	case E_V4DFmode:
+	  emit_insn (gen_lasx_xvpermi_q_v4df (d->target, d->target,
+					      d->target, GEN_INT (0x1)));
+	  break;
+	case E_V4DImode:
+	  emit_insn (gen_lasx_xvpermi_q_v4di (d->target, d->target,
+					      d->target, GEN_INT (0x1)));
+	  break;
+	case E_V8SFmode:
+	  emit_insn (gen_lasx_xvpermi_q_v8sf (d->target, d->target,
+					      d->target, GEN_INT (0x1)));
+	  break;
+	case E_V8SImode:
+	  emit_insn (gen_lasx_xvpermi_q_v8si (d->target, d->target,
+					      d->target, GEN_INT (0x1)));
+	  break;
+	case E_V16HImode:
+	  emit_insn (gen_lasx_xvpermi_q_v16hi (d->target, d->target,
+					       d->target, GEN_INT (0x1)));
+	  break;
+	case E_V32QImode:
+	  emit_insn (gen_lasx_xvpermi_q_v32qi (d->target, d->target,
+					       d->target, GEN_INT (0x1)));
+	  break;
+	default:
+	  break;
+	}
+    }
+  /* Extra insn required by odd/even extraction.  Swap the second and third
+     64bit chunks of the target vector register.  */
+  else if (extract_ev_od)
+    {
+      rtx converted = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+      emit_insn (gen_lasx_xvpermi_d_v4di (converted, converted,
+					  GEN_INT (0xD8)));
+    }
+
+expand_perm_const_2_end:
+  return ok;
 }
 
 /* Implement TARGET_VECTORIZE_VEC_PERM_CONST.  */
@@ -7805,6 +9720,12 @@ loongarch_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
       break;
     }
 
+  /* Reduce each selector index modulo (2 * nelt) to avoid vshuf's undefined
+     behavior on out-of-range indices.  */
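+  /* E.g. for V4DImode (d.nelt == 4) an index of 11 becomes 11 % 8 == 3.  */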
+  for (i = 0; i < d.nelt; i += 1)
+    {
+      d.perm[i] %= (d.nelt * 2);
+    }
+
   if (d.testing_p)
     {
       d.target = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 1);
@@ -7857,7 +9778,7 @@ loongarch_cpu_sched_reassociation_width (struct loongarch_target *target,
     case CPU_LOONGARCH64:
     case CPU_LA464:
       /* Vector part.  */
-      if (LSX_SUPPORTED_MODE_P (mode))
+      if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode))
 	{
 	  /* Integer vector instructions execute in FP unit.
 	     The width of integer/float-point vector instructions is 3.  */
@@ -7907,6 +9828,44 @@ loongarch_expand_vector_extract (rtx target, rtx vec, int elt)
     case E_V16QImode:
       break;
 
+    case E_V32QImode:
+      if (ISA_HAS_LASX)
+	{
+	  if (elt >= 16)
+	    {
+	      tmp = gen_reg_rtx (V32QImode);
+	      emit_insn (gen_lasx_xvpermi_d_v32qi (tmp, vec, GEN_INT (0xe)));
+	      loongarch_expand_vector_extract (target,
+					       gen_lowpart (V16QImode, tmp),
+					       elt & 15);
+	    }
+	  else
+	    loongarch_expand_vector_extract (target,
+					     gen_lowpart (V16QImode, vec),
+					     elt & 15);
+	  return;
+	}
+      break;
+
+    case E_V16HImode:
+      if (ISA_HAS_LASX)
+	{
+	  if (elt >= 8)
+	    {
+	      tmp = gen_reg_rtx (V16HImode);
+	      emit_insn (gen_lasx_xvpermi_d_v16hi (tmp, vec, GEN_INT (0xe)));
+	      loongarch_expand_vector_extract (target,
+					       gen_lowpart (V8HImode, tmp),
+					       elt & 7);
+	    }
+	  else
+	    loongarch_expand_vector_extract (target,
+					     gen_lowpart (V8HImode, vec),
+					     elt & 7);
+	  return;
+	}
+      break;
+
     default:
       break;
     }
@@ -7945,6 +9904,31 @@ emit_reduc_half (rtx dest, rtx src, int i)
     case E_V2DFmode:
       tem = gen_lsx_vbsrl_d_f (dest, src, GEN_INT (8));
       break;
+    case E_V8SFmode:
+      if (i == 256)
+	tem = gen_lasx_xvpermi_d_v8sf (dest, src, GEN_INT (0xe));
+      else
+	tem = gen_lasx_xvshuf4i_w_f (dest, src,
+				     GEN_INT (i == 128 ? 2 + (3 << 2) : 1));
+      break;
+    case E_V4DFmode:
+      if (i == 256)
+	tem = gen_lasx_xvpermi_d_v4df (dest, src, GEN_INT (0xe));
+      else
+	tem = gen_lasx_xvpermi_d_v4df (dest, src, const1_rtx);
+      break;
+    case E_V32QImode:
+    case E_V16HImode:
+    case E_V8SImode:
+    case E_V4DImode:
+      d = gen_reg_rtx (V4DImode);
+      if (i == 256)
+	tem = gen_lasx_xvpermi_d_v4di (d, gen_lowpart (V4DImode, src),
+				       GEN_INT (0xe));
+      else
+	tem = gen_lasx_xvbsrl_d (d, gen_lowpart (V4DImode, src),
+				 GEN_INT (i/16));
+      break;
     case E_V16QImode:
     case E_V8HImode:
     case E_V4SImode:
@@ -7992,10 +9976,57 @@ loongarch_expand_vec_unpack (rtx operands[2], bool unsigned_p, bool high_p)
 {
   machine_mode imode = GET_MODE (operands[1]);
   rtx (*unpack) (rtx, rtx, rtx);
+  rtx (*extend) (rtx, rtx);
   rtx (*cmpFunc) (rtx, rtx, rtx);
+  rtx (*swap_hi_lo) (rtx, rtx, rtx, rtx);
   rtx tmp, dest;
 
-  if (ISA_HAS_LSX)
+  if (ISA_HAS_LASX && GET_MODE_SIZE (imode) == 32)
+    {
+      switch (imode)
+	{
+	case E_V8SImode:
+	  if (unsigned_p)
+	    extend = gen_lasx_vext2xv_du_wu;
+	  else
+	    extend = gen_lasx_vext2xv_d_w;
+	  swap_hi_lo = gen_lasx_xvpermi_q_v8si;
+	  break;
+
+	case E_V16HImode:
+	  if (unsigned_p)
+	    extend = gen_lasx_vext2xv_wu_hu;
+	  else
+	    extend = gen_lasx_vext2xv_w_h;
+	  swap_hi_lo = gen_lasx_xvpermi_q_v16hi;
+	  break;
+
+	case E_V32QImode:
+	  if (unsigned_p)
+	    extend = gen_lasx_vext2xv_hu_bu;
+	  else
+	    extend = gen_lasx_vext2xv_h_b;
+	  swap_hi_lo = gen_lasx_xvpermi_q_v32qi;
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	  break;
+	}
+
+      if (high_p)
+	{
+	  tmp = gen_reg_rtx (imode);
+	  emit_insn (swap_hi_lo (tmp, tmp, operands[1], const1_rtx));
+	  emit_insn (extend (operands[0], tmp));
+	  return;
+	}
+
+      emit_insn (extend (operands[0], operands[1]));
+      return;
+    }
+  else if (ISA_HAS_LSX)
     {
       switch (imode)
 	{
@@ -8096,8 +10127,17 @@ loongarch_gen_const_int_vector_shuffle (machine_mode mode, int val)
   return gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (nunits, elts));
 }
 
 /* Expand a vector initialization.  */
 
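+/* Expand a 256bit vector group initialization: concatenate the two 128bit
+   vectors given in VALS into the 256bit TARGET.  */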
+void
+loongarch_expand_vector_group_init (rtx target, rtx vals)
+{
+  rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
+  emit_insn (gen_rtx_SET (target, gen_rtx_VEC_CONCAT (E_V32QImode, ops[0],
+						      ops[1])));
+}
+
 void
 loongarch_expand_vector_init (rtx target, rtx vals)
 {
@@ -8117,6 +10157,285 @@ loongarch_expand_vector_init (rtx target, rtx vals)
 	all_same = false;
     }
 
+  if (ISA_HAS_LASX && GET_MODE_SIZE (vmode) == 32)
+    {
+      if (all_same)
+	{
+	  rtx same = XVECEXP (vals, 0, 0);
+	  rtx temp, temp2;
+
+	  if (CONST_INT_P (same) && nvar == 0
+	      && loongarch_signed_immediate_p (INTVAL (same), 10, 0))
+	    {
+	      switch (vmode)
+		{
+		case E_V32QImode:
+		case E_V16HImode:
+		case E_V8SImode:
+		case E_V4DImode:
+		  temp = gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0));
+		  emit_move_insn (target, temp);
+		  return;
+
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+
+	  temp = gen_reg_rtx (imode);
+	  if (imode == GET_MODE (same))
+	    temp2 = same;
+	  else if (GET_MODE_SIZE (imode) >= UNITS_PER_WORD)
+	    {
+	      if (GET_CODE (same) == MEM)
+		{
+		  rtx reg_tmp = gen_reg_rtx (GET_MODE (same));
+		  loongarch_emit_move (reg_tmp, same);
+		  temp2 = simplify_gen_subreg (imode, reg_tmp,
+					       GET_MODE (reg_tmp), 0);
+		}
+	      else
+		temp2 = simplify_gen_subreg (imode, same,
+					     GET_MODE (same), 0);
+	    }
+	  else
+	    {
+	      if (GET_CODE (same) == MEM)
+		{
+		  rtx reg_tmp = gen_reg_rtx (GET_MODE (same));
+		  loongarch_emit_move (reg_tmp, same);
+		  temp2 = lowpart_subreg (imode, reg_tmp,
+					  GET_MODE (reg_tmp));
+		}
+	      else
+		temp2 = lowpart_subreg (imode, same, GET_MODE (same));
+	    }
+	  emit_move_insn (temp, temp2);
+
+	  switch (vmode)
+	    {
+	    case E_V32QImode:
+	    case E_V16HImode:
+	    case E_V8SImode:
+	    case E_V4DImode:
+	      loongarch_emit_move (target,
+				   gen_rtx_VEC_DUPLICATE (vmode, temp));
+	      break;
+
+	    case E_V8SFmode:
+	      emit_insn (gen_lasx_xvreplve0_w_f_scalar (target, temp));
+	      break;
+
+	    case E_V4DFmode:
+	      emit_insn (gen_lasx_xvreplve0_d_f_scalar (target, temp));
+	      break;
+
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+      else
+	{
+	  rtvec vec = shallow_copy_rtvec (XVEC (vals, 0));
+
+	  for (i = 0; i < nelt; ++i)
+	    RTVEC_ELT (vec, i) = CONST0_RTX (imode);
+
+	  emit_move_insn (target, gen_rtx_CONST_VECTOR (vmode, vec));
+
+	  machine_mode half_mode = VOIDmode;
+	  rtx target_hi, target_lo;
+
+	  switch (vmode)
+	    {
+	    case E_V32QImode:
+	      half_mode = E_V16QImode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_b_scalar (target_hi,
+							    temp_hi));
+		      emit_insn (gen_lsx_vreplvei_b_scalar (target_lo,
+							    temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv16qi (target_hi, temp_hi,
+						   GEN_INT (i)));
+		      emit_insn (gen_vec_setv16qi (target_lo, temp_lo,
+						   GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V16HImode:
+	      half_mode = E_V8HImode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_h_scalar (target_hi,
+							    temp_hi));
+		      emit_insn (gen_lsx_vreplvei_h_scalar (target_lo,
+							    temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv8hi (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv8hi (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V8SImode:
+	      half_mode = E_V4SImode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_w_scalar (target_hi,
+							    temp_hi));
+		      emit_insn (gen_lsx_vreplvei_w_scalar (target_lo,
+							    temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv4si (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv4si (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V4DImode:
+	      half_mode = E_V2DImode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_d_scalar (target_hi,
+							    temp_hi));
+		      emit_insn (gen_lsx_vreplvei_d_scalar (target_lo,
+							    temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv2di (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv2di (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V8SFmode:
+	      half_mode = E_V4SFmode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_w_f_scalar (target_hi,
+							      temp_hi));
+		      emit_insn (gen_lsx_vreplvei_w_f_scalar (target_lo,
+							      temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv4sf (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv4sf (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V4DFmode:
+	      half_mode = E_V2DFmode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_d_f_scalar (target_hi,
+							      temp_hi));
+		      emit_insn (gen_lsx_vreplvei_d_f_scalar (target_lo,
+							      temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv2df (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv2df (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    default:
+	      gcc_unreachable ();
+	    }
+
+	}
+      return;
+    }
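+
+  /* Illustrative note (a sketch of expected use, not part of the expansion
+     logic): with -mlasx, a 256-bit initializer whose elements are all the
+     same non-constant value, e.g. the hypothetical
+
+	typedef int v8si __attribute__ ((vector_size (32)));
+	v8si f (int a) { return (v8si) {a, a, a, a, a, a, a, a}; }
+
+     is expected to take the all_same path above and be expanded to a single
+     broadcast instead of eight element insertions.  */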
+
   if (ISA_HAS_LSX)
     {
       if (all_same)
@@ -8364,6 +10683,38 @@ loongarch_expand_lsx_cmp (rtx dest, enum rtx_code cond, rtx op0, rtx op1)
 	}
       break;
 
+    case E_V8SFmode:
+    case E_V4DFmode:
+      switch (cond)
+	{
+	case UNORDERED:
+	case ORDERED:
+	case EQ:
+	case NE:
+	case UNEQ:
+	case UNLE:
+	case UNLT:
+	  break;
+	case LTGT: cond = NE; break;
+	case UNGE: cond = UNLE; std::swap (op0, op1); break;
+	case UNGT: cond = UNLT; std::swap (op0, op1); break;
+	case LE: unspec = UNSPEC_LASX_XVFCMP_SLE; break;
+	case LT: unspec = UNSPEC_LASX_XVFCMP_SLT; break;
+	case GE: unspec = UNSPEC_LASX_XVFCMP_SLE; std::swap (op0, op1); break;
+	case GT: unspec = UNSPEC_LASX_XVFCMP_SLT; std::swap (op0, op1); break;
+	default:
+		 gcc_unreachable ();
+	}
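+      /* For example, GE is handled above by swapping the operands and
+	 selecting UNSPEC_LASX_XVFCMP_SLE, so a >= b is emitted as the
+	 signaling compare b <= a (xvfcmp.sle).  */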
+      if (unspec < 0)
+	loongarch_emit_binary (cond, dest, op0, op1);
+      else
+	{
+	  rtx x = gen_rtx_UNSPEC (GET_MODE (dest),
+				  gen_rtvec (2, op0, op1), unspec);
+	  emit_insn (gen_rtx_SET (dest, x));
+	}
+      break;
+
     default:
       gcc_unreachable ();
       break;
@@ -8701,7 +11052,7 @@ loongarch_builtin_support_vector_misalignment (machine_mode mode,
 					       int misalignment,
 					       bool is_packed)
 {
-  if (ISA_HAS_LSX && STRICT_ALIGNMENT)
+  if ((ISA_HAS_LSX || ISA_HAS_LASX) && STRICT_ALIGNMENT)
     {
       if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing)
 	return false;
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index e939dd826d1..39852d2bb12 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -186,6 +186,11 @@ along with GCC; see the file COPYING3.  If not see
 /* Width of a LSX vector register in bits.  */
 #define BITS_PER_LSX_REG (UNITS_PER_LSX_REG * BITS_PER_UNIT)
 
+/* Width of a LASX vector register in bytes.  */
+#define UNITS_PER_LASX_REG 32
+/* Width of a LASX vector register in bits.  */
+#define BITS_PER_LASX_REG (UNITS_PER_LASX_REG * BITS_PER_UNIT)
+
 /* For LARCH, width of a floating point register.  */
 #define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4)
 
@@ -248,10 +253,11 @@ along with GCC; see the file COPYING3.  If not see
 #define STRUCTURE_SIZE_BOUNDARY 8
 
 /* There is no point aligning anything to a rounder boundary than
-   LONG_DOUBLE_TYPE_SIZE, unless under LSX the bigggest alignment is
-   BITS_PER_LSX_REG/..  */
+   LONG_DOUBLE_TYPE_SIZE, unless LSX/LASX is enabled, in which case the
+   biggest alignment is BITS_PER_LSX_REG/BITS_PER_LASX_REG.  */
 #define BIGGEST_ALIGNMENT \
-  (ISA_HAS_LSX ? BITS_PER_LSX_REG : LONG_DOUBLE_TYPE_SIZE)
+  (ISA_HAS_LASX ? BITS_PER_LASX_REG \
+   : (ISA_HAS_LSX ? BITS_PER_LSX_REG : LONG_DOUBLE_TYPE_SIZE))
 
 /* All accesses must be aligned.  */
 #define STRICT_ALIGNMENT (TARGET_STRICT_ALIGN)
@@ -391,6 +397,10 @@ along with GCC; see the file COPYING3.  If not see
 #define LSX_REG_LAST  FP_REG_LAST
 #define LSX_REG_NUM   FP_REG_NUM
 
+#define LASX_REG_FIRST FP_REG_FIRST
+#define LASX_REG_LAST  FP_REG_LAST
+#define LASX_REG_NUM   FP_REG_NUM
+
 /* The DWARF 2 CFA column which tracks the return address from a
    signal handler context.  This means that to maintain backwards
    compatibility, no hard register can be assigned this column if it
@@ -409,9 +419,12 @@ along with GCC; see the file COPYING3.  If not see
   ((unsigned int) ((int) (REGNO) - FCC_REG_FIRST) < FCC_REG_NUM)
 #define LSX_REG_P(REGNO) \
   ((unsigned int) ((int) (REGNO) - LSX_REG_FIRST) < LSX_REG_NUM)
+#define LASX_REG_P(REGNO) \
+  ((unsigned int) ((int) (REGNO) - LASX_REG_FIRST) < LASX_REG_NUM)
 
 #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
 #define LSX_REG_RTX_P(X) (REG_P (X) && LSX_REG_P (REGNO (X)))
+#define LASX_REG_RTX_P(X) (REG_P (X) && LASX_REG_P (REGNO (X)))
 
 /* Select a register mode required for caller save of hard regno REGNO.  */
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
@@ -733,6 +746,13 @@ enum reg_class
    && (GET_MODE_CLASS (MODE) == MODE_VECTOR_INT		\
        || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT))
 
+#define LASX_SUPPORTED_MODE_P(MODE)			\
+  (ISA_HAS_LASX						\
+   && (GET_MODE_SIZE (MODE) == UNITS_PER_LSX_REG	\
+       || GET_MODE_SIZE (MODE) == UNITS_PER_LASX_REG)	\
+   && (GET_MODE_CLASS (MODE) == MODE_VECTOR_INT		\
+       || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT))
+
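+/* For example, with -mlasx both the 256-bit modes (V8SImode, V4DFmode, ...)
+   and the 128-bit LSX modes (V4SImode, V2DFmode, ...) satisfy
+   LASX_SUPPORTED_MODE_P above, i.e. enabling LASX keeps the 128-bit vector
+   modes available as well.  */
+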
 /* 1 if N is a possible register number for function argument passing.
    We have no FP argument registers when soft-float.  */
 
@@ -985,7 +1005,39 @@ typedef struct {
   { "vr28",	28 + FP_REG_FIRST },					\
   { "vr29",	29 + FP_REG_FIRST },					\
   { "vr30",	30 + FP_REG_FIRST },					\
-  { "vr31",	31 + FP_REG_FIRST }					\
+  { "vr31",	31 + FP_REG_FIRST },					\
+  { "xr0",	 0 + FP_REG_FIRST },					\
+  { "xr1",	 1 + FP_REG_FIRST },					\
+  { "xr2",	 2 + FP_REG_FIRST },					\
+  { "xr3",	 3 + FP_REG_FIRST },					\
+  { "xr4",	 4 + FP_REG_FIRST },					\
+  { "xr5",	 5 + FP_REG_FIRST },					\
+  { "xr6",	 6 + FP_REG_FIRST },					\
+  { "xr7",	 7 + FP_REG_FIRST },					\
+  { "xr8",	 8 + FP_REG_FIRST },					\
+  { "xr9",	 9 + FP_REG_FIRST },					\
+  { "xr10",	10 + FP_REG_FIRST },					\
+  { "xr11",	11 + FP_REG_FIRST },					\
+  { "xr12",	12 + FP_REG_FIRST },					\
+  { "xr13",	13 + FP_REG_FIRST },					\
+  { "xr14",	14 + FP_REG_FIRST },					\
+  { "xr15",	15 + FP_REG_FIRST },					\
+  { "xr16",	16 + FP_REG_FIRST },					\
+  { "xr17",	17 + FP_REG_FIRST },					\
+  { "xr18",	18 + FP_REG_FIRST },					\
+  { "xr19",	19 + FP_REG_FIRST },					\
+  { "xr20",	20 + FP_REG_FIRST },					\
+  { "xr21",	21 + FP_REG_FIRST },					\
+  { "xr22",	22 + FP_REG_FIRST },					\
+  { "xr23",	23 + FP_REG_FIRST },					\
+  { "xr24",	24 + FP_REG_FIRST },					\
+  { "xr25",	25 + FP_REG_FIRST },					\
+  { "xr26",	26 + FP_REG_FIRST },					\
+  { "xr27",	27 + FP_REG_FIRST },					\
+  { "xr28",	28 + FP_REG_FIRST },					\
+  { "xr29",	29 + FP_REG_FIRST },					\
+  { "xr30",	30 + FP_REG_FIRST },					\
+  { "xr31",	31 + FP_REG_FIRST }					\
 }
 
 /* Globalizing directive for a label.  */
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 7b8978e2533..30b2cb91e9a 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -163,7 +163,7 @@ (define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor,simd_add"
 
 ;; Main data type used by the insn
 (define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC,
-  V2DI,V4SI,V8HI,V16QI,V2DF,V4SF"
+  V2DI,V4SI,V8HI,V16QI,V2DF,V4SF,V4DI,V8SI,V16HI,V32QI,V4DF,V8SF"
   (const_string "unknown"))
 
 ;; True if the main data type is twice the size of a word.
@@ -422,12 +422,14 @@ (define_mode_attr ifmt [(SI "w") (DI "l")])
 ;; floating-point mode or vector mode.
 (define_mode_attr UNITMODE [(SF "SF") (DF "DF") (V2SF "SF") (V4SF "SF")
 			    (V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI")
-			    (V2DF "DF")])
+			    (V2DF "DF") (V8SF "SF") (V32QI "QI") (V16HI "HI")
+			    (V8SI "SI") (V4DI "DI") (V4DF "DF")])
 
 ;; As above, but in lower case.
 (define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf")
 			    (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi")
-			    (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")])
+			    (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")
+			    (V8SI "si") (V4DI "di") (V32QI "qi") (V16HI "hi")
+			    (V8SF "sf") (V4DF "df")])
 
 ;; This attribute gives the integer mode that has half the size of
 ;; the controlling mode.
@@ -711,16 +713,17 @@ (define_insn "sub<mode>3"
   [(set_attr "alu_type" "sub")
    (set_attr "mode" "<MODE>")])
 
+
 (define_insn "*subsi3_extended"
-  [(set (match_operand:DI 0 "register_operand" "= r")
+  [(set (match_operand:DI 0 "register_operand" "=r")
 	(sign_extend:DI
-	    (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
-		      (match_operand:SI 2 "register_operand" "  r"))))]
+	    (minus:SI (match_operand:SI 1 "reg_or_0_operand" "rJ")
+		      (match_operand:SI 2 "register_operand" "r"))))]
   "TARGET_64BIT"
   "sub.w\t%0,%z1,%2"
   [(set_attr "type" "arith")
    (set_attr "mode" "SI")])
-\f
+
 ;;
 ;;  ....................
 ;;
@@ -3634,6 +3637,9 @@ (define_insn "loongarch_crcc_w_<size>_w"
 ; The LoongArch SX Instructions.
 (include "lsx.md")
 
+; The LoongArch ASX Instructions.
+(include "lasx.md")
+
 (define_c_enum "unspec" [
   UNSPEC_ADDRESS_FIRST
 ])
-- 
2.36.0


* [PATCH v6 4/4] LoongArch: Add Loongson ASX directive builtin function support.
  2023-08-31  9:08 [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
                   ` (2 preceding siblings ...)
  2023-08-31  9:08 ` [PATCH v6 3/4] LoongArch: Add Loongson ASX base instruction support Chenghui Pan
@ 2023-08-31  9:08 ` Chenghui Pan
  2023-08-31  9:41 ` [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Xi Ruoyao
  2023-09-05 11:34 ` chenglulu
  5 siblings, 0 replies; 8+ messages in thread
From: Chenghui Pan @ 2023-08-31  9:08 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config.gcc: Export the header file lasxintrin.h.
	* config/loongarch/loongarch-builtins.cc (enum loongarch_builtin_type):
	Add Loongson ASX builtin functions support.
	(AVAIL_ALL): Ditto.
	(LASX_BUILTIN): Ditto.
	(LASX_NO_TARGET_BUILTIN): Ditto.
	(LASX_BUILTIN_TEST_BRANCH): Ditto.
	(CODE_FOR_lasx_xvsadd_b): Ditto.
	(CODE_FOR_lasx_xvsadd_h): Ditto.
	(CODE_FOR_lasx_xvsadd_w): Ditto.
	(CODE_FOR_lasx_xvsadd_d): Ditto.
	(CODE_FOR_lasx_xvsadd_bu): Ditto.
	(CODE_FOR_lasx_xvsadd_hu): Ditto.
	(CODE_FOR_lasx_xvsadd_wu): Ditto.
	(CODE_FOR_lasx_xvsadd_du): Ditto.
	(CODE_FOR_lasx_xvadd_b): Ditto.
	(CODE_FOR_lasx_xvadd_h): Ditto.
	(CODE_FOR_lasx_xvadd_w): Ditto.
	(CODE_FOR_lasx_xvadd_d): Ditto.
	(CODE_FOR_lasx_xvaddi_bu): Ditto.
	(CODE_FOR_lasx_xvaddi_hu): Ditto.
	(CODE_FOR_lasx_xvaddi_wu): Ditto.
	(CODE_FOR_lasx_xvaddi_du): Ditto.
	(CODE_FOR_lasx_xvand_v): Ditto.
	(CODE_FOR_lasx_xvandi_b): Ditto.
	(CODE_FOR_lasx_xvbitsel_v): Ditto.
	(CODE_FOR_lasx_xvseqi_b): Ditto.
	(CODE_FOR_lasx_xvseqi_h): Ditto.
	(CODE_FOR_lasx_xvseqi_w): Ditto.
	(CODE_FOR_lasx_xvseqi_d): Ditto.
	(CODE_FOR_lasx_xvslti_b): Ditto.
	(CODE_FOR_lasx_xvslti_h): Ditto.
	(CODE_FOR_lasx_xvslti_w): Ditto.
	(CODE_FOR_lasx_xvslti_d): Ditto.
	(CODE_FOR_lasx_xvslti_bu): Ditto.
	(CODE_FOR_lasx_xvslti_hu): Ditto.
	(CODE_FOR_lasx_xvslti_wu): Ditto.
	(CODE_FOR_lasx_xvslti_du): Ditto.
	(CODE_FOR_lasx_xvslei_b): Ditto.
	(CODE_FOR_lasx_xvslei_h): Ditto.
	(CODE_FOR_lasx_xvslei_w): Ditto.
	(CODE_FOR_lasx_xvslei_d): Ditto.
	(CODE_FOR_lasx_xvslei_bu): Ditto.
	(CODE_FOR_lasx_xvslei_hu): Ditto.
	(CODE_FOR_lasx_xvslei_wu): Ditto.
	(CODE_FOR_lasx_xvslei_du): Ditto.
	(CODE_FOR_lasx_xvdiv_b): Ditto.
	(CODE_FOR_lasx_xvdiv_h): Ditto.
	(CODE_FOR_lasx_xvdiv_w): Ditto.
	(CODE_FOR_lasx_xvdiv_d): Ditto.
	(CODE_FOR_lasx_xvdiv_bu): Ditto.
	(CODE_FOR_lasx_xvdiv_hu): Ditto.
	(CODE_FOR_lasx_xvdiv_wu): Ditto.
	(CODE_FOR_lasx_xvdiv_du): Ditto.
	(CODE_FOR_lasx_xvfadd_s): Ditto.
	(CODE_FOR_lasx_xvfadd_d): Ditto.
	(CODE_FOR_lasx_xvftintrz_w_s): Ditto.
	(CODE_FOR_lasx_xvftintrz_l_d): Ditto.
	(CODE_FOR_lasx_xvftintrz_wu_s): Ditto.
	(CODE_FOR_lasx_xvftintrz_lu_d): Ditto.
	(CODE_FOR_lasx_xvffint_s_w): Ditto.
	(CODE_FOR_lasx_xvffint_d_l): Ditto.
	(CODE_FOR_lasx_xvffint_s_wu): Ditto.
	(CODE_FOR_lasx_xvffint_d_lu): Ditto.
	(CODE_FOR_lasx_xvfsub_s): Ditto.
	(CODE_FOR_lasx_xvfsub_d): Ditto.
	(CODE_FOR_lasx_xvfmul_s): Ditto.
	(CODE_FOR_lasx_xvfmul_d): Ditto.
	(CODE_FOR_lasx_xvfdiv_s): Ditto.
	(CODE_FOR_lasx_xvfdiv_d): Ditto.
	(CODE_FOR_lasx_xvfmax_s): Ditto.
	(CODE_FOR_lasx_xvfmax_d): Ditto.
	(CODE_FOR_lasx_xvfmin_s): Ditto.
	(CODE_FOR_lasx_xvfmin_d): Ditto.
	(CODE_FOR_lasx_xvfsqrt_s): Ditto.
	(CODE_FOR_lasx_xvfsqrt_d): Ditto.
	(CODE_FOR_lasx_xvflogb_s): Ditto.
	(CODE_FOR_lasx_xvflogb_d): Ditto.
	(CODE_FOR_lasx_xvmax_b): Ditto.
	(CODE_FOR_lasx_xvmax_h): Ditto.
	(CODE_FOR_lasx_xvmax_w): Ditto.
	(CODE_FOR_lasx_xvmax_d): Ditto.
	(CODE_FOR_lasx_xvmaxi_b): Ditto.
	(CODE_FOR_lasx_xvmaxi_h): Ditto.
	(CODE_FOR_lasx_xvmaxi_w): Ditto.
	(CODE_FOR_lasx_xvmaxi_d): Ditto.
	(CODE_FOR_lasx_xvmax_bu): Ditto.
	(CODE_FOR_lasx_xvmax_hu): Ditto.
	(CODE_FOR_lasx_xvmax_wu): Ditto.
	(CODE_FOR_lasx_xvmax_du): Ditto.
	(CODE_FOR_lasx_xvmaxi_bu): Ditto.
	(CODE_FOR_lasx_xvmaxi_hu): Ditto.
	(CODE_FOR_lasx_xvmaxi_wu): Ditto.
	(CODE_FOR_lasx_xvmaxi_du): Ditto.
	(CODE_FOR_lasx_xvmin_b): Ditto.
	(CODE_FOR_lasx_xvmin_h): Ditto.
	(CODE_FOR_lasx_xvmin_w): Ditto.
	(CODE_FOR_lasx_xvmin_d): Ditto.
	(CODE_FOR_lasx_xvmini_b): Ditto.
	(CODE_FOR_lasx_xvmini_h): Ditto.
	(CODE_FOR_lasx_xvmini_w): Ditto.
	(CODE_FOR_lasx_xvmini_d): Ditto.
	(CODE_FOR_lasx_xvmin_bu): Ditto.
	(CODE_FOR_lasx_xvmin_hu): Ditto.
	(CODE_FOR_lasx_xvmin_wu): Ditto.
	(CODE_FOR_lasx_xvmin_du): Ditto.
	(CODE_FOR_lasx_xvmini_bu): Ditto.
	(CODE_FOR_lasx_xvmini_hu): Ditto.
	(CODE_FOR_lasx_xvmini_wu): Ditto.
	(CODE_FOR_lasx_xvmini_du): Ditto.
	(CODE_FOR_lasx_xvmod_b): Ditto.
	(CODE_FOR_lasx_xvmod_h): Ditto.
	(CODE_FOR_lasx_xvmod_w): Ditto.
	(CODE_FOR_lasx_xvmod_d): Ditto.
	(CODE_FOR_lasx_xvmod_bu): Ditto.
	(CODE_FOR_lasx_xvmod_hu): Ditto.
	(CODE_FOR_lasx_xvmod_wu): Ditto.
	(CODE_FOR_lasx_xvmod_du): Ditto.
	(CODE_FOR_lasx_xvmul_b): Ditto.
	(CODE_FOR_lasx_xvmul_h): Ditto.
	(CODE_FOR_lasx_xvmul_w): Ditto.
	(CODE_FOR_lasx_xvmul_d): Ditto.
	(CODE_FOR_lasx_xvclz_b): Ditto.
	(CODE_FOR_lasx_xvclz_h): Ditto.
	(CODE_FOR_lasx_xvclz_w): Ditto.
	(CODE_FOR_lasx_xvclz_d): Ditto.
	(CODE_FOR_lasx_xvnor_v): Ditto.
	(CODE_FOR_lasx_xvor_v): Ditto.
	(CODE_FOR_lasx_xvori_b): Ditto.
	(CODE_FOR_lasx_xvnori_b): Ditto.
	(CODE_FOR_lasx_xvpcnt_b): Ditto.
	(CODE_FOR_lasx_xvpcnt_h): Ditto.
	(CODE_FOR_lasx_xvpcnt_w): Ditto.
	(CODE_FOR_lasx_xvpcnt_d): Ditto.
	(CODE_FOR_lasx_xvxor_v): Ditto.
	(CODE_FOR_lasx_xvxori_b): Ditto.
	(CODE_FOR_lasx_xvsll_b): Ditto.
	(CODE_FOR_lasx_xvsll_h): Ditto.
	(CODE_FOR_lasx_xvsll_w): Ditto.
	(CODE_FOR_lasx_xvsll_d): Ditto.
	(CODE_FOR_lasx_xvslli_b): Ditto.
	(CODE_FOR_lasx_xvslli_h): Ditto.
	(CODE_FOR_lasx_xvslli_w): Ditto.
	(CODE_FOR_lasx_xvslli_d): Ditto.
	(CODE_FOR_lasx_xvsra_b): Ditto.
	(CODE_FOR_lasx_xvsra_h): Ditto.
	(CODE_FOR_lasx_xvsra_w): Ditto.
	(CODE_FOR_lasx_xvsra_d): Ditto.
	(CODE_FOR_lasx_xvsrai_b): Ditto.
	(CODE_FOR_lasx_xvsrai_h): Ditto.
	(CODE_FOR_lasx_xvsrai_w): Ditto.
	(CODE_FOR_lasx_xvsrai_d): Ditto.
	(CODE_FOR_lasx_xvsrl_b): Ditto.
	(CODE_FOR_lasx_xvsrl_h): Ditto.
	(CODE_FOR_lasx_xvsrl_w): Ditto.
	(CODE_FOR_lasx_xvsrl_d): Ditto.
	(CODE_FOR_lasx_xvsrli_b): Ditto.
	(CODE_FOR_lasx_xvsrli_h): Ditto.
	(CODE_FOR_lasx_xvsrli_w): Ditto.
	(CODE_FOR_lasx_xvsrli_d): Ditto.
	(CODE_FOR_lasx_xvsub_b): Ditto.
	(CODE_FOR_lasx_xvsub_h): Ditto.
	(CODE_FOR_lasx_xvsub_w): Ditto.
	(CODE_FOR_lasx_xvsub_d): Ditto.
	(CODE_FOR_lasx_xvsubi_bu): Ditto.
	(CODE_FOR_lasx_xvsubi_hu): Ditto.
	(CODE_FOR_lasx_xvsubi_wu): Ditto.
	(CODE_FOR_lasx_xvsubi_du): Ditto.
	(CODE_FOR_lasx_xvpackod_d): Ditto.
	(CODE_FOR_lasx_xvpackev_d): Ditto.
	(CODE_FOR_lasx_xvpickod_d): Ditto.
	(CODE_FOR_lasx_xvpickev_d): Ditto.
	(CODE_FOR_lasx_xvrepli_b): Ditto.
	(CODE_FOR_lasx_xvrepli_h): Ditto.
	(CODE_FOR_lasx_xvrepli_w): Ditto.
	(CODE_FOR_lasx_xvrepli_d): Ditto.
	(CODE_FOR_lasx_xvandn_v): Ditto.
	(CODE_FOR_lasx_xvorn_v): Ditto.
	(CODE_FOR_lasx_xvneg_b): Ditto.
	(CODE_FOR_lasx_xvneg_h): Ditto.
	(CODE_FOR_lasx_xvneg_w): Ditto.
	(CODE_FOR_lasx_xvneg_d): Ditto.
	(CODE_FOR_lasx_xvbsrl_v): Ditto.
	(CODE_FOR_lasx_xvbsll_v): Ditto.
	(CODE_FOR_lasx_xvfmadd_s): Ditto.
	(CODE_FOR_lasx_xvfmadd_d): Ditto.
	(CODE_FOR_lasx_xvfmsub_s): Ditto.
	(CODE_FOR_lasx_xvfmsub_d): Ditto.
	(CODE_FOR_lasx_xvfnmadd_s): Ditto.
	(CODE_FOR_lasx_xvfnmadd_d): Ditto.
	(CODE_FOR_lasx_xvfnmsub_s): Ditto.
	(CODE_FOR_lasx_xvfnmsub_d): Ditto.
	(CODE_FOR_lasx_xvpermi_q): Ditto.
	(CODE_FOR_lasx_xvpermi_d): Ditto.
	(CODE_FOR_lasx_xbnz_v): Ditto.
	(CODE_FOR_lasx_xbz_v): Ditto.
	(CODE_FOR_lasx_xvssub_b): Ditto.
	(CODE_FOR_lasx_xvssub_h): Ditto.
	(CODE_FOR_lasx_xvssub_w): Ditto.
	(CODE_FOR_lasx_xvssub_d): Ditto.
	(CODE_FOR_lasx_xvssub_bu): Ditto.
	(CODE_FOR_lasx_xvssub_hu): Ditto.
	(CODE_FOR_lasx_xvssub_wu): Ditto.
	(CODE_FOR_lasx_xvssub_du): Ditto.
	(CODE_FOR_lasx_xvabsd_b): Ditto.
	(CODE_FOR_lasx_xvabsd_h): Ditto.
	(CODE_FOR_lasx_xvabsd_w): Ditto.
	(CODE_FOR_lasx_xvabsd_d): Ditto.
	(CODE_FOR_lasx_xvabsd_bu): Ditto.
	(CODE_FOR_lasx_xvabsd_hu): Ditto.
	(CODE_FOR_lasx_xvabsd_wu): Ditto.
	(CODE_FOR_lasx_xvabsd_du): Ditto.
	(CODE_FOR_lasx_xvavg_b): Ditto.
	(CODE_FOR_lasx_xvavg_h): Ditto.
	(CODE_FOR_lasx_xvavg_w): Ditto.
	(CODE_FOR_lasx_xvavg_d): Ditto.
	(CODE_FOR_lasx_xvavg_bu): Ditto.
	(CODE_FOR_lasx_xvavg_hu): Ditto.
	(CODE_FOR_lasx_xvavg_wu): Ditto.
	(CODE_FOR_lasx_xvavg_du): Ditto.
	(CODE_FOR_lasx_xvavgr_b): Ditto.
	(CODE_FOR_lasx_xvavgr_h): Ditto.
	(CODE_FOR_lasx_xvavgr_w): Ditto.
	(CODE_FOR_lasx_xvavgr_d): Ditto.
	(CODE_FOR_lasx_xvavgr_bu): Ditto.
	(CODE_FOR_lasx_xvavgr_hu): Ditto.
	(CODE_FOR_lasx_xvavgr_wu): Ditto.
	(CODE_FOR_lasx_xvavgr_du): Ditto.
	(CODE_FOR_lasx_xvmuh_b): Ditto.
	(CODE_FOR_lasx_xvmuh_h): Ditto.
	(CODE_FOR_lasx_xvmuh_w): Ditto.
	(CODE_FOR_lasx_xvmuh_d): Ditto.
	(CODE_FOR_lasx_xvmuh_bu): Ditto.
	(CODE_FOR_lasx_xvmuh_hu): Ditto.
	(CODE_FOR_lasx_xvmuh_wu): Ditto.
	(CODE_FOR_lasx_xvmuh_du): Ditto.
	(CODE_FOR_lasx_xvssran_b_h): Ditto.
	(CODE_FOR_lasx_xvssran_h_w): Ditto.
	(CODE_FOR_lasx_xvssran_w_d): Ditto.
	(CODE_FOR_lasx_xvssran_bu_h): Ditto.
	(CODE_FOR_lasx_xvssran_hu_w): Ditto.
	(CODE_FOR_lasx_xvssran_wu_d): Ditto.
	(CODE_FOR_lasx_xvssrarn_b_h): Ditto.
	(CODE_FOR_lasx_xvssrarn_h_w): Ditto.
	(CODE_FOR_lasx_xvssrarn_w_d): Ditto.
	(CODE_FOR_lasx_xvssrarn_bu_h): Ditto.
	(CODE_FOR_lasx_xvssrarn_hu_w): Ditto.
	(CODE_FOR_lasx_xvssrarn_wu_d): Ditto.
	(CODE_FOR_lasx_xvssrln_bu_h): Ditto.
	(CODE_FOR_lasx_xvssrln_hu_w): Ditto.
	(CODE_FOR_lasx_xvssrln_wu_d): Ditto.
	(CODE_FOR_lasx_xvssrlrn_bu_h): Ditto.
	(CODE_FOR_lasx_xvssrlrn_hu_w): Ditto.
	(CODE_FOR_lasx_xvssrlrn_wu_d): Ditto.
	(CODE_FOR_lasx_xvftint_w_s): Ditto.
	(CODE_FOR_lasx_xvftint_l_d): Ditto.
	(CODE_FOR_lasx_xvftint_wu_s): Ditto.
	(CODE_FOR_lasx_xvftint_lu_d): Ditto.
	(CODE_FOR_lasx_xvsllwil_h_b): Ditto.
	(CODE_FOR_lasx_xvsllwil_w_h): Ditto.
	(CODE_FOR_lasx_xvsllwil_d_w): Ditto.
	(CODE_FOR_lasx_xvsllwil_hu_bu): Ditto.
	(CODE_FOR_lasx_xvsllwil_wu_hu): Ditto.
	(CODE_FOR_lasx_xvsllwil_du_wu): Ditto.
	(CODE_FOR_lasx_xvsat_b): Ditto.
	(CODE_FOR_lasx_xvsat_h): Ditto.
	(CODE_FOR_lasx_xvsat_w): Ditto.
	(CODE_FOR_lasx_xvsat_d): Ditto.
	(CODE_FOR_lasx_xvsat_bu): Ditto.
	(CODE_FOR_lasx_xvsat_hu): Ditto.
	(CODE_FOR_lasx_xvsat_wu): Ditto.
	(CODE_FOR_lasx_xvsat_du): Ditto.
	(loongarch_builtin_vectorized_function): Ditto.
	(loongarch_expand_builtin_insn): Ditto.
	(loongarch_expand_builtin): Ditto.
	* config/loongarch/loongarch-ftypes.def (1): Ditto.
	(2): Ditto.
	(3): Ditto.
	(4): Ditto.
	* config/loongarch/lasxintrin.h: New file.
---
 gcc/config.gcc                             |    2 +-
 gcc/config/loongarch/lasxintrin.h          | 5338 ++++++++++++++++++++
 gcc/config/loongarch/loongarch-builtins.cc | 1180 ++++-
 gcc/config/loongarch/loongarch-ftypes.def  |  271 +-
 4 files changed, 6788 insertions(+), 3 deletions(-)
 create mode 100644 gcc/config/loongarch/lasxintrin.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 8fda51e157c..ae6fe50590c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -469,7 +469,7 @@ mips*-*-*)
 	;;
 loongarch*-*-*)
 	cpu_type=loongarch
-	extra_headers="larchintrin.h lsxintrin.h"
+	extra_headers="larchintrin.h lsxintrin.h lasxintrin.h"
 	extra_objs="loongarch-c.o loongarch-builtins.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
 	extra_gcc_objs="loongarch-driver.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
 	extra_options="${extra_options} g.opt fused-madd.opt"
diff --git a/gcc/config/loongarch/lasxintrin.h b/gcc/config/loongarch/lasxintrin.h
new file mode 100644
index 00000000000..d3937992746
--- /dev/null
+++ b/gcc/config/loongarch/lasxintrin.h
@@ -0,0 +1,5338 @@
+/* LARCH Loongson ASX intrinsics include file.
+
+   Copyright (C) 2018-2023 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _GCC_LOONGSON_ASXINTRIN_H
+#define _GCC_LOONGSON_ASXINTRIN_H 1
+
+#if defined(__loongarch_asx)
+
+typedef signed char v32i8 __attribute__ ((vector_size(32), aligned(32)));
+typedef signed char v32i8_b __attribute__ ((vector_size(32), aligned(1)));
+typedef unsigned char v32u8 __attribute__ ((vector_size(32), aligned(32)));
+typedef unsigned char v32u8_b __attribute__ ((vector_size(32), aligned(1)));
+typedef short v16i16 __attribute__ ((vector_size(32), aligned(32)));
+typedef short v16i16_h __attribute__ ((vector_size(32), aligned(2)));
+typedef unsigned short v16u16 __attribute__ ((vector_size(32), aligned(32)));
+typedef unsigned short v16u16_h __attribute__ ((vector_size(32), aligned(2)));
+typedef int v8i32 __attribute__ ((vector_size(32), aligned(32)));
+typedef int v8i32_w __attribute__ ((vector_size(32), aligned(4)));
+typedef unsigned int v8u32 __attribute__ ((vector_size(32), aligned(32)));
+typedef unsigned int v8u32_w __attribute__ ((vector_size(32), aligned(4)));
+typedef long long v4i64 __attribute__ ((vector_size(32), aligned(32)));
+typedef long long v4i64_d __attribute__ ((vector_size(32), aligned(8)));
+typedef unsigned long long v4u64 __attribute__ ((vector_size(32), aligned(32)));
+typedef unsigned long long v4u64_d __attribute__ ((vector_size(32), aligned(8)));
+typedef float v8f32 __attribute__ ((vector_size(32), aligned(32)));
+typedef float v8f32_w __attribute__ ((vector_size(32), aligned(4)));
+typedef double v4f64 __attribute__ ((vector_size(32), aligned(32)));
+typedef double v4f64_d __attribute__ ((vector_size(32), aligned(8)));
+typedef float __m256 __attribute__ ((__vector_size__ (32),
+				     __may_alias__));
+typedef long long __m256i __attribute__ ((__vector_size__ (32),
+					  __may_alias__));
+typedef double __m256d __attribute__ ((__vector_size__ (32),
+				       __may_alias__));
+
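+/* A minimal usage sketch (illustrative only; it assumes the translation unit
+   is compiled with -mlasx so that __loongarch_asx is defined):
+
+     #include <lasxintrin.h>
+
+     __m256i
+     add_bytes (__m256i a, __m256i b)
+     {
+       return __lasx_xvadd_b (a, b);
+     }
+
+   The v32i8/v16i16/... typedefs above are only used for the casts inside
+   the wrappers below; user code normally works with the __m256, __m256i
+   and __m256d types.  */
+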
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsll_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsll_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsll_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsll_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsll_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsll_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsll_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsll_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvslli_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvslli_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvslli_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvslli_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvslli_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslli_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvslli_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvslli_d ((v4i64)(_1), (_2)))
+
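+/* Note: the ui3/ui4/ui5/ui6 immediate operands of the macro forms in this
+   header, e.g. __lasx_xvslli_w (v, 3), must be integer constant expressions;
+   a variable shift count needs the register forms such as __lasx_xvsll_w.  */
+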
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsra_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsra_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsra_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsra_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsra_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsra_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsra_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsra_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvsrai_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrai_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvsrai_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrai_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvsrai_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrai_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvsrai_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrai_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrar_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrar_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrar_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrar_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrar_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrar_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrar_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrar_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvsrari_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrari_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvsrari_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrari_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvsrari_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrari_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvsrari_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrari_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrl_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrl_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrl_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrl_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrl_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrl_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrl_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrl_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvsrli_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrli_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvsrli_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrli_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvsrli_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrli_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvsrli_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrli_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrlr_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrlr_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrlr_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrlr_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrlr_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrlr_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrlr_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrlr_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvsrlri_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrlri_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvsrlri_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrlri_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvsrlri_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrlri_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvsrlri_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrlri_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitclr_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitclr_b ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitclr_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitclr_h ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitclr_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitclr_w ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitclr_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitclr_d ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvbitclri_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitclri_b ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UQI.  */
+#define __lasx_xvbitclri_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitclri_h ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UQI.  */
+#define __lasx_xvbitclri_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitclri_w ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UQI.  */
+#define __lasx_xvbitclri_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitclri_d ((v4u64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitset_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitset_b ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitset_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitset_h ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitset_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitset_w ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitset_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitset_d ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvbitseti_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitseti_b ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UQI.  */
+#define __lasx_xvbitseti_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitseti_h ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UQI.  */
+#define __lasx_xvbitseti_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitseti_w ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UQI.  */
+#define __lasx_xvbitseti_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitseti_d ((v4u64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitrev_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitrev_b ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitrev_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitrev_h ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitrev_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitrev_w ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitrev_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvbitrev_d ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvbitrevi_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitrevi_b ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UQI.  */
+#define __lasx_xvbitrevi_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitrevi_h ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UQI.  */
+#define __lasx_xvbitrevi_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitrevi_w ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UQI.  */
+#define __lasx_xvbitrevi_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvbitrevi_d ((v4u64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadd_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadd_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadd_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadd_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadd_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadd_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadd_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadd_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvaddi_bu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvaddi_bu ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvaddi_hu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvaddi_hu ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvaddi_wu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvaddi_wu ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvaddi_du(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvaddi_du ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsub_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsub_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsub_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsub_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsub_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsub_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsub_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsub_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvsubi_bu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsubi_bu ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvsubi_hu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsubi_hu ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvsubi_wu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsubi_wu ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvsubi_du(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsubi_du ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmax_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmax_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmax_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmax_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmax_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmax_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmax_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmax_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V32QI, V32QI, QI.  */
+#define __lasx_xvmaxi_b(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmaxi_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V16HI, V16HI, QI.  */
+#define __lasx_xvmaxi_h(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmaxi_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V8SI, V8SI, QI.  */
+#define __lasx_xvmaxi_w(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmaxi_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V4DI, V4DI, QI.  */
+#define __lasx_xvmaxi_d(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmaxi_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmax_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmax_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmax_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmax_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmax_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmax_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmax_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmax_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvmaxi_bu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmaxi_bu ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UQI.  */
+#define __lasx_xvmaxi_hu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmaxi_hu ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UQI.  */
+#define __lasx_xvmaxi_wu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmaxi_wu ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UQI.  */
+#define __lasx_xvmaxi_du(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmaxi_du ((v4u64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmin_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmin_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmin_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmin_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmin_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmin_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmin_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmin_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V32QI, V32QI, QI.  */
+#define __lasx_xvmini_b(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmini_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V16HI, V16HI, QI.  */
+#define __lasx_xvmini_h(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmini_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V8SI, V8SI, QI.  */
+#define __lasx_xvmini_w(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmini_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V4DI, V4DI, QI.  */
+#define __lasx_xvmini_d(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmini_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmin_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmin_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmin_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmin_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmin_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmin_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmin_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmin_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvmini_bu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmini_bu ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UQI.  */
+#define __lasx_xvmini_hu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmini_hu ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UQI.  */
+#define __lasx_xvmini_wu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmini_wu ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UQI.  */
+#define __lasx_xvmini_du(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvmini_du ((v4u64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvseq_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvseq_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvseq_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvseq_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvseq_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvseq_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvseq_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvseq_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V32QI, V32QI, QI.  */
+#define __lasx_xvseqi_b(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvseqi_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V16HI, V16HI, QI.  */
+#define __lasx_xvseqi_h(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvseqi_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V8SI, V8SI, QI.  */
+#define __lasx_xvseqi_w(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvseqi_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V4DI, V4DI, QI.  */
+#define __lasx_xvseqi_d(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvseqi_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvslt_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvslt_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvslt_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvslt_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvslt_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvslt_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvslt_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvslt_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V32QI, V32QI, QI.  */
+#define __lasx_xvslti_b(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslti_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V16HI, V16HI, QI.  */
+#define __lasx_xvslti_h(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslti_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V8SI, V8SI, QI.  */
+#define __lasx_xvslti_w(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslti_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V4DI, V4DI, QI.  */
+#define __lasx_xvslti_d(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslti_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvslt_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvslt_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvslt_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvslt_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvslt_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvslt_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvslt_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvslt_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V32QI, UV32QI, UQI.  */
+#define __lasx_xvslti_bu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslti_bu ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, UV16HI, UQI.  */
+#define __lasx_xvslti_hu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslti_hu ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, UV8SI, UQI.  */
+#define __lasx_xvslti_wu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslti_wu ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UQI.  */
+#define __lasx_xvslti_du(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslti_du ((v4u64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsle_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsle_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsle_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsle_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsle_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsle_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsle_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsle_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V32QI, V32QI, QI.  */
+#define __lasx_xvslei_b(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslei_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V16HI, V16HI, QI.  */
+#define __lasx_xvslei_h(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslei_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V8SI, V8SI, QI.  */
+#define __lasx_xvslei_w(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslei_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, si5.  */
+/* Data types in instruction templates:  V4DI, V4DI, QI.  */
+#define __lasx_xvslei_d(/*__m256i*/ _1, /*si5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslei_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsle_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsle_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsle_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsle_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsle_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsle_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsle_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsle_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V32QI, UV32QI, UQI.  */
+#define __lasx_xvslei_bu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslei_bu ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, UV16HI, UQI.  */
+#define __lasx_xvslei_hu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslei_hu ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, UV8SI, UQI.  */
+#define __lasx_xvslei_wu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslei_wu ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UQI.  */
+#define __lasx_xvslei_du(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslei_du ((v4u64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvsat_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsat_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvsat_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsat_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvsat_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsat_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvsat_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvsat_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvsat_bu(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsat_bu ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UQI.  */
+#define __lasx_xvsat_hu(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsat_hu ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UQI.  */
+#define __lasx_xvsat_wu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsat_wu ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UQI.  */
+#define __lasx_xvsat_du(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvsat_du ((v4u64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadda_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadda_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadda_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadda_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadda_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadda_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadda_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadda_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsadd_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsadd_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsadd_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsadd_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsadd_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsadd_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsadd_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsadd_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsadd_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsadd_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsadd_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsadd_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsadd_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsadd_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsadd_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsadd_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavg_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavg_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavg_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavg_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavg_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavg_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavg_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavg_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavg_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavg_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavg_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavg_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavg_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavg_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavg_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavg_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavgr_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavgr_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavgr_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavgr_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavgr_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavgr_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavgr_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavgr_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavgr_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavgr_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavgr_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavgr_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavgr_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavgr_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvavgr_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvavgr_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssub_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssub_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssub_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssub_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssub_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssub_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssub_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssub_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssub_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssub_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssub_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssub_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssub_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssub_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssub_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssub_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvabsd_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvabsd_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvabsd_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvabsd_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvabsd_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvabsd_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvabsd_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvabsd_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvabsd_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvabsd_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvabsd_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvabsd_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvabsd_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvabsd_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvabsd_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvabsd_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmul_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmul_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmul_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmul_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmul_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmul_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmul_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmul_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmadd_b (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmadd_b ((v32i8)_1, (v32i8)_2, (v32i8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmadd_h (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmadd_h ((v16i16)_1, (v16i16)_2, (v16i16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmadd_w (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmadd_w ((v8i32)_1, (v8i32)_2, (v8i32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmadd_d (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmadd_d ((v4i64)_1, (v4i64)_2, (v4i64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmsub_b (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmsub_b ((v32i8)_1, (v32i8)_2, (v32i8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmsub_h (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmsub_h ((v16i16)_1, (v16i16)_2, (v16i16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmsub_w (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmsub_w ((v8i32)_1, (v8i32)_2, (v8i32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmsub_d (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmsub_d ((v4i64)_1, (v4i64)_2, (v4i64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvdiv_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvdiv_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvdiv_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvdiv_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvdiv_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvdiv_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvdiv_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvdiv_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvdiv_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvdiv_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvdiv_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvdiv_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvdiv_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvdiv_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvdiv_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvdiv_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhaddw_h_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhaddw_h_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhaddw_w_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhaddw_w_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhaddw_d_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhaddw_d_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhaddw_hu_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhaddw_hu_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhaddw_wu_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhaddw_wu_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhaddw_du_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhaddw_du_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhsubw_h_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhsubw_h_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhsubw_w_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhsubw_w_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhsubw_d_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhsubw_d_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhsubw_hu_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhsubw_hu_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhsubw_wu_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhsubw_wu_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhsubw_du_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhsubw_du_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmod_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmod_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmod_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmod_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmod_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmod_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmod_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmod_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmod_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmod_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmod_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmod_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmod_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmod_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmod_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmod_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvrepl128vei_b(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvrepl128vei_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvrepl128vei_h(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvrepl128vei_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui2.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvrepl128vei_w(/*__m256i*/ _1, /*ui2*/ _2) \
+  ((__m256i)__builtin_lasx_xvrepl128vei_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui1.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvrepl128vei_d(/*__m256i*/ _1, /*ui1*/ _2) \
+  ((__m256i)__builtin_lasx_xvrepl128vei_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpickev_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpickev_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpickev_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpickev_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpickev_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpickev_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpickev_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpickev_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpickod_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpickod_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpickod_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpickod_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpickod_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpickod_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpickod_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpickod_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvilvh_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvilvh_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvilvh_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvilvh_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvilvh_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvilvh_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvilvh_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvilvh_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvilvl_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvilvl_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvilvl_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvilvl_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvilvl_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvilvl_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvilvl_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvilvl_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpackev_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpackev_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpackev_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpackev_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpackev_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpackev_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpackev_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpackev_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpackod_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpackod_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpackod_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpackod_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpackod_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpackod_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpackod_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvpackod_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvshuf_b (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvshuf_b ((v32i8)_1, (v32i8)_2, (v32i8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvshuf_h (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvshuf_h ((v16i16)_1, (v16i16)_2, (v16i16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvshuf_w (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvshuf_w ((v8i32)_1, (v8i32)_2, (v8i32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvshuf_d (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvshuf_d ((v4i64)_1, (v4i64)_2, (v4i64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvand_v (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvand_v ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvandi_b(/*__m256i*/ _1, /*ui8*/ _2) \
+  ((__m256i)__builtin_lasx_xvandi_b ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvor_v (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvor_v ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvori_b(/*__m256i*/ _1, /*ui8*/ _2) \
+  ((__m256i)__builtin_lasx_xvori_b ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvnor_v (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvnor_v ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvnori_b(/*__m256i*/ _1, /*ui8*/ _2) \
+  ((__m256i)__builtin_lasx_xvnori_b ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvxor_v (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvxor_v ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UQI.  */
+#define __lasx_xvxori_b(/*__m256i*/ _1, /*ui8*/ _2) \
+  ((__m256i)__builtin_lasx_xvxori_b ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvbitsel_v (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvbitsel_v ((v32u8)_1, (v32u8)_2, (v32u8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI, USI.  */
+#define __lasx_xvbitseli_b(/*__m256i*/ _1, /*__m256i*/ _2, /*ui8*/ _3) \
+  ((__m256i)__builtin_lasx_xvbitseli_b ((v32u8)(_1), (v32u8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V32QI, V32QI, USI.  */
+#define __lasx_xvshuf4i_b(/*__m256i*/ _1, /*ui8*/ _2) \
+  ((__m256i)__builtin_lasx_xvshuf4i_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V16HI, V16HI, USI.  */
+#define __lasx_xvshuf4i_h(/*__m256i*/ _1, /*ui8*/ _2) \
+  ((__m256i)__builtin_lasx_xvshuf4i_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V8SI, V8SI, USI.  */
+#define __lasx_xvshuf4i_w(/*__m256i*/ _1, /*ui8*/ _2) \
+  ((__m256i)__builtin_lasx_xvshuf4i_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, rj.  */
+/* Data types in instruction templates:  V32QI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplgr2vr_b (int _1)
+{
+  return (__m256i)__builtin_lasx_xvreplgr2vr_b ((int)_1);
+}
+
+/* Assembly instruction format:	xd, rj.  */
+/* Data types in instruction templates:  V16HI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplgr2vr_h (int _1)
+{
+  return (__m256i)__builtin_lasx_xvreplgr2vr_h ((int)_1);
+}
+
+/* Assembly instruction format:	xd, rj.  */
+/* Data types in instruction templates:  V8SI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplgr2vr_w (int _1)
+{
+  return (__m256i)__builtin_lasx_xvreplgr2vr_w ((int)_1);
+}
+
+/* Assembly instruction format:	xd, rj.  */
+/* Data types in instruction templates:  V4DI, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplgr2vr_d (long int _1)
+{
+  return (__m256i)__builtin_lasx_xvreplgr2vr_d ((long int)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpcnt_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvpcnt_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpcnt_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvpcnt_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpcnt_w (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvpcnt_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvpcnt_d (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvpcnt_d ((v4i64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvclo_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvclo_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvclo_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvclo_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvclo_w (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvclo_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvclo_d (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvclo_d ((v4i64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvclz_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvclz_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvclz_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvclz_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvclz_w (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvclz_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvclz_d (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvclz_d ((v4i64)_1);
+}
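+
+/* Usage sketch for the bit-counting wrappers (illustrative only, assuming
+   -mlasx); both helpers are hypothetical and operate lane by lane:
+
+     __m256i popcount_bytes (__m256i v)
+     {
+       return __lasx_xvpcnt_b (v);
+     }
+
+     __m256i leading_zeros_words (__m256i v)
+     {
+       return __lasx_xvclz_w (v);
+     }
+*/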
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfadd_s (__m256 _1, __m256 _2)
+{
+  return (__m256)__builtin_lasx_xvfadd_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfadd_d (__m256d _1, __m256d _2)
+{
+  return (__m256d)__builtin_lasx_xvfadd_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfsub_s (__m256 _1, __m256 _2)
+{
+  return (__m256)__builtin_lasx_xvfsub_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfsub_d (__m256d _1, __m256d _2)
+{
+  return (__m256d)__builtin_lasx_xvfsub_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfmul_s (__m256 _1, __m256 _2)
+{
+  return (__m256)__builtin_lasx_xvfmul_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfmul_d (__m256d _1, __m256d _2)
+{
+  return (__m256d)__builtin_lasx_xvfmul_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfdiv_s (__m256 _1, __m256 _2)
+{
+  return (__m256)__builtin_lasx_xvfdiv_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfdiv_d (__m256d _1, __m256d _2)
+{
+  return (__m256d)__builtin_lasx_xvfdiv_d ((v4f64)_1, (v4f64)_2);
+}
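+
+/* Usage sketch for the single-precision arithmetic wrappers (illustrative
+   only, assuming -mlasx); mul_add is a hypothetical helper computing
+   a * x + y elementwise with separate multiply and add:
+
+     __m256 mul_add (__m256 a, __m256 x, __m256 y)
+     {
+       return __lasx_xvfadd_s (__lasx_xvfmul_s (a, x), y);
+     }
+*/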
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcvt_h_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcvt_h_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfcvt_s_d (__m256d _1, __m256d _2)
+{
+  return (__m256)__builtin_lasx_xvfcvt_s_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfmin_s (__m256 _1, __m256 _2)
+{
+  return (__m256)__builtin_lasx_xvfmin_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfmin_d (__m256d _1, __m256d _2)
+{
+  return (__m256d)__builtin_lasx_xvfmin_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfmina_s (__m256 _1, __m256 _2)
+{
+  return (__m256)__builtin_lasx_xvfmina_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfmina_d (__m256d _1, __m256d _2)
+{
+  return (__m256d)__builtin_lasx_xvfmina_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfmax_s (__m256 _1, __m256 _2)
+{
+  return (__m256)__builtin_lasx_xvfmax_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfmax_d (__m256d _1, __m256d _2)
+{
+  return (__m256d)__builtin_lasx_xvfmax_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfmaxa_s (__m256 _1, __m256 _2)
+{
+  return (__m256)__builtin_lasx_xvfmaxa_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfmaxa_d (__m256d _1, __m256d _2)
+{
+  return (__m256d)__builtin_lasx_xvfmaxa_d ((v4f64)_1, (v4f64)_2);
+}
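+
+/* Usage sketch for the min/max wrappers (illustrative only, assuming
+   -mlasx); clamp_s is a hypothetical helper restricting every lane to
+   the range [lo, hi]:
+
+     __m256 clamp_s (__m256 x, __m256 lo, __m256 hi)
+     {
+       return __lasx_xvfmin_s (__lasx_xvfmax_s (x, lo), hi);
+     }
+*/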
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfclass_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvfclass_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfclass_d (__m256d _1)
+{
+  return (__m256i)__builtin_lasx_xvfclass_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfsqrt_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfsqrt_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfsqrt_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfsqrt_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrecip_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrecip_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrecip_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrecip_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrint_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrint_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrint_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrint_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrsqrt_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrsqrt_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrsqrt_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrsqrt_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvflogb_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvflogb_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvflogb_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvflogb_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfcvth_s_h (__m256i _1)
+{
+  return (__m256)__builtin_lasx_xvfcvth_s_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfcvth_d_s (__m256 _1)
+{
+  return (__m256d)__builtin_lasx_xvfcvth_d_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfcvtl_s_h (__m256i _1)
+{
+  return (__m256)__builtin_lasx_xvfcvtl_s_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfcvtl_d_s (__m256 _1)
+{
+  return (__m256d)__builtin_lasx_xvfcvtl_d_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftint_w_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftint_w_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftint_l_d (__m256d _1)
+{
+  return (__m256i)__builtin_lasx_xvftint_l_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftint_wu_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftint_wu_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftint_lu_d (__m256d _1)
+{
+  return (__m256i)__builtin_lasx_xvftint_lu_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrz_w_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrz_w_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrz_l_d (__m256d _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrz_l_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrz_wu_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrz_wu_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrz_lu_d (__m256d _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrz_lu_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvffint_s_w (__m256i _1)
+{
+  return (__m256)__builtin_lasx_xvffint_s_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvffint_d_l (__m256i _1)
+{
+  return (__m256d)__builtin_lasx_xvffint_d_l ((v4i64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SF, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvffint_s_wu (__m256i _1)
+{
+  return (__m256)__builtin_lasx_xvffint_s_wu ((v8u32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvffint_d_lu (__m256i _1)
+{
+  return (__m256d)__builtin_lasx_xvffint_d_lu ((v4u64)_1);
+}
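+
+/* Usage sketch for the float/integer conversion wrappers (illustrative
+   only, assuming -mlasx); truncate_and_back is a hypothetical helper
+   that truncates each float lane toward zero and converts the result
+   back to single precision:
+
+     __m256 truncate_and_back (__m256 x)
+     {
+       __m256i w = __lasx_xvftintrz_w_s (x);
+       return __lasx_xvffint_s_w (w);
+     }
+*/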
+
+/* Assembly instruction format:	xd, xj, rk.  */
+/* Data types in instruction templates:  V32QI, V32QI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve_b (__m256i _1, int _2)
+{
+  return (__m256i)__builtin_lasx_xvreplve_b ((v32i8)_1, (int)_2);
+}
+
+/* Assembly instruction format:	xd, xj, rk.  */
+/* Data types in instruction templates:  V16HI, V16HI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve_h (__m256i _1, int _2)
+{
+  return (__m256i)__builtin_lasx_xvreplve_h ((v16i16)_1, (int)_2);
+}
+
+/* Assembly instruction format:	xd, xj, rk.  */
+/* Data types in instruction templates:  V8SI, V8SI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve_w (__m256i _1, int _2)
+{
+  return (__m256i)__builtin_lasx_xvreplve_w ((v8i32)_1, (int)_2);
+}
+
+/* Assembly instruction format:	xd, xj, rk.  */
+/* Data types in instruction templates:  V4DI, V4DI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve_d (__m256i _1, int _2)
+{
+  return (__m256i)__builtin_lasx_xvreplve_d ((v4i64)_1, (int)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvpermi_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui8*/ _3) \
+  ((__m256i)__builtin_lasx_xvpermi_w ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvandn_v (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvandn_v ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvneg_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvneg_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvneg_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvneg_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvneg_w (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvneg_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvneg_d (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvneg_d ((v4i64)_1);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmuh_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmuh_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmuh_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmuh_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmuh_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmuh_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmuh_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmuh_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmuh_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmuh_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmuh_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmuh_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmuh_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmuh_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmuh_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmuh_du ((v4u64)_1, (v4u64)_2);
+}
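+
+/* Usage sketch for the xvmuh wrappers (illustrative only, assuming -mlasx
+   and that xvmuh.w keeps the high 32 bits of each signed 32 x 32 product);
+   mulhi_words is a hypothetical helper:
+
+     __m256i mulhi_words (__m256i a, __m256i b)
+     {
+       return __lasx_xvmuh_w (a, b);
+     }
+*/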
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V16HI, V32QI, UQI.  */
+#define __lasx_xvsllwil_h_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsllwil_h_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V8SI, V16HI, UQI.  */
+#define __lasx_xvsllwil_w_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsllwil_w_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V4DI, V8SI, UQI.  */
+#define __lasx_xvsllwil_d_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsllwil_d_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  UV16HI, UV32QI, UQI.  */
+#define __lasx_xvsllwil_hu_bu(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsllwil_hu_bu ((v32u8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV8SI, UV16HI, UQI.  */
+#define __lasx_xvsllwil_wu_hu(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsllwil_wu_hu ((v16u16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV4DI, UV8SI, UQI.  */
+#define __lasx_xvsllwil_du_wu(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsllwil_du_wu ((v8u32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsran_b_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsran_b_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsran_h_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsran_h_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsran_w_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsran_w_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssran_b_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssran_b_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssran_h_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssran_h_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssran_w_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssran_w_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssran_bu_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssran_bu_h ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssran_hu_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssran_hu_w ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssran_wu_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssran_wu_d ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrarn_b_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrarn_b_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrarn_h_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrarn_h_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrarn_w_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrarn_w_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrarn_b_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrarn_b_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrarn_h_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrarn_h_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrarn_w_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrarn_w_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrarn_bu_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrarn_bu_h ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrarn_hu_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrarn_hu_w ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrarn_wu_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrarn_wu_d ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrln_b_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrln_b_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrln_h_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrln_h_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrln_w_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrln_w_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrln_bu_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrln_bu_h ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrln_hu_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrln_hu_w ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrln_wu_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrln_wu_d ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrlrn_b_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrlrn_b_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrlrn_h_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrlrn_h_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrlrn_w_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrlrn_w_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV32QI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrlrn_bu_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrlrn_bu_h ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrlrn_hu_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrlrn_hu_w ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrlrn_wu_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrlrn_wu_d ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, UQI.  */
+#define __lasx_xvfrstpi_b(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvfrstpi_b ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, UQI.  */
+#define __lasx_xvfrstpi_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvfrstpi_h ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfrstp_b (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvfrstp_b ((v32i8)_1, (v32i8)_2, (v32i8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfrstp_h (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvfrstp_h ((v16i16)_1, (v16i16)_2, (v16i16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvshuf4i_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui8*/ _3) \
+  ((__m256i)__builtin_lasx_xvshuf4i_d ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvbsrl_v(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvbsrl_v ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvbsll_v(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvbsll_v ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvextrins_b(/*__m256i*/ _1, /*__m256i*/ _2, /*ui8*/ _3) \
+  ((__m256i)__builtin_lasx_xvextrins_b ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvextrins_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui8*/ _3) \
+  ((__m256i)__builtin_lasx_xvextrins_h ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvextrins_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui8*/ _3) \
+  ((__m256i)__builtin_lasx_xvextrins_w ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvextrins_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui8*/ _3) \
+  ((__m256i)__builtin_lasx_xvextrins_d ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmskltz_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvmskltz_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmskltz_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvmskltz_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmskltz_w (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvmskltz_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmskltz_d (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvmskltz_d ((v4i64)_1);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsigncov_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsigncov_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsigncov_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsigncov_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsigncov_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsigncov_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsigncov_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsigncov_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfmadd_s (__m256 _1, __m256 _2, __m256 _3)
+{
+  return (__m256)__builtin_lasx_xvfmadd_s ((v8f32)_1, (v8f32)_2, (v8f32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfmadd_d (__m256d _1, __m256d _2, __m256d _3)
+{
+  return (__m256d)__builtin_lasx_xvfmadd_d ((v4f64)_1, (v4f64)_2, (v4f64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfmsub_s (__m256 _1, __m256 _2, __m256 _3)
+{
+  return (__m256)__builtin_lasx_xvfmsub_s ((v8f32)_1, (v8f32)_2, (v8f32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfmsub_d (__m256d _1, __m256d _2, __m256d _3)
+{
+  return (__m256d)__builtin_lasx_xvfmsub_d ((v4f64)_1, (v4f64)_2, (v4f64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfnmadd_s (__m256 _1, __m256 _2, __m256 _3)
+{
+  return (__m256)__builtin_lasx_xvfnmadd_s ((v8f32)_1, (v8f32)_2, (v8f32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfnmadd_d (__m256d _1, __m256d _2, __m256d _3)
+{
+  return (__m256d)__builtin_lasx_xvfnmadd_d ((v4f64)_1, (v4f64)_2, (v4f64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V8SF, V8SF, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfnmsub_s (__m256 _1, __m256 _2, __m256 _3)
+{
+  return (__m256)__builtin_lasx_xvfnmsub_s ((v8f32)_1, (v8f32)_2, (v8f32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk, xa.  */
+/* Data types in instruction templates:  V4DF, V4DF, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfnmsub_d (__m256d _1, __m256d _2, __m256d _3)
+{
+  return (__m256d)__builtin_lasx_xvfnmsub_d ((v4f64)_1, (v4f64)_2, (v4f64)_3);
+}
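+
+/* Usage sketch for the fused multiply-add wrappers (illustrative only,
+   assuming -mlasx); fma_d is a hypothetical helper computing a * b + c
+   per lane with a single rounding:
+
+     __m256d fma_d (__m256d a, __m256d b, __m256d c)
+     {
+       return __lasx_xvfmadd_d (a, b, c);
+     }
+*/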
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrne_w_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrne_w_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrne_l_d (__m256d _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrne_l_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrp_w_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrp_w_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrp_l_d (__m256d _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrp_l_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrm_w_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrm_w_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrm_l_d (__m256d _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrm_l_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftint_w_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvftint_w_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SF, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvffint_s_l (__m256i _1, __m256i _2)
+{
+  return (__m256)__builtin_lasx_xvffint_s_l ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrz_w_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvftintrz_w_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrp_w_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvftintrp_w_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrm_w_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvftintrm_w_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrne_w_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvftintrne_w_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftinth_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftinth_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintl_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintl_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvffinth_d_w (__m256i _1)
+{
+  return (__m256d)__builtin_lasx_xvffinth_d_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DF, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvffintl_d_w (__m256i _1)
+{
+  return (__m256d)__builtin_lasx_xvffintl_d_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrzh_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrzh_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrzl_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrzl_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrph_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrph_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrpl_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrpl_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrmh_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrmh_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrml_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrml_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrneh_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrneh_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvftintrnel_l_s (__m256 _1)
+{
+  return (__m256i)__builtin_lasx_xvftintrnel_l_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrintrne_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrintrne_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrintrne_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrintrne_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrintrz_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrintrz_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrintrz_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrintrz_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrintrp_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrintrp_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrintrp_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrintrp_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrintrm_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrintrm_s ((v8f32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrintrm_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrintrm_d ((v4f64)_1);
+}
+
+/* Assembly instruction format:	xd, rj, si12.  */
+/* Data types in instruction templates:  V32QI, CVPOINTER, SI.  */
+#define __lasx_xvld(/*void **/ _1, /*si12*/ _2) \
+  ((__m256i)__builtin_lasx_xvld ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	xd, rj, si12.  */
+/* Data types in instruction templates:  VOID, V32QI, CVPOINTER, SI.  */
+#define __lasx_xvst(/*__m256i*/ _1, /*void **/ _2, /*si12*/ _3) \
+  ((void)__builtin_lasx_xvst ((v32i8)(_1), (void *)(_2), (_3)))
+
+/* Assembly instruction format:	xd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V32QI, CVPOINTER, SI, UQI.  */
+#define __lasx_xvstelm_b(/*__m256i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lasx_xvstelm_b ((v32i8)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	xd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V16HI, CVPOINTER, SI, UQI.  */
+#define __lasx_xvstelm_h(/*__m256i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lasx_xvstelm_h ((v16i16)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	xd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V8SI, CVPOINTER, SI, UQI.  */
+#define __lasx_xvstelm_w(/*__m256i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lasx_xvstelm_w ((v8i32)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	xd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V4DI, CVPOINTER, SI, UQI.  */
+#define __lasx_xvstelm_d(/*__m256i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lasx_xvstelm_d ((v4i64)(_1), (void *)(_2), (_3), (_4)))
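+
+/* Usage sketch for the load/store wrappers (illustrative only, assuming
+   -mlasx); copy_32_bytes is a hypothetical helper using the si12
+   displacement forms with a zero offset:
+
+     void copy_32_bytes (void *dst, void *src)
+     {
+       __m256i v = __lasx_xvld (src, 0);
+       __lasx_xvst (v, dst, 0);
+     }
+*/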
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, UQI.  */
+#define __lasx_xvinsve0_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui3*/ _3) \
+  ((__m256i)__builtin_lasx_xvinsve0_w ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui2.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, UQI.  */
+#define __lasx_xvinsve0_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui2*/ _3) \
+  ((__m256i)__builtin_lasx_xvinsve0_d ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvpickve_w(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvpickve_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui2.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvpickve_d(/*__m256i*/ _1, /*ui2*/ _2) \
+  ((__m256i)__builtin_lasx_xvpickve_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrlrn_b_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrlrn_b_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrlrn_h_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrlrn_h_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrlrn_w_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrlrn_w_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrln_b_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrln_b_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrln_h_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrln_h_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvssrln_w_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvssrln_w_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvorn_v (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvorn_v ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, i13.  */
+/* Data types in instruction templates:  V4DI, HI.  */
+#define __lasx_xvldi(/*i13*/ _1) \
+  ((__m256i)__builtin_lasx_xvldi ((_1)))
+
+/* Assembly instruction format:	xd, rj, rk.  */
+/* Data types in instruction templates:  V32QI, CVPOINTER, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvldx (void * _1, long int _2)
+{
+  return (__m256i)__builtin_lasx_xvldx ((void *)_1, (long int)_2);
+}
+
+/* Assembly instruction format:	xd, rj, rk.  */
+/* Data types in instruction templates:  VOID, V32QI, CVPOINTER, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+void __lasx_xvstx (__m256i _1, void * _2, long int _3)
+{
+  return (void)__builtin_lasx_xvstx ((v32i8)_1, (void *)_2, (long int)_3);
+}
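+
+/* A minimal usage sketch for the register-indexed load/store pair above,
+   assuming LASX is enabled (e.g. -mlasx); `buf' and `i' are hypothetical:
+
+     __m256i v = __lasx_xvldx (buf, i);       // load 32 bytes from buf + i
+     __lasx_xvstx (v, buf, i + 32);           // store them at buf + i + 32
+
+   The byte offset here is a run-time value in a general register rather
+   than an immediate.  */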
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvextl_qu_du (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvextl_qu_du ((v4u64)_1);
+}
+
+/* Assembly instruction format:	xd, rj, ui3.  */
+/* Data types in instruction templates:  V8SI, V8SI, SI, UQI.  */
+#define __lasx_xvinsgr2vr_w(/*__m256i*/ _1, /*int*/ _2, /*ui3*/ _3) \
+  ((__m256i)__builtin_lasx_xvinsgr2vr_w ((v8i32)(_1), (int)(_2), (_3)))
+
+/* Assembly instruction format:	xd, rj, ui2.  */
+/* Data types in instruction templates:  V4DI, V4DI, DI, UQI.  */
+#define __lasx_xvinsgr2vr_d(/*__m256i*/ _1, /*long int*/ _2, /*ui2*/ _3) \
+  ((__m256i)__builtin_lasx_xvinsgr2vr_d ((v4i64)(_1), (long int)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve0_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvreplve0_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve0_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvreplve0_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve0_w (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvreplve0_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve0_d (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvreplve0_d ((v4i64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvreplve0_q (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvreplve0_q ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_h_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_h_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_w_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_w_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_d_w (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_d_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_w_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_w_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_d_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_d_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_d_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_d_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_hu_bu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_hu_bu ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_wu_hu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_wu_hu ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_du_wu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_du_wu ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_wu_bu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_wu_bu ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_du_hu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_du_hu ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_vext2xv_du_bu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_vext2xv_du_bu ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvpermi_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui8*/ _3) \
+  ((__m256i)__builtin_lasx_xvpermi_q ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui8.  */
+/* Data types in instruction templates:  V4DI, V4DI, USI.  */
+#define __lasx_xvpermi_d(/*__m256i*/ _1, /*ui8*/ _2) \
+  ((__m256i)__builtin_lasx_xvpermi_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvperm_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvperm_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, rj, si12.  */
+/* Data types in instruction templates:  V32QI, CVPOINTER, SI.  */
+#define __lasx_xvldrepl_b(/*void **/ _1, /*si12*/ _2) \
+  ((__m256i)__builtin_lasx_xvldrepl_b ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	xd, rj, si11.  */
+/* Data types in instruction templates:  V16HI, CVPOINTER, SI.  */
+#define __lasx_xvldrepl_h(/*void **/ _1, /*si11*/ _2) \
+  ((__m256i)__builtin_lasx_xvldrepl_h ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	xd, rj, si10.  */
+/* Data types in instruction templates:  V8SI, CVPOINTER, SI.  */
+#define __lasx_xvldrepl_w(/*void **/ _1, /*si10*/ _2) \
+  ((__m256i)__builtin_lasx_xvldrepl_w ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	xd, rj, si9.  */
+/* Data types in instruction templates:  V4DI, CVPOINTER, SI.  */
+#define __lasx_xvldrepl_d(/*void **/ _1, /*si9*/ _2) \
+  ((__m256i)__builtin_lasx_xvldrepl_d ((void *)(_1), (_2)))
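+
+/* A minimal usage sketch for the xvldrepl_* macros above; `p' is a
+   hypothetical pointer and the offset must be a constant expression:
+
+     __m256i splat = __lasx_xvldrepl_w (p, 0);   // load one 32-bit value and
+                                                 // replicate it to all 8 elements
+
+   The narrower offset fields for wider elements (si12 down to si9) match
+   the format comments above.  */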
+
+/* Assembly instruction format:	rd, xj, ui3.  */
+/* Data types in instruction templates:  SI, V8SI, UQI.  */
+#define __lasx_xvpickve2gr_w(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((int)__builtin_lasx_xvpickve2gr_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	rd, xj, ui3.  */
+/* Data types in instruction templates:  USI, V8SI, UQI.  */
+#define __lasx_xvpickve2gr_wu(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((unsigned int)__builtin_lasx_xvpickve2gr_wu ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	rd, xj, ui2.  */
+/* Data types in instruction templates:  DI, V4DI, UQI.  */
+#define __lasx_xvpickve2gr_d(/*__m256i*/ _1, /*ui2*/ _2) \
+  ((long int)__builtin_lasx_xvpickve2gr_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	rd, xj, ui2.  */
+/* Data types in instruction templates:  UDI, V4DI, UQI.  */
+#define __lasx_xvpickve2gr_du(/*__m256i*/ _1, /*ui2*/ _2) \
+  ((unsigned long int)__builtin_lasx_xvpickve2gr_du ((v4i64)(_1), (_2)))
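+
+/* A minimal usage sketch for the xvpickve2gr_* macros above; `v' is a
+   hypothetical __m256i and the element index must be a constant expression:
+
+     long int d0 = __lasx_xvpickve2gr_d (v, 0);       // element 0, signed
+     unsigned int w3 = __lasx_xvpickve2gr_wu (v, 3);  // element 3, unsigned
+
+   The index is encoded in the instruction, so it must reach the builtin as
+   a compile-time constant; hence the macro form.  */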
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_q_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_q_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_d_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_d_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_w_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_w_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_h_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_h_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_q_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_q_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_d_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_d_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_w_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_w_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_h_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_h_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwev_q_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwev_q_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwev_d_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwev_d_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwev_w_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwev_w_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwev_h_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwev_h_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwev_q_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwev_q_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwev_d_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwev_d_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwev_w_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwev_w_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwev_h_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwev_h_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_q_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_q_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_d_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_d_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_w_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_w_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_h_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_h_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_q_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_q_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_d_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_d_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_w_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_w_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_h_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_h_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_q_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_q_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_d_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_d_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_w_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_w_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_h_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_h_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_q_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_q_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_d_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_d_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_w_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_w_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_h_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_h_bu ((v32u8)_1, (v32u8)_2);
+}
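+
+/* A minimal sketch of how the widening even/odd additions above pair up;
+   `a' and `b' are hypothetical __m256i vectors holding eight int32_t each:
+
+     __m256i even = __lasx_xvaddwev_d_w (a, b);  // a[0]+b[0], a[2]+b[2], ...
+     __m256i odd  = __lasx_xvaddwod_d_w (a, b);  // a[1]+b[1], a[3]+b[3], ...
+
+   Each sum is produced as a full 64-bit result, so the two calls together
+   cover all eight element pairs without overflow.  */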
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwod_q_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwod_q_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwod_d_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwod_d_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwod_w_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwod_w_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwod_h_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwod_h_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwod_q_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwod_q_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwod_d_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwod_d_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwod_w_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwod_w_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsubwod_h_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsubwod_h_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_q_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_q_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_d_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_d_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_w_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_w_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_h_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_h_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_q_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_q_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_d_wu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_d_wu ((v8u32)_1, (v8u32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_w_hu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_w_hu ((v16u16)_1, (v16u16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_h_bu (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_h_bu ((v32u8)_1, (v32u8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_d_wu_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_d_wu_w ((v8u32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_w_hu_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_w_hu_h ((v16u16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_h_bu_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_h_bu_b ((v32u8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_d_wu_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_d_wu_w ((v8u32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_w_hu_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_w_hu_h ((v16u16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_h_bu_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_h_bu_b ((v32u8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_d_wu_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_d_wu_w ((v8u32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_w_hu_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_w_hu_h ((v16u16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_h_bu_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_h_bu_b ((v32u8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_d_wu_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_d_wu_w ((v8u32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, UV16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_w_hu_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_w_hu_h ((v16u16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, UV32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_h_bu_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_h_bu_b ((v32u8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhaddw_q_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhaddw_q_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhaddw_qu_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhaddw_qu_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhsubw_q_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhsubw_q_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvhsubw_qu_du (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvhsubw_qu_du ((v4u64)_1, (v4u64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_q_d (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_q_d ((v4i64)_1, (v4i64)_2, (v4i64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_d_w (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_d_w ((v4i64)_1, (v8i32)_2, (v8i32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_w_h (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_w_h ((v8i32)_1, (v16i16)_2, (v16i16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_h_b (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_h_b ((v16i16)_1, (v32i8)_2, (v32i8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_q_du (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_q_du ((v4u64)_1, (v4u64)_2, (v4u64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_d_wu (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_d_wu ((v4u64)_1, (v8u32)_2, (v8u32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_w_hu (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_w_hu ((v8u32)_1, (v16u16)_2, (v16u16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_h_bu (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_h_bu ((v16u16)_1, (v32u8)_2, (v32u8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_q_d (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_q_d ((v4i64)_1, (v4i64)_2, (v4i64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_d_w (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_d_w ((v4i64)_1, (v8i32)_2, (v8i32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_w_h (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_w_h ((v8i32)_1, (v16i16)_2, (v16i16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_h_b (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_h_b ((v16i16)_1, (v32i8)_2, (v32i8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_q_du (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_q_du ((v4u64)_1, (v4u64)_2, (v4u64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, UV8SI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_d_wu (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_d_wu ((v4u64)_1, (v8u32)_2, (v8u32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, UV16HI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_w_hu (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_w_hu ((v8u32)_1, (v16u16)_2, (v16u16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, UV32QI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_h_bu (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_h_bu ((v16u16)_1, (v32u8)_2, (v32u8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, UV4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_q_du_d (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_q_du_d ((v4i64)_1, (v4u64)_2, (v4i64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, UV8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_d_wu_w (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_d_wu_w ((v4i64)_1, (v8u32)_2, (v8i32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, UV16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_w_hu_h (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_w_hu_h ((v8i32)_1, (v16u16)_2, (v16i16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, UV32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwev_h_bu_b (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwev_h_bu_b ((v16i16)_1, (v32u8)_2, (v32i8)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, UV4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_q_du_d (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_q_du_d ((v4i64)_1, (v4u64)_2, (v4i64)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, UV8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_d_wu_w (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_d_wu_w ((v4i64)_1, (v8u32)_2, (v8i32)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, UV16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_w_hu_h (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_w_hu_h ((v8i32)_1, (v16u16)_2, (v16i16)_3);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, UV32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmaddwod_h_bu_b (__m256i _1, __m256i _2, __m256i _3)
+{
+  return (__m256i)__builtin_lasx_xvmaddwod_h_bu_b ((v16i16)_1, (v32u8)_2, (v32i8)_3);
+}
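+
+/* A minimal sketch of the widening multiply-accumulate forms above; `acc',
+   `a' and `b' are hypothetical __m256i values:
+
+     acc = __lasx_xvmaddwev_w_h (acc, a, b);  // acc[i] += (int32_t) a[2i] * (int32_t) b[2i]
+     acc = __lasx_xvmaddwod_w_h (acc, a, b);  // acc[i] += (int32_t) a[2i+1] * (int32_t) b[2i+1]
+
+   i.e. a 16-bit dot-product step accumulated into 32-bit elements.  */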
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvrotr_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvrotr_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvrotr_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvrotr_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvrotr_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvrotr_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvrotr_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvrotr_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvadd_q (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvadd_q ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsub_q (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsub_q ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwev_q_du_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwev_q_du_d ((v4u64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvaddwod_q_du_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvaddwod_q_du_d ((v4u64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwev_q_du_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwev_q_du_d ((v4u64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, UV4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmulwod_q_du_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvmulwod_q_du_d ((v4u64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmskgez_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvmskgez_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvmsknz_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvmsknz_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V16HI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvexth_h_b (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvexth_h_b ((v32i8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V8SI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvexth_w_h (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvexth_w_h ((v16i16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvexth_d_w (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvexth_d_w ((v8i32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvexth_q_d (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvexth_q_d ((v4i64)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV16HI, UV32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvexth_hu_bu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvexth_hu_bu ((v32u8)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV8SI, UV16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvexth_wu_hu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvexth_wu_hu ((v16u16)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV4DI, UV8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvexth_du_wu (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvexth_du_wu ((v8u32)_1);
+}
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  UV4DI, UV4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvexth_qu_du (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvexth_qu_du ((v4u64)_1);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvrotri_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvrotri_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvrotri_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvrotri_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvrotri_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvrotri_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvrotri_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvrotri_d ((v4i64)(_1), (_2)))
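+
+/* A minimal usage sketch for the rotate intrinsics above; `v' is a
+   hypothetical __m256i and the immediate must be a constant expression:
+
+     __m256i r = __lasx_xvrotri_w (v, 8);   // rotate each 32-bit element
+                                            // right by 8 bits
+
+   The variable-amount forms (__lasx_xvrotr_*) take the per-element rotate
+   counts from a second vector instead of an immediate.  */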
+
+/* Assembly instruction format:	xd, xj.  */
+/* Data types in instruction templates:  V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvextl_q_d (__m256i _1)
+{
+  return (__m256i)__builtin_lasx_xvextl_q_d ((v4i64)_1);
+}
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvsrlni_b_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrlni_b_h ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvsrlni_h_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrlni_h_w ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvsrlni_w_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrlni_w_d ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvsrlni_d_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrlni_d_q ((v4i64)(_1), (v4i64)(_2), (_3)))
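+
+/* A minimal sketch for the shift-right-and-narrow-insert family that starts
+   above; `a' and `b' are hypothetical __m256i vectors of 16-bit elements:
+
+     __m256i packed = __lasx_xvsrlni_b_h (a, b, 8);  // shift every halfword of
+                                                     // a and b right by 8 and
+                                                     // keep the low byte
+
+   Both arguments contribute narrowed halves of the 256-bit result; in the
+   underlying instruction the destination register doubles as one of the
+   sources.  */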
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvsrlrni_b_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrlrni_b_h ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvsrlrni_h_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrlrni_h_w ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvsrlrni_w_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrlrni_w_d ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvsrlrni_d_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrlrni_d_q ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvssrlni_b_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlni_b_h ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvssrlni_h_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlni_h_w ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvssrlni_w_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlni_w_d ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvssrlni_d_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlni_d_q ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, V32QI, USI.  */
+#define __lasx_xvssrlni_bu_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlni_bu_h ((v32u8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, V16HI, USI.  */
+#define __lasx_xvssrlni_hu_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlni_hu_w ((v16u16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, V8SI, USI.  */
+#define __lasx_xvssrlni_wu_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlni_wu_d ((v8u32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, V4DI, USI.  */
+#define __lasx_xvssrlni_du_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlni_du_q ((v4u64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvssrlrni_b_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlrni_b_h ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvssrlrni_h_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlrni_h_w ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvssrlrni_w_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlrni_w_d ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvssrlrni_d_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlrni_d_q ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, V32QI, USI.  */
+#define __lasx_xvssrlrni_bu_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlrni_bu_h ((v32u8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, V16HI, USI.  */
+#define __lasx_xvssrlrni_hu_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlrni_hu_w ((v16u16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, V8SI, USI.  */
+#define __lasx_xvssrlrni_wu_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlrni_wu_d ((v8u32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, V4DI, USI.  */
+#define __lasx_xvssrlrni_du_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrlrni_du_q ((v4u64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvsrani_b_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrani_b_h ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvsrani_h_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrani_h_w ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvsrani_w_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrani_w_d ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvsrani_d_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrani_d_q ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvsrarni_b_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrarni_b_h ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvsrarni_h_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrarni_h_w ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvsrarni_w_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrarni_w_d ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvsrarni_d_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvsrarni_d_q ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvssrani_b_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrani_b_h ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvssrani_h_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrani_h_w ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvssrani_w_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrani_w_d ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvssrani_d_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrani_d_q ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, V32QI, USI.  */
+#define __lasx_xvssrani_bu_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrani_bu_h ((v32u8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, V16HI, USI.  */
+#define __lasx_xvssrani_hu_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrani_hu_w ((v16u16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, V8SI, USI.  */
+#define __lasx_xvssrani_wu_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrani_wu_d ((v8u32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, V4DI, USI.  */
+#define __lasx_xvssrani_du_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrani_du_q ((v4u64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI, USI.  */
+#define __lasx_xvssrarni_b_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrarni_b_h ((v32i8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI, USI.  */
+#define __lasx_xvssrarni_h_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrarni_h_w ((v16i16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI, USI.  */
+#define __lasx_xvssrarni_w_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrarni_w_d ((v8i32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI, USI.  */
+#define __lasx_xvssrarni_d_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrarni_d_q ((v4i64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  UV32QI, UV32QI, V32QI, USI.  */
+#define __lasx_xvssrarni_bu_h(/*__m256i*/ _1, /*__m256i*/ _2, /*ui4*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrarni_bu_h ((v32u8)(_1), (v32i8)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  UV16HI, UV16HI, V16HI, USI.  */
+#define __lasx_xvssrarni_hu_w(/*__m256i*/ _1, /*__m256i*/ _2, /*ui5*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrarni_hu_w ((v16u16)(_1), (v16i16)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  UV8SI, UV8SI, V8SI, USI.  */
+#define __lasx_xvssrarni_wu_d(/*__m256i*/ _1, /*__m256i*/ _2, /*ui6*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrarni_wu_d ((v8u32)(_1), (v8i32)(_2), (_3)))
+
+/* Assembly instruction format:	xd, xj, ui7.  */
+/* Data types in instruction templates:  UV4DI, UV4DI, V4DI, USI.  */
+#define __lasx_xvssrarni_du_q(/*__m256i*/ _1, /*__m256i*/ _2, /*ui7*/ _3) \
+  ((__m256i)__builtin_lasx_xvssrarni_du_q ((v4u64)(_1), (v4i64)(_2), (_3)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV32QI.  */
+#define __lasx_xbnz_b(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbnz_b ((v32u8)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV4DI.  */
+#define __lasx_xbnz_d(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbnz_d ((v4u64)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV16HI.  */
+#define __lasx_xbnz_h(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbnz_h ((v16u16)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV32QI.  */
+#define __lasx_xbnz_v(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbnz_v ((v32u8)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV8SI.  */
+#define __lasx_xbnz_w(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbnz_w ((v8u32)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV32QI.  */
+#define __lasx_xbz_b(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbz_b ((v32u8)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV4DI.  */
+#define __lasx_xbz_d(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbz_d ((v4u64)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV16HI.  */
+#define __lasx_xbz_h(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbz_h ((v16u16)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV32QI.  */
+#define __lasx_xbz_v(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbz_v ((v32u8)(_1)))
+
+/* Assembly instruction format:	cd, xj.  */
+/* Data types in instruction templates:  SI, UV8SI.  */
+#define __lasx_xbz_w(/*__m256i*/ _1) \
+  ((int)__builtin_lasx_xbz_w ((v8u32)(_1)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_caf_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_caf_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_caf_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_caf_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_ceq_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_ceq_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_ceq_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_ceq_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cle_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cle_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cle_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cle_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_clt_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_clt_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_clt_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_clt_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cne_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cne_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cne_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cne_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cor_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cor_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cor_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cor_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cueq_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cueq_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cueq_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cueq_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cule_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cule_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cule_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cule_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cult_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cult_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cult_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cult_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cun_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cun_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cune_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cune_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cune_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cune_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_cun_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_cun_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_saf_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_saf_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_saf_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_saf_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_seq_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_seq_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_seq_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_seq_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sle_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sle_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sle_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sle_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_slt_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_slt_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_slt_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_slt_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sne_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sne_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sne_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sne_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sor_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sor_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sor_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sor_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sueq_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sueq_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sueq_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sueq_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sule_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sule_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sule_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sule_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sult_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sult_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sult_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sult_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sun_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sun_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sune_d (__m256d _1, __m256d _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sune_d ((v4f64)_1, (v4f64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sune_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sune_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvfcmp_sun_s (__m256 _1, __m256 _2)
+{
+  return (__m256i)__builtin_lasx_xvfcmp_sun_s ((v8f32)_1, (v8f32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui2.  */
+/* Data types in instruction templates:  V4DF, V4DF, UQI.  */
+#define __lasx_xvpickve_d_f(/*__m256d*/ _1, /*ui2*/ _2) \
+  ((__m256d)__builtin_lasx_xvpickve_d_f ((v4f64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V8SF, V8SF, UQI.  */
+#define __lasx_xvpickve_w_f(/*__m256*/ _1, /*ui3*/ _2) \
+  ((__m256)__builtin_lasx_xvpickve_w_f ((v8f32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, si10.  */
+/* Data types in instruction templates:  V32QI, HI.  */
+#define __lasx_xvrepli_b(/*si10*/ _1) \
+  ((__m256i)__builtin_lasx_xvrepli_b ((_1)))
+
+/* Assembly instruction format:	xd, si10.  */
+/* Data types in instruction templates:  V4DI, HI.  */
+#define __lasx_xvrepli_d(/*si10*/ _1) \
+  ((__m256i)__builtin_lasx_xvrepli_d ((_1)))
+
+/* Assembly instruction format:	xd, si10.  */
+/* Data types in instruction templates:  V16HI, HI.  */
+#define __lasx_xvrepli_h(/*si10*/ _1) \
+  ((__m256i)__builtin_lasx_xvrepli_h ((_1)))
+
+/* Assembly instruction format:	xd, si10.  */
+/* Data types in instruction templates:  V8SI, HI.  */
+#define __lasx_xvrepli_w(/*si10*/ _1) \
+  ((__m256i)__builtin_lasx_xvrepli_w ((_1)))
+
+#endif /* defined(__loongarch_asx).  */
+#endif /* _GCC_LOONGSON_ASXINTRIN_H.  */
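
For reference, a minimal usage sketch of the intrinsics defined above (not part of the patch; it assumes the header is installed as <lasxintrin.h> and the code is compiled with -mlasx so that __loongarch_asx is defined):

#include <lasxintrin.h>

/* Illustrative helper: return nonzero if any lane of A is strictly
   less than the corresponding lane of B.  */
int
any_lane_lt (__m256d a, __m256d b)
{
  /* xvfcmp.clt.d sets each 64-bit lane to all-ones when a < b.  */
  __m256i mask = __lasx_xvfcmp_clt_d (a, b);
  /* xbnz.v tests whether the whole 256-bit vector is nonzero.  */
  return __lasx_xbnz_v (mask);
}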
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index 5958f5b7fbe..064fee7dfa2 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -74,6 +74,13 @@ enum loongarch_builtin_type
   /* The function corresponds to an LSX conditional branch instruction
      combined with a compare instruction.  */
   LARCH_BUILTIN_LSX_TEST_BRANCH,
+
+  /* For generating LoongArch LASX.  */
+  LARCH_BUILTIN_LASX,
+
+  /* The function corresponds to an LASX conditional branch instruction
+     combined with a compare instruction.  */
+  LARCH_BUILTIN_LASX_TEST_BRANCH,
 };
 
 /* Declare an availability predicate for built-in functions that require
@@ -112,6 +119,7 @@ struct loongarch_builtin_description
 
 AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
 AVAIL_ALL (lsx, ISA_HAS_LSX)
+AVAIL_ALL (lasx, ISA_HAS_LASX)
 
 /* Construct a loongarch_builtin_description from the given arguments.
 
@@ -173,6 +181,30 @@ AVAIL_ALL (lsx, ISA_HAS_LSX)
     "__builtin_lsx_" #INSN,  LARCH_BUILTIN_DIRECT_NO_TARGET,		\
     FUNCTION_TYPE, loongarch_builtin_avail_lsx }
 
+/* Define an LASX LARCH_BUILTIN_LASX function __builtin_lasx_<INSN>
+   for instruction CODE_FOR_lasx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LASX_BUILTIN(INSN, FUNCTION_TYPE)				\
+  { CODE_FOR_lasx_ ## INSN,						\
+    "__builtin_lasx_" #INSN,  LARCH_BUILTIN_LASX,			\
+    FUNCTION_TYPE, loongarch_builtin_avail_lasx }
+
+/* Define an LASX LARCH_BUILTIN_DIRECT_NO_TARGET function __builtin_lasx_<INSN>
+   for instruction CODE_FOR_lasx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LASX_NO_TARGET_BUILTIN(INSN, FUNCTION_TYPE)			\
+  { CODE_FOR_lasx_ ## INSN,						\
+    "__builtin_lasx_" #INSN,  LARCH_BUILTIN_DIRECT_NO_TARGET,		\
+    FUNCTION_TYPE, loongarch_builtin_avail_lasx }
+
+/* Define an LASX LARCH_BUILTIN_LASX_TEST_BRANCH function __builtin_lasx_<INSN>
+   for instruction CODE_FOR_lasx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LASX_BUILTIN_TEST_BRANCH(INSN, FUNCTION_TYPE)			\
+  { CODE_FOR_lasx_ ## INSN,						\
+    "__builtin_lasx_" #INSN, LARCH_BUILTIN_LASX_TEST_BRANCH,		\
+    FUNCTION_TYPE, loongarch_builtin_avail_lasx }
+
 /* LoongArch SX define CODE_FOR_lsx_xxx */
 #define CODE_FOR_lsx_vsadd_b CODE_FOR_ssaddv16qi3
 #define CODE_FOR_lsx_vsadd_h CODE_FOR_ssaddv8hi3
@@ -442,6 +474,276 @@ AVAIL_ALL (lsx, ISA_HAS_LSX)
 #define CODE_FOR_lsx_vssrlrn_hu_w CODE_FOR_lsx_vssrlrn_u_hu_w
 #define CODE_FOR_lsx_vssrlrn_wu_d CODE_FOR_lsx_vssrlrn_u_wu_d
 
+/* LoongArch ASX define CODE_FOR_lasx_xxx */
+#define CODE_FOR_lasx_xvsadd_b CODE_FOR_ssaddv32qi3
+#define CODE_FOR_lasx_xvsadd_h CODE_FOR_ssaddv16hi3
+#define CODE_FOR_lasx_xvsadd_w CODE_FOR_ssaddv8si3
+#define CODE_FOR_lasx_xvsadd_d CODE_FOR_ssaddv4di3
+#define CODE_FOR_lasx_xvsadd_bu CODE_FOR_usaddv32qi3
+#define CODE_FOR_lasx_xvsadd_hu CODE_FOR_usaddv16hi3
+#define CODE_FOR_lasx_xvsadd_wu CODE_FOR_usaddv8si3
+#define CODE_FOR_lasx_xvsadd_du CODE_FOR_usaddv4di3
+#define CODE_FOR_lasx_xvadd_b CODE_FOR_addv32qi3
+#define CODE_FOR_lasx_xvadd_h CODE_FOR_addv16hi3
+#define CODE_FOR_lasx_xvadd_w CODE_FOR_addv8si3
+#define CODE_FOR_lasx_xvadd_d CODE_FOR_addv4di3
+#define CODE_FOR_lasx_xvaddi_bu CODE_FOR_addv32qi3
+#define CODE_FOR_lasx_xvaddi_hu CODE_FOR_addv16hi3
+#define CODE_FOR_lasx_xvaddi_wu CODE_FOR_addv8si3
+#define CODE_FOR_lasx_xvaddi_du CODE_FOR_addv4di3
+#define CODE_FOR_lasx_xvand_v CODE_FOR_andv32qi3
+#define CODE_FOR_lasx_xvandi_b CODE_FOR_andv32qi3
+#define CODE_FOR_lasx_xvbitsel_v CODE_FOR_lasx_xvbitsel_b
+#define CODE_FOR_lasx_xvseqi_b CODE_FOR_lasx_xvseq_b
+#define CODE_FOR_lasx_xvseqi_h CODE_FOR_lasx_xvseq_h
+#define CODE_FOR_lasx_xvseqi_w CODE_FOR_lasx_xvseq_w
+#define CODE_FOR_lasx_xvseqi_d CODE_FOR_lasx_xvseq_d
+#define CODE_FOR_lasx_xvslti_b CODE_FOR_lasx_xvslt_b
+#define CODE_FOR_lasx_xvslti_h CODE_FOR_lasx_xvslt_h
+#define CODE_FOR_lasx_xvslti_w CODE_FOR_lasx_xvslt_w
+#define CODE_FOR_lasx_xvslti_d CODE_FOR_lasx_xvslt_d
+#define CODE_FOR_lasx_xvslti_bu CODE_FOR_lasx_xvslt_bu
+#define CODE_FOR_lasx_xvslti_hu CODE_FOR_lasx_xvslt_hu
+#define CODE_FOR_lasx_xvslti_wu CODE_FOR_lasx_xvslt_wu
+#define CODE_FOR_lasx_xvslti_du CODE_FOR_lasx_xvslt_du
+#define CODE_FOR_lasx_xvslei_b CODE_FOR_lasx_xvsle_b
+#define CODE_FOR_lasx_xvslei_h CODE_FOR_lasx_xvsle_h
+#define CODE_FOR_lasx_xvslei_w CODE_FOR_lasx_xvsle_w
+#define CODE_FOR_lasx_xvslei_d CODE_FOR_lasx_xvsle_d
+#define CODE_FOR_lasx_xvslei_bu CODE_FOR_lasx_xvsle_bu
+#define CODE_FOR_lasx_xvslei_hu CODE_FOR_lasx_xvsle_hu
+#define CODE_FOR_lasx_xvslei_wu CODE_FOR_lasx_xvsle_wu
+#define CODE_FOR_lasx_xvslei_du CODE_FOR_lasx_xvsle_du
+#define CODE_FOR_lasx_xvdiv_b CODE_FOR_divv32qi3
+#define CODE_FOR_lasx_xvdiv_h CODE_FOR_divv16hi3
+#define CODE_FOR_lasx_xvdiv_w CODE_FOR_divv8si3
+#define CODE_FOR_lasx_xvdiv_d CODE_FOR_divv4di3
+#define CODE_FOR_lasx_xvdiv_bu CODE_FOR_udivv32qi3
+#define CODE_FOR_lasx_xvdiv_hu CODE_FOR_udivv16hi3
+#define CODE_FOR_lasx_xvdiv_wu CODE_FOR_udivv8si3
+#define CODE_FOR_lasx_xvdiv_du CODE_FOR_udivv4di3
+#define CODE_FOR_lasx_xvfadd_s CODE_FOR_addv8sf3
+#define CODE_FOR_lasx_xvfadd_d CODE_FOR_addv4df3
+#define CODE_FOR_lasx_xvftintrz_w_s CODE_FOR_fix_truncv8sfv8si2
+#define CODE_FOR_lasx_xvftintrz_l_d CODE_FOR_fix_truncv4dfv4di2
+#define CODE_FOR_lasx_xvftintrz_wu_s CODE_FOR_fixuns_truncv8sfv8si2
+#define CODE_FOR_lasx_xvftintrz_lu_d CODE_FOR_fixuns_truncv4dfv4di2
+#define CODE_FOR_lasx_xvffint_s_w CODE_FOR_floatv8siv8sf2
+#define CODE_FOR_lasx_xvffint_d_l CODE_FOR_floatv4div4df2
+#define CODE_FOR_lasx_xvffint_s_wu CODE_FOR_floatunsv8siv8sf2
+#define CODE_FOR_lasx_xvffint_d_lu CODE_FOR_floatunsv4div4df2
+#define CODE_FOR_lasx_xvfsub_s CODE_FOR_subv8sf3
+#define CODE_FOR_lasx_xvfsub_d CODE_FOR_subv4df3
+#define CODE_FOR_lasx_xvfmul_s CODE_FOR_mulv8sf3
+#define CODE_FOR_lasx_xvfmul_d CODE_FOR_mulv4df3
+#define CODE_FOR_lasx_xvfdiv_s CODE_FOR_divv8sf3
+#define CODE_FOR_lasx_xvfdiv_d CODE_FOR_divv4df3
+#define CODE_FOR_lasx_xvfmax_s CODE_FOR_smaxv8sf3
+#define CODE_FOR_lasx_xvfmax_d CODE_FOR_smaxv4df3
+#define CODE_FOR_lasx_xvfmin_s CODE_FOR_sminv8sf3
+#define CODE_FOR_lasx_xvfmin_d CODE_FOR_sminv4df3
+#define CODE_FOR_lasx_xvfsqrt_s CODE_FOR_sqrtv8sf2
+#define CODE_FOR_lasx_xvfsqrt_d CODE_FOR_sqrtv4df2
+#define CODE_FOR_lasx_xvflogb_s CODE_FOR_logbv8sf2
+#define CODE_FOR_lasx_xvflogb_d CODE_FOR_logbv4df2
+#define CODE_FOR_lasx_xvmax_b CODE_FOR_smaxv32qi3
+#define CODE_FOR_lasx_xvmax_h CODE_FOR_smaxv16hi3
+#define CODE_FOR_lasx_xvmax_w CODE_FOR_smaxv8si3
+#define CODE_FOR_lasx_xvmax_d CODE_FOR_smaxv4di3
+#define CODE_FOR_lasx_xvmaxi_b CODE_FOR_smaxv32qi3
+#define CODE_FOR_lasx_xvmaxi_h CODE_FOR_smaxv16hi3
+#define CODE_FOR_lasx_xvmaxi_w CODE_FOR_smaxv8si3
+#define CODE_FOR_lasx_xvmaxi_d CODE_FOR_smaxv4di3
+#define CODE_FOR_lasx_xvmax_bu CODE_FOR_umaxv32qi3
+#define CODE_FOR_lasx_xvmax_hu CODE_FOR_umaxv16hi3
+#define CODE_FOR_lasx_xvmax_wu CODE_FOR_umaxv8si3
+#define CODE_FOR_lasx_xvmax_du CODE_FOR_umaxv4di3
+#define CODE_FOR_lasx_xvmaxi_bu CODE_FOR_umaxv32qi3
+#define CODE_FOR_lasx_xvmaxi_hu CODE_FOR_umaxv16hi3
+#define CODE_FOR_lasx_xvmaxi_wu CODE_FOR_umaxv8si3
+#define CODE_FOR_lasx_xvmaxi_du CODE_FOR_umaxv4di3
+#define CODE_FOR_lasx_xvmin_b CODE_FOR_sminv32qi3
+#define CODE_FOR_lasx_xvmin_h CODE_FOR_sminv16hi3
+#define CODE_FOR_lasx_xvmin_w CODE_FOR_sminv8si3
+#define CODE_FOR_lasx_xvmin_d CODE_FOR_sminv4di3
+#define CODE_FOR_lasx_xvmini_b CODE_FOR_sminv32qi3
+#define CODE_FOR_lasx_xvmini_h CODE_FOR_sminv16hi3
+#define CODE_FOR_lasx_xvmini_w CODE_FOR_sminv8si3
+#define CODE_FOR_lasx_xvmini_d CODE_FOR_sminv4di3
+#define CODE_FOR_lasx_xvmin_bu CODE_FOR_uminv32qi3
+#define CODE_FOR_lasx_xvmin_hu CODE_FOR_uminv16hi3
+#define CODE_FOR_lasx_xvmin_wu CODE_FOR_uminv8si3
+#define CODE_FOR_lasx_xvmin_du CODE_FOR_uminv4di3
+#define CODE_FOR_lasx_xvmini_bu CODE_FOR_uminv32qi3
+#define CODE_FOR_lasx_xvmini_hu CODE_FOR_uminv16hi3
+#define CODE_FOR_lasx_xvmini_wu CODE_FOR_uminv8si3
+#define CODE_FOR_lasx_xvmini_du CODE_FOR_uminv4di3
+#define CODE_FOR_lasx_xvmod_b CODE_FOR_modv32qi3
+#define CODE_FOR_lasx_xvmod_h CODE_FOR_modv16hi3
+#define CODE_FOR_lasx_xvmod_w CODE_FOR_modv8si3
+#define CODE_FOR_lasx_xvmod_d CODE_FOR_modv4di3
+#define CODE_FOR_lasx_xvmod_bu CODE_FOR_umodv32qi3
+#define CODE_FOR_lasx_xvmod_hu CODE_FOR_umodv16hi3
+#define CODE_FOR_lasx_xvmod_wu CODE_FOR_umodv8si3
+#define CODE_FOR_lasx_xvmod_du CODE_FOR_umodv4di3
+#define CODE_FOR_lasx_xvmul_b CODE_FOR_mulv32qi3
+#define CODE_FOR_lasx_xvmul_h CODE_FOR_mulv16hi3
+#define CODE_FOR_lasx_xvmul_w CODE_FOR_mulv8si3
+#define CODE_FOR_lasx_xvmul_d CODE_FOR_mulv4di3
+#define CODE_FOR_lasx_xvclz_b CODE_FOR_clzv32qi2
+#define CODE_FOR_lasx_xvclz_h CODE_FOR_clzv16hi2
+#define CODE_FOR_lasx_xvclz_w CODE_FOR_clzv8si2
+#define CODE_FOR_lasx_xvclz_d CODE_FOR_clzv4di2
+#define CODE_FOR_lasx_xvnor_v CODE_FOR_lasx_xvnor_b
+#define CODE_FOR_lasx_xvor_v CODE_FOR_iorv32qi3
+#define CODE_FOR_lasx_xvori_b CODE_FOR_iorv32qi3
+#define CODE_FOR_lasx_xvnori_b CODE_FOR_lasx_xvnor_b
+#define CODE_FOR_lasx_xvpcnt_b CODE_FOR_popcountv32qi2
+#define CODE_FOR_lasx_xvpcnt_h CODE_FOR_popcountv16hi2
+#define CODE_FOR_lasx_xvpcnt_w CODE_FOR_popcountv8si2
+#define CODE_FOR_lasx_xvpcnt_d CODE_FOR_popcountv4di2
+#define CODE_FOR_lasx_xvxor_v CODE_FOR_xorv32qi3
+#define CODE_FOR_lasx_xvxori_b CODE_FOR_xorv32qi3
+#define CODE_FOR_lasx_xvsll_b CODE_FOR_vashlv32qi3
+#define CODE_FOR_lasx_xvsll_h CODE_FOR_vashlv16hi3
+#define CODE_FOR_lasx_xvsll_w CODE_FOR_vashlv8si3
+#define CODE_FOR_lasx_xvsll_d CODE_FOR_vashlv4di3
+#define CODE_FOR_lasx_xvslli_b CODE_FOR_vashlv32qi3
+#define CODE_FOR_lasx_xvslli_h CODE_FOR_vashlv16hi3
+#define CODE_FOR_lasx_xvslli_w CODE_FOR_vashlv8si3
+#define CODE_FOR_lasx_xvslli_d CODE_FOR_vashlv4di3
+#define CODE_FOR_lasx_xvsra_b CODE_FOR_vashrv32qi3
+#define CODE_FOR_lasx_xvsra_h CODE_FOR_vashrv16hi3
+#define CODE_FOR_lasx_xvsra_w CODE_FOR_vashrv8si3
+#define CODE_FOR_lasx_xvsra_d CODE_FOR_vashrv4di3
+#define CODE_FOR_lasx_xvsrai_b CODE_FOR_vashrv32qi3
+#define CODE_FOR_lasx_xvsrai_h CODE_FOR_vashrv16hi3
+#define CODE_FOR_lasx_xvsrai_w CODE_FOR_vashrv8si3
+#define CODE_FOR_lasx_xvsrai_d CODE_FOR_vashrv4di3
+#define CODE_FOR_lasx_xvsrl_b CODE_FOR_vlshrv32qi3
+#define CODE_FOR_lasx_xvsrl_h CODE_FOR_vlshrv16hi3
+#define CODE_FOR_lasx_xvsrl_w CODE_FOR_vlshrv8si3
+#define CODE_FOR_lasx_xvsrl_d CODE_FOR_vlshrv4di3
+#define CODE_FOR_lasx_xvsrli_b CODE_FOR_vlshrv32qi3
+#define CODE_FOR_lasx_xvsrli_h CODE_FOR_vlshrv16hi3
+#define CODE_FOR_lasx_xvsrli_w CODE_FOR_vlshrv8si3
+#define CODE_FOR_lasx_xvsrli_d CODE_FOR_vlshrv4di3
+#define CODE_FOR_lasx_xvsub_b CODE_FOR_subv32qi3
+#define CODE_FOR_lasx_xvsub_h CODE_FOR_subv16hi3
+#define CODE_FOR_lasx_xvsub_w CODE_FOR_subv8si3
+#define CODE_FOR_lasx_xvsub_d CODE_FOR_subv4di3
+#define CODE_FOR_lasx_xvsubi_bu CODE_FOR_subv32qi3
+#define CODE_FOR_lasx_xvsubi_hu CODE_FOR_subv16hi3
+#define CODE_FOR_lasx_xvsubi_wu CODE_FOR_subv8si3
+#define CODE_FOR_lasx_xvsubi_du CODE_FOR_subv4di3
+#define CODE_FOR_lasx_xvpackod_d CODE_FOR_lasx_xvilvh_d
+#define CODE_FOR_lasx_xvpackev_d CODE_FOR_lasx_xvilvl_d
+#define CODE_FOR_lasx_xvpickod_d CODE_FOR_lasx_xvilvh_d
+#define CODE_FOR_lasx_xvpickev_d CODE_FOR_lasx_xvilvl_d
+#define CODE_FOR_lasx_xvrepli_b CODE_FOR_lasx_xvrepliv32qi
+#define CODE_FOR_lasx_xvrepli_h CODE_FOR_lasx_xvrepliv16hi
+#define CODE_FOR_lasx_xvrepli_w CODE_FOR_lasx_xvrepliv8si
+#define CODE_FOR_lasx_xvrepli_d CODE_FOR_lasx_xvrepliv4di
+
+#define CODE_FOR_lasx_xvandn_v CODE_FOR_xvandnv32qi3
+#define CODE_FOR_lasx_xvorn_v CODE_FOR_xvornv32qi3
+#define CODE_FOR_lasx_xvneg_b CODE_FOR_negv32qi2
+#define CODE_FOR_lasx_xvneg_h CODE_FOR_negv16hi2
+#define CODE_FOR_lasx_xvneg_w CODE_FOR_negv8si2
+#define CODE_FOR_lasx_xvneg_d CODE_FOR_negv4di2
+#define CODE_FOR_lasx_xvbsrl_v CODE_FOR_lasx_xvbsrl_b
+#define CODE_FOR_lasx_xvbsll_v CODE_FOR_lasx_xvbsll_b
+#define CODE_FOR_lasx_xvfmadd_s CODE_FOR_fmav8sf4
+#define CODE_FOR_lasx_xvfmadd_d CODE_FOR_fmav4df4
+#define CODE_FOR_lasx_xvfmsub_s CODE_FOR_fmsv8sf4
+#define CODE_FOR_lasx_xvfmsub_d CODE_FOR_fmsv4df4
+#define CODE_FOR_lasx_xvfnmadd_s CODE_FOR_xvfnmaddv8sf4_nmadd4
+#define CODE_FOR_lasx_xvfnmadd_d CODE_FOR_xvfnmaddv4df4_nmadd4
+#define CODE_FOR_lasx_xvfnmsub_s CODE_FOR_xvfnmsubv8sf4_nmsub4
+#define CODE_FOR_lasx_xvfnmsub_d CODE_FOR_xvfnmsubv4df4_nmsub4
+
+#define CODE_FOR_lasx_xvpermi_q CODE_FOR_lasx_xvpermi_q_v32qi
+#define CODE_FOR_lasx_xvpermi_d CODE_FOR_lasx_xvpermi_d_v4di
+#define CODE_FOR_lasx_xbnz_v CODE_FOR_lasx_xbnz_v_b
+#define CODE_FOR_lasx_xbz_v CODE_FOR_lasx_xbz_v_b
+
+#define CODE_FOR_lasx_xvssub_b CODE_FOR_lasx_xvssub_s_b
+#define CODE_FOR_lasx_xvssub_h CODE_FOR_lasx_xvssub_s_h
+#define CODE_FOR_lasx_xvssub_w CODE_FOR_lasx_xvssub_s_w
+#define CODE_FOR_lasx_xvssub_d CODE_FOR_lasx_xvssub_s_d
+#define CODE_FOR_lasx_xvssub_bu CODE_FOR_lasx_xvssub_u_bu
+#define CODE_FOR_lasx_xvssub_hu CODE_FOR_lasx_xvssub_u_hu
+#define CODE_FOR_lasx_xvssub_wu CODE_FOR_lasx_xvssub_u_wu
+#define CODE_FOR_lasx_xvssub_du CODE_FOR_lasx_xvssub_u_du
+#define CODE_FOR_lasx_xvabsd_b CODE_FOR_lasx_xvabsd_s_b
+#define CODE_FOR_lasx_xvabsd_h CODE_FOR_lasx_xvabsd_s_h
+#define CODE_FOR_lasx_xvabsd_w CODE_FOR_lasx_xvabsd_s_w
+#define CODE_FOR_lasx_xvabsd_d CODE_FOR_lasx_xvabsd_s_d
+#define CODE_FOR_lasx_xvabsd_bu CODE_FOR_lasx_xvabsd_u_bu
+#define CODE_FOR_lasx_xvabsd_hu CODE_FOR_lasx_xvabsd_u_hu
+#define CODE_FOR_lasx_xvabsd_wu CODE_FOR_lasx_xvabsd_u_wu
+#define CODE_FOR_lasx_xvabsd_du CODE_FOR_lasx_xvabsd_u_du
+#define CODE_FOR_lasx_xvavg_b CODE_FOR_lasx_xvavg_s_b
+#define CODE_FOR_lasx_xvavg_h CODE_FOR_lasx_xvavg_s_h
+#define CODE_FOR_lasx_xvavg_w CODE_FOR_lasx_xvavg_s_w
+#define CODE_FOR_lasx_xvavg_d CODE_FOR_lasx_xvavg_s_d
+#define CODE_FOR_lasx_xvavg_bu CODE_FOR_lasx_xvavg_u_bu
+#define CODE_FOR_lasx_xvavg_hu CODE_FOR_lasx_xvavg_u_hu
+#define CODE_FOR_lasx_xvavg_wu CODE_FOR_lasx_xvavg_u_wu
+#define CODE_FOR_lasx_xvavg_du CODE_FOR_lasx_xvavg_u_du
+#define CODE_FOR_lasx_xvavgr_b CODE_FOR_lasx_xvavgr_s_b
+#define CODE_FOR_lasx_xvavgr_h CODE_FOR_lasx_xvavgr_s_h
+#define CODE_FOR_lasx_xvavgr_w CODE_FOR_lasx_xvavgr_s_w
+#define CODE_FOR_lasx_xvavgr_d CODE_FOR_lasx_xvavgr_s_d
+#define CODE_FOR_lasx_xvavgr_bu CODE_FOR_lasx_xvavgr_u_bu
+#define CODE_FOR_lasx_xvavgr_hu CODE_FOR_lasx_xvavgr_u_hu
+#define CODE_FOR_lasx_xvavgr_wu CODE_FOR_lasx_xvavgr_u_wu
+#define CODE_FOR_lasx_xvavgr_du CODE_FOR_lasx_xvavgr_u_du
+#define CODE_FOR_lasx_xvmuh_b CODE_FOR_lasx_xvmuh_s_b
+#define CODE_FOR_lasx_xvmuh_h CODE_FOR_lasx_xvmuh_s_h
+#define CODE_FOR_lasx_xvmuh_w CODE_FOR_lasx_xvmuh_s_w
+#define CODE_FOR_lasx_xvmuh_d CODE_FOR_lasx_xvmuh_s_d
+#define CODE_FOR_lasx_xvmuh_bu CODE_FOR_lasx_xvmuh_u_bu
+#define CODE_FOR_lasx_xvmuh_hu CODE_FOR_lasx_xvmuh_u_hu
+#define CODE_FOR_lasx_xvmuh_wu CODE_FOR_lasx_xvmuh_u_wu
+#define CODE_FOR_lasx_xvmuh_du CODE_FOR_lasx_xvmuh_u_du
+#define CODE_FOR_lasx_xvssran_b_h CODE_FOR_lasx_xvssran_s_b_h
+#define CODE_FOR_lasx_xvssran_h_w CODE_FOR_lasx_xvssran_s_h_w
+#define CODE_FOR_lasx_xvssran_w_d CODE_FOR_lasx_xvssran_s_w_d
+#define CODE_FOR_lasx_xvssran_bu_h CODE_FOR_lasx_xvssran_u_bu_h
+#define CODE_FOR_lasx_xvssran_hu_w CODE_FOR_lasx_xvssran_u_hu_w
+#define CODE_FOR_lasx_xvssran_wu_d CODE_FOR_lasx_xvssran_u_wu_d
+#define CODE_FOR_lasx_xvssrarn_b_h CODE_FOR_lasx_xvssrarn_s_b_h
+#define CODE_FOR_lasx_xvssrarn_h_w CODE_FOR_lasx_xvssrarn_s_h_w
+#define CODE_FOR_lasx_xvssrarn_w_d CODE_FOR_lasx_xvssrarn_s_w_d
+#define CODE_FOR_lasx_xvssrarn_bu_h CODE_FOR_lasx_xvssrarn_u_bu_h
+#define CODE_FOR_lasx_xvssrarn_hu_w CODE_FOR_lasx_xvssrarn_u_hu_w
+#define CODE_FOR_lasx_xvssrarn_wu_d CODE_FOR_lasx_xvssrarn_u_wu_d
+#define CODE_FOR_lasx_xvssrln_bu_h CODE_FOR_lasx_xvssrln_u_bu_h
+#define CODE_FOR_lasx_xvssrln_hu_w CODE_FOR_lasx_xvssrln_u_hu_w
+#define CODE_FOR_lasx_xvssrln_wu_d CODE_FOR_lasx_xvssrln_u_wu_d
+#define CODE_FOR_lasx_xvssrlrn_bu_h CODE_FOR_lasx_xvssrlrn_u_bu_h
+#define CODE_FOR_lasx_xvssrlrn_hu_w CODE_FOR_lasx_xvssrlrn_u_hu_w
+#define CODE_FOR_lasx_xvssrlrn_wu_d CODE_FOR_lasx_xvssrlrn_u_wu_d
+#define CODE_FOR_lasx_xvftint_w_s CODE_FOR_lasx_xvftint_s_w_s
+#define CODE_FOR_lasx_xvftint_l_d CODE_FOR_lasx_xvftint_s_l_d
+#define CODE_FOR_lasx_xvftint_wu_s CODE_FOR_lasx_xvftint_u_wu_s
+#define CODE_FOR_lasx_xvftint_lu_d CODE_FOR_lasx_xvftint_u_lu_d
+#define CODE_FOR_lasx_xvsllwil_h_b CODE_FOR_lasx_xvsllwil_s_h_b
+#define CODE_FOR_lasx_xvsllwil_w_h CODE_FOR_lasx_xvsllwil_s_w_h
+#define CODE_FOR_lasx_xvsllwil_d_w CODE_FOR_lasx_xvsllwil_s_d_w
+#define CODE_FOR_lasx_xvsllwil_hu_bu CODE_FOR_lasx_xvsllwil_u_hu_bu
+#define CODE_FOR_lasx_xvsllwil_wu_hu CODE_FOR_lasx_xvsllwil_u_wu_hu
+#define CODE_FOR_lasx_xvsllwil_du_wu CODE_FOR_lasx_xvsllwil_u_du_wu
+#define CODE_FOR_lasx_xvsat_b CODE_FOR_lasx_xvsat_s_b
+#define CODE_FOR_lasx_xvsat_h CODE_FOR_lasx_xvsat_s_h
+#define CODE_FOR_lasx_xvsat_w CODE_FOR_lasx_xvsat_s_w
+#define CODE_FOR_lasx_xvsat_d CODE_FOR_lasx_xvsat_s_d
+#define CODE_FOR_lasx_xvsat_bu CODE_FOR_lasx_xvsat_u_bu
+#define CODE_FOR_lasx_xvsat_hu CODE_FOR_lasx_xvsat_u_hu
+#define CODE_FOR_lasx_xvsat_wu CODE_FOR_lasx_xvsat_u_wu
+#define CODE_FOR_lasx_xvsat_du CODE_FOR_lasx_xvsat_u_du
+
 static const struct loongarch_builtin_description loongarch_builtins[] = {
 #define LARCH_MOVFCSR2GR 0
   DIRECT_BUILTIN (movfcsr2gr, LARCH_USI_FTYPE_UQI, hard_float),
@@ -1209,7 +1511,761 @@ static const struct loongarch_builtin_description loongarch_builtins[] = {
   LSX_BUILTIN (vshuf_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
   LSX_BUILTIN (vldx, LARCH_V16QI_FTYPE_CVPOINTER_DI),
   LSX_NO_TARGET_BUILTIN (vstx, LARCH_VOID_FTYPE_V16QI_CVPOINTER_DI),
-  LSX_BUILTIN (vextl_qu_du, LARCH_UV2DI_FTYPE_UV2DI)
+  LSX_BUILTIN (vextl_qu_du, LARCH_UV2DI_FTYPE_UV2DI),
+
+  /* Built-in functions for LASX.  */
+  LASX_BUILTIN (xvsll_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsll_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsll_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsll_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvslli_b, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvslli_h, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvslli_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvslli_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvsra_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsra_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsra_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsra_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsrai_b, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvsrai_h, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvsrai_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvsrai_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvsrar_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsrar_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsrar_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsrar_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsrari_b, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvsrari_h, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvsrari_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvsrari_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvsrl_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsrl_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsrl_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsrl_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsrli_b, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvsrli_h, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvsrli_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvsrli_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvsrlr_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsrlr_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsrlr_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsrlr_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsrlri_b, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvsrlri_h, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvsrlri_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvsrlri_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvbitclr_b, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvbitclr_h, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvbitclr_w, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvbitclr_d, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvbitclri_b, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvbitclri_h, LARCH_UV16HI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvbitclri_w, LARCH_UV8SI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvbitclri_d, LARCH_UV4DI_FTYPE_UV4DI_UQI),
+  LASX_BUILTIN (xvbitset_b, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvbitset_h, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvbitset_w, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvbitset_d, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvbitseti_b, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvbitseti_h, LARCH_UV16HI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvbitseti_w, LARCH_UV8SI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvbitseti_d, LARCH_UV4DI_FTYPE_UV4DI_UQI),
+  LASX_BUILTIN (xvbitrev_b, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvbitrev_h, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvbitrev_w, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvbitrev_d, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvbitrevi_b, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvbitrevi_h, LARCH_UV16HI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvbitrevi_w, LARCH_UV8SI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvbitrevi_d, LARCH_UV4DI_FTYPE_UV4DI_UQI),
+  LASX_BUILTIN (xvadd_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvadd_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvadd_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvadd_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvaddi_bu, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvaddi_hu, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvaddi_wu, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvaddi_du, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvsub_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsub_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsub_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsub_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsubi_bu, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvsubi_hu, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvsubi_wu, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvsubi_du, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvmax_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvmax_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvmax_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvmax_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvmaxi_b, LARCH_V32QI_FTYPE_V32QI_QI),
+  LASX_BUILTIN (xvmaxi_h, LARCH_V16HI_FTYPE_V16HI_QI),
+  LASX_BUILTIN (xvmaxi_w, LARCH_V8SI_FTYPE_V8SI_QI),
+  LASX_BUILTIN (xvmaxi_d, LARCH_V4DI_FTYPE_V4DI_QI),
+  LASX_BUILTIN (xvmax_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvmax_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvmax_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmax_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvmaxi_bu, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvmaxi_hu, LARCH_UV16HI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvmaxi_wu, LARCH_UV8SI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvmaxi_du, LARCH_UV4DI_FTYPE_UV4DI_UQI),
+  LASX_BUILTIN (xvmin_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvmin_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvmin_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvmin_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvmini_b, LARCH_V32QI_FTYPE_V32QI_QI),
+  LASX_BUILTIN (xvmini_h, LARCH_V16HI_FTYPE_V16HI_QI),
+  LASX_BUILTIN (xvmini_w, LARCH_V8SI_FTYPE_V8SI_QI),
+  LASX_BUILTIN (xvmini_d, LARCH_V4DI_FTYPE_V4DI_QI),
+  LASX_BUILTIN (xvmin_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvmin_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvmin_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmin_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvmini_bu, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvmini_hu, LARCH_UV16HI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvmini_wu, LARCH_UV8SI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvmini_du, LARCH_UV4DI_FTYPE_UV4DI_UQI),
+  LASX_BUILTIN (xvseq_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvseq_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvseq_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvseq_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvseqi_b, LARCH_V32QI_FTYPE_V32QI_QI),
+  LASX_BUILTIN (xvseqi_h, LARCH_V16HI_FTYPE_V16HI_QI),
+  LASX_BUILTIN (xvseqi_w, LARCH_V8SI_FTYPE_V8SI_QI),
+  LASX_BUILTIN (xvseqi_d, LARCH_V4DI_FTYPE_V4DI_QI),
+  LASX_BUILTIN (xvslt_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvslt_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvslt_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvslt_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvslti_b, LARCH_V32QI_FTYPE_V32QI_QI),
+  LASX_BUILTIN (xvslti_h, LARCH_V16HI_FTYPE_V16HI_QI),
+  LASX_BUILTIN (xvslti_w, LARCH_V8SI_FTYPE_V8SI_QI),
+  LASX_BUILTIN (xvslti_d, LARCH_V4DI_FTYPE_V4DI_QI),
+  LASX_BUILTIN (xvslt_bu, LARCH_V32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvslt_hu, LARCH_V16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvslt_wu, LARCH_V8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvslt_du, LARCH_V4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvslti_bu, LARCH_V32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvslti_hu, LARCH_V16HI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvslti_wu, LARCH_V8SI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvslti_du, LARCH_V4DI_FTYPE_UV4DI_UQI),
+  LASX_BUILTIN (xvsle_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsle_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsle_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsle_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvslei_b, LARCH_V32QI_FTYPE_V32QI_QI),
+  LASX_BUILTIN (xvslei_h, LARCH_V16HI_FTYPE_V16HI_QI),
+  LASX_BUILTIN (xvslei_w, LARCH_V8SI_FTYPE_V8SI_QI),
+  LASX_BUILTIN (xvslei_d, LARCH_V4DI_FTYPE_V4DI_QI),
+  LASX_BUILTIN (xvsle_bu, LARCH_V32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvsle_hu, LARCH_V16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvsle_wu, LARCH_V8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvsle_du, LARCH_V4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvslei_bu, LARCH_V32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvslei_hu, LARCH_V16HI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvslei_wu, LARCH_V8SI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvslei_du, LARCH_V4DI_FTYPE_UV4DI_UQI),
+
+  LASX_BUILTIN (xvsat_b, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvsat_h, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvsat_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvsat_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvsat_bu, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvsat_hu, LARCH_UV16HI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvsat_wu, LARCH_UV8SI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvsat_du, LARCH_UV4DI_FTYPE_UV4DI_UQI),
+
+  LASX_BUILTIN (xvadda_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvadda_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvadda_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvadda_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsadd_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsadd_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsadd_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsadd_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsadd_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvsadd_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvsadd_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvsadd_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+
+  LASX_BUILTIN (xvavg_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvavg_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvavg_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvavg_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvavg_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvavg_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvavg_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvavg_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+
+  LASX_BUILTIN (xvavgr_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvavgr_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvavgr_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvavgr_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvavgr_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvavgr_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvavgr_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvavgr_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+
+  LASX_BUILTIN (xvssub_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvssub_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvssub_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvssub_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvssub_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvssub_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvssub_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvssub_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvabsd_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvabsd_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvabsd_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvabsd_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvabsd_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvabsd_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvabsd_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvabsd_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+
+  LASX_BUILTIN (xvmul_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvmul_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvmul_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvmul_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvmadd_b, LARCH_V32QI_FTYPE_V32QI_V32QI_V32QI),
+  LASX_BUILTIN (xvmadd_h, LARCH_V16HI_FTYPE_V16HI_V16HI_V16HI),
+  LASX_BUILTIN (xvmadd_w, LARCH_V8SI_FTYPE_V8SI_V8SI_V8SI),
+  LASX_BUILTIN (xvmadd_d, LARCH_V4DI_FTYPE_V4DI_V4DI_V4DI),
+  LASX_BUILTIN (xvmsub_b, LARCH_V32QI_FTYPE_V32QI_V32QI_V32QI),
+  LASX_BUILTIN (xvmsub_h, LARCH_V16HI_FTYPE_V16HI_V16HI_V16HI),
+  LASX_BUILTIN (xvmsub_w, LARCH_V8SI_FTYPE_V8SI_V8SI_V8SI),
+  LASX_BUILTIN (xvmsub_d, LARCH_V4DI_FTYPE_V4DI_V4DI_V4DI),
+  LASX_BUILTIN (xvdiv_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvdiv_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvdiv_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvdiv_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvdiv_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvdiv_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvdiv_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvdiv_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvhaddw_h_b, LARCH_V16HI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvhaddw_w_h, LARCH_V8SI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvhaddw_d_w, LARCH_V4DI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvhaddw_hu_bu, LARCH_UV16HI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvhaddw_wu_hu, LARCH_UV8SI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvhaddw_du_wu, LARCH_UV4DI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvhsubw_h_b, LARCH_V16HI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvhsubw_w_h, LARCH_V8SI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvhsubw_d_w, LARCH_V4DI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvhsubw_hu_bu, LARCH_V16HI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvhsubw_wu_hu, LARCH_V8SI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvhsubw_du_wu, LARCH_V4DI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmod_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvmod_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvmod_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvmod_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvmod_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvmod_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvmod_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmod_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+
+  LASX_BUILTIN (xvrepl128vei_b, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvrepl128vei_h, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvrepl128vei_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvrepl128vei_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvpickev_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvpickev_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvpickev_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvpickev_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvpickod_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvpickod_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvpickod_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvpickod_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvilvh_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvilvh_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvilvh_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvilvh_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvilvl_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvilvl_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvilvl_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvilvl_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvpackev_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvpackev_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvpackev_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvpackev_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvpackod_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvpackod_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvpackod_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvpackod_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvshuf_b, LARCH_V32QI_FTYPE_V32QI_V32QI_V32QI),
+  LASX_BUILTIN (xvshuf_h, LARCH_V16HI_FTYPE_V16HI_V16HI_V16HI),
+  LASX_BUILTIN (xvshuf_w, LARCH_V8SI_FTYPE_V8SI_V8SI_V8SI),
+  LASX_BUILTIN (xvshuf_d, LARCH_V4DI_FTYPE_V4DI_V4DI_V4DI),
+  LASX_BUILTIN (xvand_v, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvandi_b, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvor_v, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvori_b, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvnor_v, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvnori_b, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvxor_v, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvxori_b, LARCH_UV32QI_FTYPE_UV32QI_UQI),
+  LASX_BUILTIN (xvbitsel_v, LARCH_UV32QI_FTYPE_UV32QI_UV32QI_UV32QI),
+  LASX_BUILTIN (xvbitseli_b, LARCH_UV32QI_FTYPE_UV32QI_UV32QI_USI),
+
+  LASX_BUILTIN (xvshuf4i_b, LARCH_V32QI_FTYPE_V32QI_USI),
+  LASX_BUILTIN (xvshuf4i_h, LARCH_V16HI_FTYPE_V16HI_USI),
+  LASX_BUILTIN (xvshuf4i_w, LARCH_V8SI_FTYPE_V8SI_USI),
+
+  LASX_BUILTIN (xvreplgr2vr_b, LARCH_V32QI_FTYPE_SI),
+  LASX_BUILTIN (xvreplgr2vr_h, LARCH_V16HI_FTYPE_SI),
+  LASX_BUILTIN (xvreplgr2vr_w, LARCH_V8SI_FTYPE_SI),
+  LASX_BUILTIN (xvreplgr2vr_d, LARCH_V4DI_FTYPE_DI),
+  LASX_BUILTIN (xvpcnt_b, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (xvpcnt_h, LARCH_V16HI_FTYPE_V16HI),
+  LASX_BUILTIN (xvpcnt_w, LARCH_V8SI_FTYPE_V8SI),
+  LASX_BUILTIN (xvpcnt_d, LARCH_V4DI_FTYPE_V4DI),
+  LASX_BUILTIN (xvclo_b, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (xvclo_h, LARCH_V16HI_FTYPE_V16HI),
+  LASX_BUILTIN (xvclo_w, LARCH_V8SI_FTYPE_V8SI),
+  LASX_BUILTIN (xvclo_d, LARCH_V4DI_FTYPE_V4DI),
+  LASX_BUILTIN (xvclz_b, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (xvclz_h, LARCH_V16HI_FTYPE_V16HI),
+  LASX_BUILTIN (xvclz_w, LARCH_V8SI_FTYPE_V8SI),
+  LASX_BUILTIN (xvclz_d, LARCH_V4DI_FTYPE_V4DI),
+
+  LASX_BUILTIN (xvrepli_b, LARCH_V32QI_FTYPE_HI),
+  LASX_BUILTIN (xvrepli_h, LARCH_V16HI_FTYPE_HI),
+  LASX_BUILTIN (xvrepli_w, LARCH_V8SI_FTYPE_HI),
+  LASX_BUILTIN (xvrepli_d, LARCH_V4DI_FTYPE_HI),
+  LASX_BUILTIN (xvfcmp_caf_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_caf_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_cor_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_cor_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_cun_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_cun_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_cune_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_cune_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_cueq_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_cueq_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_ceq_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_ceq_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_cne_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_cne_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_clt_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_clt_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_cult_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_cult_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_cle_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_cle_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_cule_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_cule_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_saf_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_saf_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_sor_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_sor_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_sun_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_sun_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_sune_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_sune_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_sueq_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_sueq_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_seq_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_seq_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_sne_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_sne_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_slt_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_slt_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_sult_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_sult_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_sle_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_sle_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcmp_sule_s, LARCH_V8SI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcmp_sule_d, LARCH_V4DI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfadd_s, LARCH_V8SF_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfadd_d, LARCH_V4DF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfsub_s, LARCH_V8SF_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfsub_d, LARCH_V4DF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfmul_s, LARCH_V8SF_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfmul_d, LARCH_V4DF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfdiv_s, LARCH_V8SF_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfdiv_d, LARCH_V4DF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfcvt_h_s, LARCH_V16HI_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfcvt_s_d, LARCH_V8SF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfmin_s, LARCH_V8SF_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfmin_d, LARCH_V4DF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfmina_s, LARCH_V8SF_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfmina_d, LARCH_V4DF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfmax_s, LARCH_V8SF_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfmax_d, LARCH_V4DF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfmaxa_s, LARCH_V8SF_FTYPE_V8SF_V8SF),
+  LASX_BUILTIN (xvfmaxa_d, LARCH_V4DF_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvfclass_s, LARCH_V8SI_FTYPE_V8SF),
+  LASX_BUILTIN (xvfclass_d, LARCH_V4DI_FTYPE_V4DF),
+  LASX_BUILTIN (xvfsqrt_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfsqrt_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvfrecip_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfrecip_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvfrint_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfrint_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvfrsqrt_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfrsqrt_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvflogb_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvflogb_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvfcvth_s_h, LARCH_V8SF_FTYPE_V16HI),
+  LASX_BUILTIN (xvfcvth_d_s, LARCH_V4DF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfcvtl_s_h, LARCH_V8SF_FTYPE_V16HI),
+  LASX_BUILTIN (xvfcvtl_d_s, LARCH_V4DF_FTYPE_V8SF),
+  LASX_BUILTIN (xvftint_w_s, LARCH_V8SI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftint_l_d, LARCH_V4DI_FTYPE_V4DF),
+  LASX_BUILTIN (xvftint_wu_s, LARCH_UV8SI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftint_lu_d, LARCH_UV4DI_FTYPE_V4DF),
+  LASX_BUILTIN (xvftintrz_w_s, LARCH_V8SI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrz_l_d, LARCH_V4DI_FTYPE_V4DF),
+  LASX_BUILTIN (xvftintrz_wu_s, LARCH_UV8SI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrz_lu_d, LARCH_UV4DI_FTYPE_V4DF),
+  LASX_BUILTIN (xvffint_s_w, LARCH_V8SF_FTYPE_V8SI),
+  LASX_BUILTIN (xvffint_d_l, LARCH_V4DF_FTYPE_V4DI),
+  LASX_BUILTIN (xvffint_s_wu, LARCH_V8SF_FTYPE_UV8SI),
+  LASX_BUILTIN (xvffint_d_lu, LARCH_V4DF_FTYPE_UV4DI),
+
+  LASX_BUILTIN (xvreplve_b, LARCH_V32QI_FTYPE_V32QI_SI),
+  LASX_BUILTIN (xvreplve_h, LARCH_V16HI_FTYPE_V16HI_SI),
+  LASX_BUILTIN (xvreplve_w, LARCH_V8SI_FTYPE_V8SI_SI),
+  LASX_BUILTIN (xvreplve_d, LARCH_V4DI_FTYPE_V4DI_SI),
+  LASX_BUILTIN (xvpermi_w, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+
+  LASX_BUILTIN (xvandn_v, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvneg_b, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (xvneg_h, LARCH_V16HI_FTYPE_V16HI),
+  LASX_BUILTIN (xvneg_w, LARCH_V8SI_FTYPE_V8SI),
+  LASX_BUILTIN (xvneg_d, LARCH_V4DI_FTYPE_V4DI),
+  LASX_BUILTIN (xvmuh_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvmuh_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvmuh_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvmuh_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvmuh_bu, LARCH_UV32QI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvmuh_hu, LARCH_UV16HI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvmuh_wu, LARCH_UV8SI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmuh_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvsllwil_h_b, LARCH_V16HI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvsllwil_w_h, LARCH_V8SI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvsllwil_d_w, LARCH_V4DI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvsllwil_hu_bu, LARCH_UV16HI_FTYPE_UV32QI_UQI), /* FIXME: U? */
+  LASX_BUILTIN (xvsllwil_wu_hu, LARCH_UV8SI_FTYPE_UV16HI_UQI),
+  LASX_BUILTIN (xvsllwil_du_wu, LARCH_UV4DI_FTYPE_UV8SI_UQI),
+  LASX_BUILTIN (xvsran_b_h, LARCH_V32QI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsran_h_w, LARCH_V16HI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsran_w_d, LARCH_V8SI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvssran_b_h, LARCH_V32QI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvssran_h_w, LARCH_V16HI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvssran_w_d, LARCH_V8SI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvssran_bu_h, LARCH_UV32QI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvssran_hu_w, LARCH_UV16HI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvssran_wu_d, LARCH_UV8SI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvsrarn_b_h, LARCH_V32QI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsrarn_h_w, LARCH_V16HI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsrarn_w_d, LARCH_V8SI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvssrarn_b_h, LARCH_V32QI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvssrarn_h_w, LARCH_V16HI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvssrarn_w_d, LARCH_V8SI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvssrarn_bu_h, LARCH_UV32QI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvssrarn_hu_w, LARCH_UV16HI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvssrarn_wu_d, LARCH_UV8SI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvsrln_b_h, LARCH_V32QI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsrln_h_w, LARCH_V16HI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsrln_w_d, LARCH_V8SI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvssrln_bu_h, LARCH_UV32QI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvssrln_hu_w, LARCH_UV16HI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvssrln_wu_d, LARCH_UV8SI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvsrlrn_b_h, LARCH_V32QI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsrlrn_h_w, LARCH_V16HI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsrlrn_w_d, LARCH_V8SI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvssrlrn_bu_h, LARCH_UV32QI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvssrlrn_hu_w, LARCH_UV16HI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvssrlrn_wu_d, LARCH_UV8SI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvfrstpi_b, LARCH_V32QI_FTYPE_V32QI_V32QI_UQI),
+  LASX_BUILTIN (xvfrstpi_h, LARCH_V16HI_FTYPE_V16HI_V16HI_UQI),
+  LASX_BUILTIN (xvfrstp_b, LARCH_V32QI_FTYPE_V32QI_V32QI_V32QI),
+  LASX_BUILTIN (xvfrstp_h, LARCH_V16HI_FTYPE_V16HI_V16HI_V16HI),
+  LASX_BUILTIN (xvshuf4i_d, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvbsrl_v, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvbsll_v, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvextrins_b, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvextrins_h, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvextrins_w, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvextrins_d, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvmskltz_b, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (xvmskltz_h, LARCH_V16HI_FTYPE_V16HI),
+  LASX_BUILTIN (xvmskltz_w, LARCH_V8SI_FTYPE_V8SI),
+  LASX_BUILTIN (xvmskltz_d, LARCH_V4DI_FTYPE_V4DI),
+  LASX_BUILTIN (xvsigncov_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsigncov_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsigncov_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsigncov_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvfmadd_s, LARCH_V8SF_FTYPE_V8SF_V8SF_V8SF),
+  LASX_BUILTIN (xvfmadd_d, LARCH_V4DF_FTYPE_V4DF_V4DF_V4DF),
+  LASX_BUILTIN (xvfmsub_s, LARCH_V8SF_FTYPE_V8SF_V8SF_V8SF),
+  LASX_BUILTIN (xvfmsub_d, LARCH_V4DF_FTYPE_V4DF_V4DF_V4DF),
+  LASX_BUILTIN (xvfnmadd_s, LARCH_V8SF_FTYPE_V8SF_V8SF_V8SF),
+  LASX_BUILTIN (xvfnmadd_d, LARCH_V4DF_FTYPE_V4DF_V4DF_V4DF),
+  LASX_BUILTIN (xvfnmsub_s, LARCH_V8SF_FTYPE_V8SF_V8SF_V8SF),
+  LASX_BUILTIN (xvfnmsub_d, LARCH_V4DF_FTYPE_V4DF_V4DF_V4DF),
+  LASX_BUILTIN (xvftintrne_w_s, LARCH_V8SI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrne_l_d, LARCH_V4DI_FTYPE_V4DF),
+  LASX_BUILTIN (xvftintrp_w_s, LARCH_V8SI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrp_l_d, LARCH_V4DI_FTYPE_V4DF),
+  LASX_BUILTIN (xvftintrm_w_s, LARCH_V8SI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrm_l_d, LARCH_V4DI_FTYPE_V4DF),
+  LASX_BUILTIN (xvftint_w_d, LARCH_V8SI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvffint_s_l, LARCH_V8SF_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvftintrz_w_d, LARCH_V8SI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvftintrp_w_d, LARCH_V8SI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvftintrm_w_d, LARCH_V8SI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvftintrne_w_d, LARCH_V8SI_FTYPE_V4DF_V4DF),
+  LASX_BUILTIN (xvftinth_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintl_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvffinth_d_w, LARCH_V4DF_FTYPE_V8SI),
+  LASX_BUILTIN (xvffintl_d_w, LARCH_V4DF_FTYPE_V8SI),
+  LASX_BUILTIN (xvftintrzh_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrzl_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrph_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrpl_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrmh_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrml_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrneh_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvftintrnel_l_s, LARCH_V4DI_FTYPE_V8SF),
+  LASX_BUILTIN (xvfrintrne_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfrintrne_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvfrintrz_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfrintrz_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvfrintrp_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfrintrp_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvfrintrm_s, LARCH_V8SF_FTYPE_V8SF),
+  LASX_BUILTIN (xvfrintrm_d, LARCH_V4DF_FTYPE_V4DF),
+  LASX_BUILTIN (xvld, LARCH_V32QI_FTYPE_CVPOINTER_SI),
+  LASX_NO_TARGET_BUILTIN (xvst, LARCH_VOID_FTYPE_V32QI_CVPOINTER_SI),
+  LASX_NO_TARGET_BUILTIN (xvstelm_b, LARCH_VOID_FTYPE_V32QI_CVPOINTER_SI_UQI),
+  LASX_NO_TARGET_BUILTIN (xvstelm_h, LARCH_VOID_FTYPE_V16HI_CVPOINTER_SI_UQI),
+  LASX_NO_TARGET_BUILTIN (xvstelm_w, LARCH_VOID_FTYPE_V8SI_CVPOINTER_SI_UQI),
+  LASX_NO_TARGET_BUILTIN (xvstelm_d, LARCH_VOID_FTYPE_V4DI_CVPOINTER_SI_UQI),
+  LASX_BUILTIN (xvinsve0_w, LARCH_V8SI_FTYPE_V8SI_V8SI_UQI),
+  LASX_BUILTIN (xvinsve0_d, LARCH_V4DI_FTYPE_V4DI_V4DI_UQI),
+  LASX_BUILTIN (xvpickve_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvpickve_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvpickve_w_f, LARCH_V8SF_FTYPE_V8SF_UQI),
+  LASX_BUILTIN (xvpickve_d_f, LARCH_V4DF_FTYPE_V4DF_UQI),
+  LASX_BUILTIN (xvssrlrn_b_h, LARCH_V32QI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvssrlrn_h_w, LARCH_V16HI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvssrlrn_w_d, LARCH_V8SI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvssrln_b_h, LARCH_V32QI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvssrln_h_w, LARCH_V16HI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvssrln_w_d, LARCH_V8SI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvorn_v, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvldi, LARCH_V4DI_FTYPE_HI),
+  LASX_BUILTIN (xvldx, LARCH_V32QI_FTYPE_CVPOINTER_DI),
+  LASX_NO_TARGET_BUILTIN (xvstx, LARCH_VOID_FTYPE_V32QI_CVPOINTER_DI),
+  LASX_BUILTIN (xvextl_qu_du, LARCH_UV4DI_FTYPE_UV4DI),
+
+  /* LASX */
+  LASX_BUILTIN (xvinsgr2vr_w, LARCH_V8SI_FTYPE_V8SI_SI_UQI),
+  LASX_BUILTIN (xvinsgr2vr_d, LARCH_V4DI_FTYPE_V4DI_DI_UQI),
+
+  LASX_BUILTIN (xvreplve0_b, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (xvreplve0_h, LARCH_V16HI_FTYPE_V16HI),
+  LASX_BUILTIN (xvreplve0_w, LARCH_V8SI_FTYPE_V8SI),
+  LASX_BUILTIN (xvreplve0_d, LARCH_V4DI_FTYPE_V4DI),
+  LASX_BUILTIN (xvreplve0_q, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (vext2xv_h_b, LARCH_V16HI_FTYPE_V32QI),
+  LASX_BUILTIN (vext2xv_w_h, LARCH_V8SI_FTYPE_V16HI),
+  LASX_BUILTIN (vext2xv_d_w, LARCH_V4DI_FTYPE_V8SI),
+  LASX_BUILTIN (vext2xv_w_b, LARCH_V8SI_FTYPE_V32QI),
+  LASX_BUILTIN (vext2xv_d_h, LARCH_V4DI_FTYPE_V16HI),
+  LASX_BUILTIN (vext2xv_d_b, LARCH_V4DI_FTYPE_V32QI),
+  LASX_BUILTIN (vext2xv_hu_bu, LARCH_V16HI_FTYPE_V32QI),
+  LASX_BUILTIN (vext2xv_wu_hu, LARCH_V8SI_FTYPE_V16HI),
+  LASX_BUILTIN (vext2xv_du_wu, LARCH_V4DI_FTYPE_V8SI),
+  LASX_BUILTIN (vext2xv_wu_bu, LARCH_V8SI_FTYPE_V32QI),
+  LASX_BUILTIN (vext2xv_du_hu, LARCH_V4DI_FTYPE_V16HI),
+  LASX_BUILTIN (vext2xv_du_bu, LARCH_V4DI_FTYPE_V32QI),
+  LASX_BUILTIN (xvpermi_q, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvpermi_d, LARCH_V4DI_FTYPE_V4DI_USI),
+  LASX_BUILTIN (xvperm_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN_TEST_BRANCH (xbz_b, LARCH_SI_FTYPE_UV32QI),
+  LASX_BUILTIN_TEST_BRANCH (xbz_h, LARCH_SI_FTYPE_UV16HI),
+  LASX_BUILTIN_TEST_BRANCH (xbz_w, LARCH_SI_FTYPE_UV8SI),
+  LASX_BUILTIN_TEST_BRANCH (xbz_d, LARCH_SI_FTYPE_UV4DI),
+  LASX_BUILTIN_TEST_BRANCH (xbnz_b, LARCH_SI_FTYPE_UV32QI),
+  LASX_BUILTIN_TEST_BRANCH (xbnz_h, LARCH_SI_FTYPE_UV16HI),
+  LASX_BUILTIN_TEST_BRANCH (xbnz_w, LARCH_SI_FTYPE_UV8SI),
+  LASX_BUILTIN_TEST_BRANCH (xbnz_d, LARCH_SI_FTYPE_UV4DI),
+  LASX_BUILTIN_TEST_BRANCH (xbz_v, LARCH_SI_FTYPE_UV32QI),
+  LASX_BUILTIN_TEST_BRANCH (xbnz_v, LARCH_SI_FTYPE_UV32QI),
+  LASX_BUILTIN (xvldrepl_b, LARCH_V32QI_FTYPE_CVPOINTER_SI),
+  LASX_BUILTIN (xvldrepl_h, LARCH_V16HI_FTYPE_CVPOINTER_SI),
+  LASX_BUILTIN (xvldrepl_w, LARCH_V8SI_FTYPE_CVPOINTER_SI),
+  LASX_BUILTIN (xvldrepl_d, LARCH_V4DI_FTYPE_CVPOINTER_SI),
+  LASX_BUILTIN (xvpickve2gr_w, LARCH_SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvpickve2gr_wu, LARCH_USI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvpickve2gr_d, LARCH_DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvpickve2gr_du, LARCH_UDI_FTYPE_V4DI_UQI),
+
+  LASX_BUILTIN (xvaddwev_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvaddwev_d_w, LARCH_V4DI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvaddwev_w_h, LARCH_V8SI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvaddwev_h_b, LARCH_V16HI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvaddwev_q_du, LARCH_V4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvaddwev_d_wu, LARCH_V4DI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvaddwev_w_hu, LARCH_V8SI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvaddwev_h_bu, LARCH_V16HI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvsubwev_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsubwev_d_w, LARCH_V4DI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsubwev_w_h, LARCH_V8SI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsubwev_h_b, LARCH_V16HI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsubwev_q_du, LARCH_V4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvsubwev_d_wu, LARCH_V4DI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvsubwev_w_hu, LARCH_V8SI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvsubwev_h_bu, LARCH_V16HI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvmulwev_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvmulwev_d_w, LARCH_V4DI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvmulwev_w_h, LARCH_V8SI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvmulwev_h_b, LARCH_V16HI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvmulwev_q_du, LARCH_V4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvmulwev_d_wu, LARCH_V4DI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmulwev_w_hu, LARCH_V8SI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvmulwev_h_bu, LARCH_V16HI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvaddwod_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvaddwod_d_w, LARCH_V4DI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvaddwod_w_h, LARCH_V8SI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvaddwod_h_b, LARCH_V16HI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvaddwod_q_du, LARCH_V4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvaddwod_d_wu, LARCH_V4DI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvaddwod_w_hu, LARCH_V8SI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvaddwod_h_bu, LARCH_V16HI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvsubwod_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsubwod_d_w, LARCH_V4DI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvsubwod_w_h, LARCH_V8SI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvsubwod_h_b, LARCH_V16HI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvsubwod_q_du, LARCH_V4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvsubwod_d_wu, LARCH_V4DI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvsubwod_w_hu, LARCH_V8SI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvsubwod_h_bu, LARCH_V16HI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvmulwod_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvmulwod_d_w, LARCH_V4DI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvmulwod_w_h, LARCH_V8SI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvmulwod_h_b, LARCH_V16HI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvmulwod_q_du, LARCH_V4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvmulwod_d_wu, LARCH_V4DI_FTYPE_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmulwod_w_hu, LARCH_V8SI_FTYPE_UV16HI_UV16HI),
+  LASX_BUILTIN (xvmulwod_h_bu, LARCH_V16HI_FTYPE_UV32QI_UV32QI),
+  LASX_BUILTIN (xvaddwev_d_wu_w, LARCH_V4DI_FTYPE_UV8SI_V8SI),
+  LASX_BUILTIN (xvaddwev_w_hu_h, LARCH_V8SI_FTYPE_UV16HI_V16HI),
+  LASX_BUILTIN (xvaddwev_h_bu_b, LARCH_V16HI_FTYPE_UV32QI_V32QI),
+  LASX_BUILTIN (xvmulwev_d_wu_w, LARCH_V4DI_FTYPE_UV8SI_V8SI),
+  LASX_BUILTIN (xvmulwev_w_hu_h, LARCH_V8SI_FTYPE_UV16HI_V16HI),
+  LASX_BUILTIN (xvmulwev_h_bu_b, LARCH_V16HI_FTYPE_UV32QI_V32QI),
+  LASX_BUILTIN (xvaddwod_d_wu_w, LARCH_V4DI_FTYPE_UV8SI_V8SI),
+  LASX_BUILTIN (xvaddwod_w_hu_h, LARCH_V8SI_FTYPE_UV16HI_V16HI),
+  LASX_BUILTIN (xvaddwod_h_bu_b, LARCH_V16HI_FTYPE_UV32QI_V32QI),
+  LASX_BUILTIN (xvmulwod_d_wu_w, LARCH_V4DI_FTYPE_UV8SI_V8SI),
+  LASX_BUILTIN (xvmulwod_w_hu_h, LARCH_V8SI_FTYPE_UV16HI_V16HI),
+  LASX_BUILTIN (xvmulwod_h_bu_b, LARCH_V16HI_FTYPE_UV32QI_V32QI),
+  LASX_BUILTIN (xvhaddw_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvhaddw_qu_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvhsubw_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvhsubw_qu_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI),
+  LASX_BUILTIN (xvmaddwev_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI_V4DI),
+  LASX_BUILTIN (xvmaddwev_d_w, LARCH_V4DI_FTYPE_V4DI_V8SI_V8SI),
+  LASX_BUILTIN (xvmaddwev_w_h, LARCH_V8SI_FTYPE_V8SI_V16HI_V16HI),
+  LASX_BUILTIN (xvmaddwev_h_b, LARCH_V16HI_FTYPE_V16HI_V32QI_V32QI),
+  LASX_BUILTIN (xvmaddwev_q_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI_UV4DI),
+  LASX_BUILTIN (xvmaddwev_d_wu, LARCH_UV4DI_FTYPE_UV4DI_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmaddwev_w_hu, LARCH_UV8SI_FTYPE_UV8SI_UV16HI_UV16HI),
+  LASX_BUILTIN (xvmaddwev_h_bu, LARCH_UV16HI_FTYPE_UV16HI_UV32QI_UV32QI),
+  LASX_BUILTIN (xvmaddwod_q_d, LARCH_V4DI_FTYPE_V4DI_V4DI_V4DI),
+  LASX_BUILTIN (xvmaddwod_d_w, LARCH_V4DI_FTYPE_V4DI_V8SI_V8SI),
+  LASX_BUILTIN (xvmaddwod_w_h, LARCH_V8SI_FTYPE_V8SI_V16HI_V16HI),
+  LASX_BUILTIN (xvmaddwod_h_b, LARCH_V16HI_FTYPE_V16HI_V32QI_V32QI),
+  LASX_BUILTIN (xvmaddwod_q_du, LARCH_UV4DI_FTYPE_UV4DI_UV4DI_UV4DI),
+  LASX_BUILTIN (xvmaddwod_d_wu, LARCH_UV4DI_FTYPE_UV4DI_UV8SI_UV8SI),
+  LASX_BUILTIN (xvmaddwod_w_hu, LARCH_UV8SI_FTYPE_UV8SI_UV16HI_UV16HI),
+  LASX_BUILTIN (xvmaddwod_h_bu, LARCH_UV16HI_FTYPE_UV16HI_UV32QI_UV32QI),
+  LASX_BUILTIN (xvmaddwev_q_du_d, LARCH_V4DI_FTYPE_V4DI_UV4DI_V4DI),
+  LASX_BUILTIN (xvmaddwev_d_wu_w, LARCH_V4DI_FTYPE_V4DI_UV8SI_V8SI),
+  LASX_BUILTIN (xvmaddwev_w_hu_h, LARCH_V8SI_FTYPE_V8SI_UV16HI_V16HI),
+  LASX_BUILTIN (xvmaddwev_h_bu_b, LARCH_V16HI_FTYPE_V16HI_UV32QI_V32QI),
+  LASX_BUILTIN (xvmaddwod_q_du_d, LARCH_V4DI_FTYPE_V4DI_UV4DI_V4DI),
+  LASX_BUILTIN (xvmaddwod_d_wu_w, LARCH_V4DI_FTYPE_V4DI_UV8SI_V8SI),
+  LASX_BUILTIN (xvmaddwod_w_hu_h, LARCH_V8SI_FTYPE_V8SI_UV16HI_V16HI),
+  LASX_BUILTIN (xvmaddwod_h_bu_b, LARCH_V16HI_FTYPE_V16HI_UV32QI_V32QI),
+  LASX_BUILTIN (xvrotr_b, LARCH_V32QI_FTYPE_V32QI_V32QI),
+  LASX_BUILTIN (xvrotr_h, LARCH_V16HI_FTYPE_V16HI_V16HI),
+  LASX_BUILTIN (xvrotr_w, LARCH_V8SI_FTYPE_V8SI_V8SI),
+  LASX_BUILTIN (xvrotr_d, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvadd_q, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvsub_q, LARCH_V4DI_FTYPE_V4DI_V4DI),
+  LASX_BUILTIN (xvaddwev_q_du_d, LARCH_V4DI_FTYPE_UV4DI_V4DI),
+  LASX_BUILTIN (xvaddwod_q_du_d, LARCH_V4DI_FTYPE_UV4DI_V4DI),
+  LASX_BUILTIN (xvmulwev_q_du_d, LARCH_V4DI_FTYPE_UV4DI_V4DI),
+  LASX_BUILTIN (xvmulwod_q_du_d, LARCH_V4DI_FTYPE_UV4DI_V4DI),
+  LASX_BUILTIN (xvmskgez_b, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (xvmsknz_b, LARCH_V32QI_FTYPE_V32QI),
+  LASX_BUILTIN (xvexth_h_b, LARCH_V16HI_FTYPE_V32QI),
+  LASX_BUILTIN (xvexth_w_h, LARCH_V8SI_FTYPE_V16HI),
+  LASX_BUILTIN (xvexth_d_w, LARCH_V4DI_FTYPE_V8SI),
+  LASX_BUILTIN (xvexth_q_d, LARCH_V4DI_FTYPE_V4DI),
+  LASX_BUILTIN (xvexth_hu_bu, LARCH_UV16HI_FTYPE_UV32QI),
+  LASX_BUILTIN (xvexth_wu_hu, LARCH_UV8SI_FTYPE_UV16HI),
+  LASX_BUILTIN (xvexth_du_wu, LARCH_UV4DI_FTYPE_UV8SI),
+  LASX_BUILTIN (xvexth_qu_du, LARCH_UV4DI_FTYPE_UV4DI),
+  LASX_BUILTIN (xvrotri_b, LARCH_V32QI_FTYPE_V32QI_UQI),
+  LASX_BUILTIN (xvrotri_h, LARCH_V16HI_FTYPE_V16HI_UQI),
+  LASX_BUILTIN (xvrotri_w, LARCH_V8SI_FTYPE_V8SI_UQI),
+  LASX_BUILTIN (xvrotri_d, LARCH_V4DI_FTYPE_V4DI_UQI),
+  LASX_BUILTIN (xvextl_q_d, LARCH_V4DI_FTYPE_V4DI),
+  LASX_BUILTIN (xvsrlni_b_h, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvsrlni_h_w, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvsrlni_w_d, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvsrlni_d_q, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvsrlrni_b_h, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvsrlrni_h_w, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvsrlrni_w_d, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvsrlrni_d_q, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvssrlni_b_h, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvssrlni_h_w, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvssrlni_w_d, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvssrlni_d_q, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvssrlni_bu_h, LARCH_UV32QI_FTYPE_UV32QI_V32QI_USI),
+  LASX_BUILTIN (xvssrlni_hu_w, LARCH_UV16HI_FTYPE_UV16HI_V16HI_USI),
+  LASX_BUILTIN (xvssrlni_wu_d, LARCH_UV8SI_FTYPE_UV8SI_V8SI_USI),
+  LASX_BUILTIN (xvssrlni_du_q, LARCH_UV4DI_FTYPE_UV4DI_V4DI_USI),
+  LASX_BUILTIN (xvssrlrni_b_h, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvssrlrni_h_w, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvssrlrni_w_d, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvssrlrni_d_q, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvssrlrni_bu_h, LARCH_UV32QI_FTYPE_UV32QI_V32QI_USI),
+  LASX_BUILTIN (xvssrlrni_hu_w, LARCH_UV16HI_FTYPE_UV16HI_V16HI_USI),
+  LASX_BUILTIN (xvssrlrni_wu_d, LARCH_UV8SI_FTYPE_UV8SI_V8SI_USI),
+  LASX_BUILTIN (xvssrlrni_du_q, LARCH_UV4DI_FTYPE_UV4DI_V4DI_USI),
+  LASX_BUILTIN (xvsrani_b_h, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvsrani_h_w, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvsrani_w_d, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvsrani_d_q, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvsrarni_b_h, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvsrarni_h_w, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvsrarni_w_d, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvsrarni_d_q, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvssrani_b_h, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvssrani_h_w, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvssrani_w_d, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvssrani_d_q, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvssrani_bu_h, LARCH_UV32QI_FTYPE_UV32QI_V32QI_USI),
+  LASX_BUILTIN (xvssrani_hu_w, LARCH_UV16HI_FTYPE_UV16HI_V16HI_USI),
+  LASX_BUILTIN (xvssrani_wu_d, LARCH_UV8SI_FTYPE_UV8SI_V8SI_USI),
+  LASX_BUILTIN (xvssrani_du_q, LARCH_UV4DI_FTYPE_UV4DI_V4DI_USI),
+  LASX_BUILTIN (xvssrarni_b_h, LARCH_V32QI_FTYPE_V32QI_V32QI_USI),
+  LASX_BUILTIN (xvssrarni_h_w, LARCH_V16HI_FTYPE_V16HI_V16HI_USI),
+  LASX_BUILTIN (xvssrarni_w_d, LARCH_V8SI_FTYPE_V8SI_V8SI_USI),
+  LASX_BUILTIN (xvssrarni_d_q, LARCH_V4DI_FTYPE_V4DI_V4DI_USI),
+  LASX_BUILTIN (xvssrarni_bu_h, LARCH_UV32QI_FTYPE_UV32QI_V32QI_USI),
+  LASX_BUILTIN (xvssrarni_hu_w, LARCH_UV16HI_FTYPE_UV16HI_V16HI_USI),
+  LASX_BUILTIN (xvssrarni_wu_d, LARCH_UV8SI_FTYPE_UV8SI_V8SI_USI),
+  LASX_BUILTIN (xvssrarni_du_q, LARCH_UV4DI_FTYPE_UV4DI_V4DI_USI)
 };
 
 /* Index I is the function declaration for loongarch_builtins[I], or null if
@@ -1441,11 +2497,15 @@ loongarch_builtin_vectorized_function (unsigned int fn, tree type_out,
     {
       if (out_n == 2 && in_n == 2)
 	return LARCH_GET_BUILTIN (lsx_vfrintrp_d);
+      if (out_n == 4 && in_n == 4)
+	return LARCH_GET_BUILTIN (lasx_xvfrintrp_d);
     }
       if (out_mode == SFmode && in_mode == SFmode)
     {
       if (out_n == 4 && in_n == 4)
 	return LARCH_GET_BUILTIN (lsx_vfrintrp_s);
+      if (out_n == 8 && in_n == 8)
+	return LARCH_GET_BUILTIN (lasx_xvfrintrp_s);
     }
       break;
 
@@ -1454,11 +2514,15 @@ loongarch_builtin_vectorized_function (unsigned int fn, tree type_out,
     {
       if (out_n == 2 && in_n == 2)
 	return LARCH_GET_BUILTIN (lsx_vfrintrz_d);
+      if (out_n == 4 && in_n == 4)
+	return LARCH_GET_BUILTIN (lasx_xvfrintrz_d);
     }
       if (out_mode == SFmode && in_mode == SFmode)
     {
       if (out_n == 4 && in_n == 4)
 	return LARCH_GET_BUILTIN (lsx_vfrintrz_s);
+      if (out_n == 8 && in_n == 8)
+	return LARCH_GET_BUILTIN (lasx_xvfrintrz_s);
     }
       break;
 
@@ -1468,11 +2532,15 @@ loongarch_builtin_vectorized_function (unsigned int fn, tree type_out,
     {
       if (out_n == 2 && in_n == 2)
 	return LARCH_GET_BUILTIN (lsx_vfrint_d);
+      if (out_n == 4 && in_n == 4)
+	return LARCH_GET_BUILTIN (lasx_xvfrint_d);
     }
       if (out_mode == SFmode && in_mode == SFmode)
     {
       if (out_n == 4 && in_n == 4)
 	return LARCH_GET_BUILTIN (lsx_vfrint_s);
+      if (out_n == 8 && in_n == 8)
+	return LARCH_GET_BUILTIN (lasx_xvfrint_s);
     }
       break;
 
@@ -1481,11 +2549,15 @@ loongarch_builtin_vectorized_function (unsigned int fn, tree type_out,
     {
       if (out_n == 2 && in_n == 2)
 	return LARCH_GET_BUILTIN (lsx_vfrintrm_d);
+      if (out_n == 4 && in_n == 4)
+	return LARCH_GET_BUILTIN (lasx_xvfrintrm_d);
     }
       if (out_mode == SFmode && in_mode == SFmode)
     {
       if (out_n == 4 && in_n == 4)
 	return LARCH_GET_BUILTIN (lsx_vfrintrm_s);
+      if (out_n == 8 && in_n == 8)
+	return LARCH_GET_BUILTIN (lasx_xvfrintrm_s);
     }
       break;
 
@@ -1560,6 +2632,30 @@ loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
     case CODE_FOR_lsx_vsubi_hu:
     case CODE_FOR_lsx_vsubi_wu:
     case CODE_FOR_lsx_vsubi_du:
+    case CODE_FOR_lasx_xvaddi_bu:
+    case CODE_FOR_lasx_xvaddi_hu:
+    case CODE_FOR_lasx_xvaddi_wu:
+    case CODE_FOR_lasx_xvaddi_du:
+    case CODE_FOR_lasx_xvslti_bu:
+    case CODE_FOR_lasx_xvslti_hu:
+    case CODE_FOR_lasx_xvslti_wu:
+    case CODE_FOR_lasx_xvslti_du:
+    case CODE_FOR_lasx_xvslei_bu:
+    case CODE_FOR_lasx_xvslei_hu:
+    case CODE_FOR_lasx_xvslei_wu:
+    case CODE_FOR_lasx_xvslei_du:
+    case CODE_FOR_lasx_xvmaxi_bu:
+    case CODE_FOR_lasx_xvmaxi_hu:
+    case CODE_FOR_lasx_xvmaxi_wu:
+    case CODE_FOR_lasx_xvmaxi_du:
+    case CODE_FOR_lasx_xvmini_bu:
+    case CODE_FOR_lasx_xvmini_hu:
+    case CODE_FOR_lasx_xvmini_wu:
+    case CODE_FOR_lasx_xvmini_du:
+    case CODE_FOR_lasx_xvsubi_bu:
+    case CODE_FOR_lasx_xvsubi_hu:
+    case CODE_FOR_lasx_xvsubi_wu:
+    case CODE_FOR_lasx_xvsubi_du:
       gcc_assert (has_target_p && nops == 3);
       /* We only generate a vector of constants iff the second argument
 	 is an immediate.  We also validate the range of the immediate.  */
@@ -1598,6 +2694,26 @@ loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
     case CODE_FOR_lsx_vmini_h:
     case CODE_FOR_lsx_vmini_w:
     case CODE_FOR_lsx_vmini_d:
+    case CODE_FOR_lasx_xvseqi_b:
+    case CODE_FOR_lasx_xvseqi_h:
+    case CODE_FOR_lasx_xvseqi_w:
+    case CODE_FOR_lasx_xvseqi_d:
+    case CODE_FOR_lasx_xvslti_b:
+    case CODE_FOR_lasx_xvslti_h:
+    case CODE_FOR_lasx_xvslti_w:
+    case CODE_FOR_lasx_xvslti_d:
+    case CODE_FOR_lasx_xvslei_b:
+    case CODE_FOR_lasx_xvslei_h:
+    case CODE_FOR_lasx_xvslei_w:
+    case CODE_FOR_lasx_xvslei_d:
+    case CODE_FOR_lasx_xvmaxi_b:
+    case CODE_FOR_lasx_xvmaxi_h:
+    case CODE_FOR_lasx_xvmaxi_w:
+    case CODE_FOR_lasx_xvmaxi_d:
+    case CODE_FOR_lasx_xvmini_b:
+    case CODE_FOR_lasx_xvmini_h:
+    case CODE_FOR_lasx_xvmini_w:
+    case CODE_FOR_lasx_xvmini_d:
       gcc_assert (has_target_p && nops == 3);
       /* We only generate a vector of constants iff the second argument
 	 is an immediate.  We also validate the range of the immediate.  */
@@ -1620,6 +2736,10 @@ loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
     case CODE_FOR_lsx_vori_b:
     case CODE_FOR_lsx_vnori_b:
     case CODE_FOR_lsx_vxori_b:
+    case CODE_FOR_lasx_xvandi_b:
+    case CODE_FOR_lasx_xvori_b:
+    case CODE_FOR_lasx_xvnori_b:
+    case CODE_FOR_lasx_xvxori_b:
       gcc_assert (has_target_p && nops == 3);
       if (!CONST_INT_P (ops[2].value))
 	break;
@@ -1629,6 +2749,7 @@ loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
       break;
 
     case CODE_FOR_lsx_vbitseli_b:
+    case CODE_FOR_lasx_xvbitseli_b:
       gcc_assert (has_target_p && nops == 4);
       if (!CONST_INT_P (ops[3].value))
 	break;
@@ -1641,6 +2762,10 @@ loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
     case CODE_FOR_lsx_vreplgr2vr_h:
     case CODE_FOR_lsx_vreplgr2vr_w:
     case CODE_FOR_lsx_vreplgr2vr_d:
+    case CODE_FOR_lasx_xvreplgr2vr_b:
+    case CODE_FOR_lasx_xvreplgr2vr_h:
+    case CODE_FOR_lasx_xvreplgr2vr_w:
+    case CODE_FOR_lasx_xvreplgr2vr_d:
       /* Map the built-ins to vector fill operations.  We need fix up the mode
 	 for the element being inserted.  */
       gcc_assert (has_target_p && nops == 2);
@@ -1669,6 +2794,26 @@ loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
     case CODE_FOR_lsx_vpickod_b:
     case CODE_FOR_lsx_vpickod_h:
     case CODE_FOR_lsx_vpickod_w:
+    case CODE_FOR_lasx_xvilvh_b:
+    case CODE_FOR_lasx_xvilvh_h:
+    case CODE_FOR_lasx_xvilvh_w:
+    case CODE_FOR_lasx_xvilvh_d:
+    case CODE_FOR_lasx_xvilvl_b:
+    case CODE_FOR_lasx_xvilvl_h:
+    case CODE_FOR_lasx_xvilvl_w:
+    case CODE_FOR_lasx_xvilvl_d:
+    case CODE_FOR_lasx_xvpackev_b:
+    case CODE_FOR_lasx_xvpackev_h:
+    case CODE_FOR_lasx_xvpackev_w:
+    case CODE_FOR_lasx_xvpackod_b:
+    case CODE_FOR_lasx_xvpackod_h:
+    case CODE_FOR_lasx_xvpackod_w:
+    case CODE_FOR_lasx_xvpickev_b:
+    case CODE_FOR_lasx_xvpickev_h:
+    case CODE_FOR_lasx_xvpickev_w:
+    case CODE_FOR_lasx_xvpickod_b:
+    case CODE_FOR_lasx_xvpickod_h:
+    case CODE_FOR_lasx_xvpickod_w:
       /* Swap the operands 1 and 2 for interleave operations.  Built-ins follow
 	 convention of ISA, which have op1 as higher component and op2 as lower
 	 component.  However, the VEC_PERM op in tree and vec_concat in RTL
@@ -1690,6 +2835,18 @@ loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
     case CODE_FOR_lsx_vsrli_h:
     case CODE_FOR_lsx_vsrli_w:
     case CODE_FOR_lsx_vsrli_d:
+    case CODE_FOR_lasx_xvslli_b:
+    case CODE_FOR_lasx_xvslli_h:
+    case CODE_FOR_lasx_xvslli_w:
+    case CODE_FOR_lasx_xvslli_d:
+    case CODE_FOR_lasx_xvsrai_b:
+    case CODE_FOR_lasx_xvsrai_h:
+    case CODE_FOR_lasx_xvsrai_w:
+    case CODE_FOR_lasx_xvsrai_d:
+    case CODE_FOR_lasx_xvsrli_b:
+    case CODE_FOR_lasx_xvsrli_h:
+    case CODE_FOR_lasx_xvsrli_w:
+    case CODE_FOR_lasx_xvsrli_d:
       gcc_assert (has_target_p && nops == 3);
       if (CONST_INT_P (ops[2].value))
 	{
@@ -1750,6 +2907,25 @@ loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
 							     INTVAL (ops[2].value));
       break;
 
+    case CODE_FOR_lasx_xvinsgr2vr_w:
+    case CODE_FOR_lasx_xvinsgr2vr_d:
+      /* Map the built-ins to insert operations.  We need to swap operands,
+	 fix up the mode for the element being inserted, and generate
+	 a bit mask for vec_merge.  */
+      gcc_assert (has_target_p && nops == 4);
+      std::swap (ops[1], ops[2]);
+      imode = GET_MODE_INNER (ops[0].mode);
+      ops[1].value = lowpart_subreg (imode, ops[1].value, ops[1].mode);
+      ops[1].mode = imode;
+      rangelo = 0;
+      rangehi = GET_MODE_NUNITS (ops[0].mode) - 1;
+      if (CONST_INT_P (ops[3].value)
+	  && IN_RANGE (INTVAL (ops[3].value), rangelo, rangehi))
+	ops[3].value = GEN_INT (1 << INTVAL (ops[3].value));
+      else
+	error_opno = 2;
+      break;
+
     default:
       break;
   }
@@ -1859,12 +3035,14 @@ loongarch_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
     {
     case LARCH_BUILTIN_DIRECT:
     case LARCH_BUILTIN_LSX:
+    case LARCH_BUILTIN_LASX:
       return loongarch_expand_builtin_direct (d->icode, target, exp, true);
 
     case LARCH_BUILTIN_DIRECT_NO_TARGET:
       return loongarch_expand_builtin_direct (d->icode, target, exp, false);
 
     case LARCH_BUILTIN_LSX_TEST_BRANCH:
+    case LARCH_BUILTIN_LASX_TEST_BRANCH:
       return loongarch_expand_builtin_lsx_test_branch (d->icode, exp);
     }
   gcc_unreachable ();
diff --git a/gcc/config/loongarch/loongarch-ftypes.def b/gcc/config/loongarch/loongarch-ftypes.def
index 1ce9d83ccab..72d96878038 100644
--- a/gcc/config/loongarch/loongarch-ftypes.def
+++ b/gcc/config/loongarch/loongarch-ftypes.def
@@ -67,6 +67,7 @@ DEF_LARCH_FTYPE (3, (UDI, UDI, UDI, USI))
 DEF_LARCH_FTYPE (1, (DF, DF))
 DEF_LARCH_FTYPE (2, (DF, DF, DF))
 DEF_LARCH_FTYPE (1, (DF, V2DF))
+DEF_LARCH_FTYPE (1, (DF, V4DF))
 
 DEF_LARCH_FTYPE (1, (DI, DI))
 DEF_LARCH_FTYPE (1, (DI, SI))
@@ -83,6 +84,7 @@ DEF_LARCH_FTYPE (2, (DI, SI, SI))
 DEF_LARCH_FTYPE (2, (DI, USI, USI))
 
 DEF_LARCH_FTYPE (2, (DI, V2DI, UQI))
+DEF_LARCH_FTYPE (2, (DI, V4DI, UQI))
 
 DEF_LARCH_FTYPE (2, (INT, DF, DF))
 DEF_LARCH_FTYPE (2, (INT, SF, SF))
@@ -104,21 +106,31 @@ DEF_LARCH_FTYPE (3, (SI, SI, SI, SI))
 DEF_LARCH_FTYPE (3, (SI, SI, SI, QI))
 DEF_LARCH_FTYPE (1, (SI, UQI))
 DEF_LARCH_FTYPE (1, (SI, UV16QI))
+DEF_LARCH_FTYPE (1, (SI, UV32QI))
 DEF_LARCH_FTYPE (1, (SI, UV2DI))
+DEF_LARCH_FTYPE (1, (SI, UV4DI))
 DEF_LARCH_FTYPE (1, (SI, UV4SI))
+DEF_LARCH_FTYPE (1, (SI, UV8SI))
 DEF_LARCH_FTYPE (1, (SI, UV8HI))
+DEF_LARCH_FTYPE (1, (SI, UV16HI))
 DEF_LARCH_FTYPE (2, (SI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (SI, V32QI, UQI))
 DEF_LARCH_FTYPE (1, (SI, V2HI))
 DEF_LARCH_FTYPE (2, (SI, V2HI, V2HI))
 DEF_LARCH_FTYPE (1, (SI, V4QI))
 DEF_LARCH_FTYPE (2, (SI, V4QI, V4QI))
 DEF_LARCH_FTYPE (2, (SI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (SI, V8SI, UQI))
 DEF_LARCH_FTYPE (2, (SI, V8HI, UQI))
 DEF_LARCH_FTYPE (1, (SI, VOID))
 
 DEF_LARCH_FTYPE (2, (UDI, UDI, UDI))
+DEF_LARCH_FTYPE (2, (USI, V32QI, UQI))
 DEF_LARCH_FTYPE (2, (UDI, UV2SI, UV2SI))
+DEF_LARCH_FTYPE (2, (USI, V8SI, UQI))
 DEF_LARCH_FTYPE (2, (UDI, V2DI, UQI))
+DEF_LARCH_FTYPE (2, (USI, V16HI, UQI))
+DEF_LARCH_FTYPE (2, (UDI, V4DI, UQI))
 
 DEF_LARCH_FTYPE (2, (USI, V16QI, UQI))
 DEF_LARCH_FTYPE (2, (USI, V4SI, UQI))
@@ -142,6 +154,23 @@ DEF_LARCH_FTYPE (2, (UV2DI, UV2DI, V2DI))
 DEF_LARCH_FTYPE (2, (UV2DI, UV4SI, UV4SI))
 DEF_LARCH_FTYPE (1, (UV2DI, V2DF))
 
+DEF_LARCH_FTYPE (2, (UV32QI, UV32QI, UQI))
+DEF_LARCH_FTYPE (2, (UV32QI, UV32QI, USI))
+DEF_LARCH_FTYPE (2, (UV32QI, UV32QI, UV32QI))
+DEF_LARCH_FTYPE (3, (UV32QI, UV32QI, UV32QI, UQI))
+DEF_LARCH_FTYPE (3, (UV32QI, UV32QI, UV32QI, USI))
+DEF_LARCH_FTYPE (3, (UV32QI, UV32QI, UV32QI, UV32QI))
+DEF_LARCH_FTYPE (2, (UV32QI, UV32QI, V32QI))
+
+DEF_LARCH_FTYPE (2, (UV4DI, UV4DI, UQI))
+DEF_LARCH_FTYPE (2, (UV4DI, UV4DI, UV4DI))
+DEF_LARCH_FTYPE (3, (UV4DI, UV4DI, UV4DI, UQI))
+DEF_LARCH_FTYPE (3, (UV4DI, UV4DI, UV4DI, UV4DI))
+DEF_LARCH_FTYPE (3, (UV4DI, UV4DI, UV8SI, UV8SI))
+DEF_LARCH_FTYPE (2, (UV4DI, UV4DI, V4DI))
+DEF_LARCH_FTYPE (2, (UV4DI, UV8SI, UV8SI))
+DEF_LARCH_FTYPE (1, (UV4DI, V4DF))
+
 DEF_LARCH_FTYPE (2, (UV2SI, UV2SI, UQI))
 DEF_LARCH_FTYPE (2, (UV2SI, UV2SI, UV2SI))
 
@@ -170,7 +199,22 @@ DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, UV8HI, UQI))
 DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, UV8HI, UV8HI))
 DEF_LARCH_FTYPE (2, (UV8HI, UV8HI, V8HI))
 
-
+DEF_LARCH_FTYPE (2, (UV8SI, UV8SI, UQI))
+DEF_LARCH_FTYPE (2, (UV8SI, UV8SI, UV8SI))
+DEF_LARCH_FTYPE (3, (UV8SI, UV8SI, UV8SI, UQI))
+DEF_LARCH_FTYPE (3, (UV8SI, UV8SI, UV8SI, UV8SI))
+DEF_LARCH_FTYPE (3, (UV8SI, UV8SI, UV16HI, UV16HI))
+DEF_LARCH_FTYPE (2, (UV8SI, UV8SI, V8SI))
+DEF_LARCH_FTYPE (2, (UV8SI, UV16HI, UV16HI))
+DEF_LARCH_FTYPE (1, (UV8SI, V8SF))
+
+DEF_LARCH_FTYPE (2, (UV16HI, UV32QI, UV32QI))
+DEF_LARCH_FTYPE (2, (UV16HI, UV16HI, UQI))
+DEF_LARCH_FTYPE (3, (UV16HI, UV16HI, UV32QI, UV32QI))
+DEF_LARCH_FTYPE (2, (UV16HI, UV16HI, UV16HI))
+DEF_LARCH_FTYPE (3, (UV16HI, UV16HI, UV16HI, UQI))
+DEF_LARCH_FTYPE (3, (UV16HI, UV16HI, UV16HI, UV16HI))
+DEF_LARCH_FTYPE (2, (UV16HI, UV16HI, V16HI))
 
 DEF_LARCH_FTYPE (2, (UV8QI, UV4HI, UV4HI))
 DEF_LARCH_FTYPE (1, (UV8QI, UV8QI))
@@ -196,6 +240,25 @@ DEF_LARCH_FTYPE (4, (V16QI, V16QI, V16QI, UQI, UQI))
 DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, USI))
 DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, V16QI))
 
+DEF_LARCH_FTYPE (2, (V32QI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (2, (V32QI, CVPOINTER, DI))
+DEF_LARCH_FTYPE (1, (V32QI, HI))
+DEF_LARCH_FTYPE (1, (V32QI, SI))
+DEF_LARCH_FTYPE (2, (V32QI, UV32QI, UQI))
+DEF_LARCH_FTYPE (2, (V32QI, UV32QI, UV32QI))
+DEF_LARCH_FTYPE (1, (V32QI, V32QI))
+DEF_LARCH_FTYPE (2, (V32QI, V32QI, QI))
+DEF_LARCH_FTYPE (2, (V32QI, V32QI, SI))
+DEF_LARCH_FTYPE (2, (V32QI, V32QI, UQI))
+DEF_LARCH_FTYPE (2, (V32QI, V32QI, USI))
+DEF_LARCH_FTYPE (3, (V32QI, V32QI, SI, UQI))
+DEF_LARCH_FTYPE (3, (V32QI, V32QI, UQI, V32QI))
+DEF_LARCH_FTYPE (2, (V32QI, V32QI, V32QI))
+DEF_LARCH_FTYPE (3, (V32QI, V32QI, V32QI, SI))
+DEF_LARCH_FTYPE (3, (V32QI, V32QI, V32QI, UQI))
+DEF_LARCH_FTYPE (4, (V32QI, V32QI, V32QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V32QI, V32QI, V32QI, USI))
+DEF_LARCH_FTYPE (3, (V32QI, V32QI, V32QI, V32QI))
 
 DEF_LARCH_FTYPE (1, (V2DF, DF))
 DEF_LARCH_FTYPE (1, (V2DF, UV2DI))
@@ -207,6 +270,16 @@ DEF_LARCH_FTYPE (1, (V2DF, V2DI))
 DEF_LARCH_FTYPE (1, (V2DF, V4SF))
 DEF_LARCH_FTYPE (1, (V2DF, V4SI))
 
+DEF_LARCH_FTYPE (1, (V4DF, DF))
+DEF_LARCH_FTYPE (1, (V4DF, UV4DI))
+DEF_LARCH_FTYPE (1, (V4DF, V4DF))
+DEF_LARCH_FTYPE (2, (V4DF, V4DF, V4DF))
+DEF_LARCH_FTYPE (3, (V4DF, V4DF, V4DF, V4DF))
+DEF_LARCH_FTYPE (2, (V4DF, V4DF, V4DI))
+DEF_LARCH_FTYPE (1, (V4DF, V4DI))
+DEF_LARCH_FTYPE (1, (V4DF, V8SF))
+DEF_LARCH_FTYPE (1, (V4DF, V8SI))
+
 DEF_LARCH_FTYPE (2, (V2DI, CVPOINTER, SI))
 DEF_LARCH_FTYPE (1, (V2DI, DI))
 DEF_LARCH_FTYPE (1, (V2DI, HI))
@@ -233,6 +306,32 @@ DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, V2DI))
 DEF_LARCH_FTYPE (3, (V2DI, V2DI, V4SI, V4SI))
 DEF_LARCH_FTYPE (2, (V2DI, V4SI, V4SI))
 
+DEF_LARCH_FTYPE (2, (V4DI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V4DI, DI))
+DEF_LARCH_FTYPE (1, (V4DI, HI))
+DEF_LARCH_FTYPE (2, (V4DI, UV4DI, UQI))
+DEF_LARCH_FTYPE (2, (V4DI, UV4DI, UV4DI))
+DEF_LARCH_FTYPE (2, (V4DI, UV8SI, UV8SI))
+DEF_LARCH_FTYPE (1, (V4DI, V4DF))
+DEF_LARCH_FTYPE (2, (V4DI, V4DF, V4DF))
+DEF_LARCH_FTYPE (1, (V4DI, V4DI))
+DEF_LARCH_FTYPE (1, (UV4DI, UV4DI))
+DEF_LARCH_FTYPE (2, (V4DI, V4DI, QI))
+DEF_LARCH_FTYPE (2, (V4DI, V4DI, SI))
+DEF_LARCH_FTYPE (2, (V4DI, V4DI, UQI))
+DEF_LARCH_FTYPE (2, (V4DI, V4DI, USI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, DI, UQI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, UQI, V4DI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, UV8SI, UV8SI))
+DEF_LARCH_FTYPE (2, (V4DI, V4DI, V4DI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, V4DI, SI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, V4DI, USI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, V4DI, UQI))
+DEF_LARCH_FTYPE (4, (V4DI, V4DI, V4DI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, V4DI, V4DI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, V8SI, V8SI))
+DEF_LARCH_FTYPE (2, (V4DI, V8SI, V8SI))
+
 DEF_LARCH_FTYPE (1, (V2HI, SI))
 DEF_LARCH_FTYPE (2, (V2HI, SI, SI))
 DEF_LARCH_FTYPE (3, (V2HI, SI, SI, SI))
@@ -274,6 +373,17 @@ DEF_LARCH_FTYPE (3, (V4SF, V4SF, V4SF, V4SF))
 DEF_LARCH_FTYPE (2, (V4SF, V4SF, V4SI))
 DEF_LARCH_FTYPE (1, (V4SF, V4SI))
 DEF_LARCH_FTYPE (1, (V4SF, V8HI))
+DEF_LARCH_FTYPE (1, (V8SF, V16HI))
+
+DEF_LARCH_FTYPE (1, (V8SF, SF))
+DEF_LARCH_FTYPE (1, (V8SF, UV8SI))
+DEF_LARCH_FTYPE (2, (V8SF, V4DF, V4DF))
+DEF_LARCH_FTYPE (1, (V8SF, V8SF))
+DEF_LARCH_FTYPE (2, (V8SF, V8SF, V8SF))
+DEF_LARCH_FTYPE (3, (V8SF, V8SF, V8SF, V8SF))
+DEF_LARCH_FTYPE (2, (V8SF, V8SF, V8SI))
+DEF_LARCH_FTYPE (1, (V8SF, V8SI))
+DEF_LARCH_FTYPE (1, (V8SF, V8HI))
 
 DEF_LARCH_FTYPE (2, (V4SI, CVPOINTER, SI))
 DEF_LARCH_FTYPE (1, (V4SI, HI))
@@ -282,6 +392,7 @@ DEF_LARCH_FTYPE (2, (V4SI, UV4SI, UQI))
 DEF_LARCH_FTYPE (2, (V4SI, UV4SI, UV4SI))
 DEF_LARCH_FTYPE (2, (V4SI, UV8HI, UV8HI))
 DEF_LARCH_FTYPE (2, (V4SI, V2DF, V2DF))
+DEF_LARCH_FTYPE (2, (V8SI, V4DF, V4DF))
 DEF_LARCH_FTYPE (1, (V4SI, V4SF))
 DEF_LARCH_FTYPE (2, (V4SI, V4SF, V4SF))
 DEF_LARCH_FTYPE (1, (V4SI, V4SI))
@@ -301,6 +412,32 @@ DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, V4SI))
 DEF_LARCH_FTYPE (3, (V4SI, V4SI, V8HI, V8HI))
 DEF_LARCH_FTYPE (2, (V4SI, V8HI, V8HI))
 
+DEF_LARCH_FTYPE (2, (V8SI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V8SI, HI))
+DEF_LARCH_FTYPE (1, (V8SI, SI))
+DEF_LARCH_FTYPE (2, (V8SI, UV8SI, UQI))
+DEF_LARCH_FTYPE (2, (V8SI, UV8SI, UV8SI))
+DEF_LARCH_FTYPE (2, (V8SI, UV16HI, UV16HI))
+DEF_LARCH_FTYPE (2, (V8SI, V2DF, V2DF))
+DEF_LARCH_FTYPE (1, (V8SI, V8SF))
+DEF_LARCH_FTYPE (2, (V8SI, V8SF, V8SF))
+DEF_LARCH_FTYPE (1, (V8SI, V8SI))
+DEF_LARCH_FTYPE (2, (V8SI, V8SI, QI))
+DEF_LARCH_FTYPE (2, (V8SI, V8SI, SI))
+DEF_LARCH_FTYPE (2, (V8SI, V8SI, UQI))
+DEF_LARCH_FTYPE (2, (V8SI, V8SI, USI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, SI, UQI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, UQI, V8SI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, UV16HI, UV16HI))
+DEF_LARCH_FTYPE (2, (V8SI, V8SI, V8SI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, V8SI, SI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, V8SI, UQI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, V8SI, USI))
+DEF_LARCH_FTYPE (4, (V8SI, V8SI, V8SI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, V8SI, V8SI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, V16HI, V16HI))
+DEF_LARCH_FTYPE (2, (V8SI, V16HI, V16HI))
+
 DEF_LARCH_FTYPE (2, (V8HI, CVPOINTER, SI))
 DEF_LARCH_FTYPE (1, (V8HI, HI))
 DEF_LARCH_FTYPE (1, (V8HI, SI))
@@ -326,6 +463,31 @@ DEF_LARCH_FTYPE (4, (V8HI, V8HI, V8HI, UQI, UQI))
 DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, USI))
 DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, V8HI))
 
+DEF_LARCH_FTYPE (2, (V16HI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V16HI, HI))
+DEF_LARCH_FTYPE (1, (V16HI, SI))
+DEF_LARCH_FTYPE (2, (V16HI, UV32QI, UV32QI))
+DEF_LARCH_FTYPE (2, (V16HI, UV16HI, UQI))
+DEF_LARCH_FTYPE (2, (V16HI, UV16HI, UV16HI))
+DEF_LARCH_FTYPE (2, (V16HI, V32QI, V32QI))
+DEF_LARCH_FTYPE (2, (V16HI, V8SF, V8SF))
+DEF_LARCH_FTYPE (1, (V16HI, V16HI))
+DEF_LARCH_FTYPE (2, (V16HI, V16HI, QI))
+DEF_LARCH_FTYPE (2, (V16HI, V16HI, SI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, SI, UQI))
+DEF_LARCH_FTYPE (2, (V16HI, V16HI, UQI))
+DEF_LARCH_FTYPE (2, (V16HI, V16HI, USI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, UQI, SI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, UQI, V16HI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, UV32QI, UV32QI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, V32QI, V32QI))
+DEF_LARCH_FTYPE (2, (V16HI, V16HI, V16HI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, V16HI, SI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, V16HI, UQI))
+DEF_LARCH_FTYPE (4, (V16HI, V16HI, V16HI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, V16HI, USI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, V16HI, V16HI))
+
 DEF_LARCH_FTYPE (2, (V8QI, V4HI, V4HI))
 DEF_LARCH_FTYPE (1, (V8QI, V8QI))
 DEF_LARCH_FTYPE (2, (V8QI, V8QI, V8QI))
@@ -337,62 +499,113 @@ DEF_LARCH_FTYPE (2, (VOID, USI, UQI))
 DEF_LARCH_FTYPE (1, (VOID, UHI))
 DEF_LARCH_FTYPE (3, (VOID, V16QI, CVPOINTER, SI))
 DEF_LARCH_FTYPE (3, (VOID, V16QI, CVPOINTER, DI))
+DEF_LARCH_FTYPE (3, (VOID, V32QI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V32QI, CVPOINTER, DI))
+DEF_LARCH_FTYPE (3, (VOID, V4DF, POINTER, SI))
 DEF_LARCH_FTYPE (3, (VOID, V2DF, POINTER, SI))
 DEF_LARCH_FTYPE (3, (VOID, V2DI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V4DI, CVPOINTER, SI))
 DEF_LARCH_FTYPE (2, (VOID, V2HI, V2HI))
 DEF_LARCH_FTYPE (2, (VOID, V4QI, V4QI))
 DEF_LARCH_FTYPE (3, (VOID, V4SF, POINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V8SF, POINTER, SI))
 DEF_LARCH_FTYPE (3, (VOID, V4SI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V8SI, CVPOINTER, SI))
 DEF_LARCH_FTYPE (3, (VOID, V8HI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V16HI, CVPOINTER, SI))
 
+DEF_LARCH_FTYPE (1, (V16HI, V32QI))
+DEF_LARCH_FTYPE (1, (UV16HI, UV32QI))
+DEF_LARCH_FTYPE (1, (V8SI, V32QI))
+DEF_LARCH_FTYPE (1, (V4DI, V32QI))
 DEF_LARCH_FTYPE (1, (V8HI, V16QI))
 DEF_LARCH_FTYPE (1, (V4SI, V16QI))
 DEF_LARCH_FTYPE (1, (V2DI, V16QI))
+DEF_LARCH_FTYPE (1, (UV8SI, UV16HI))
+DEF_LARCH_FTYPE (1, (V8SI, V16HI))
+DEF_LARCH_FTYPE (1, (V4DI, V16HI))
 DEF_LARCH_FTYPE (1, (V4SI, V8HI))
 DEF_LARCH_FTYPE (1, (V2DI, V8HI))
 DEF_LARCH_FTYPE (1, (V2DI, V4SI))
+DEF_LARCH_FTYPE (1, (V4DI, V8SI))
+DEF_LARCH_FTYPE (1, (UV4DI, UV8SI))
+DEF_LARCH_FTYPE (1, (UV16HI, V32QI))
+DEF_LARCH_FTYPE (1, (UV8SI, V32QI))
+DEF_LARCH_FTYPE (1, (UV4DI, V32QI))
 DEF_LARCH_FTYPE (1, (UV8HI, V16QI))
 DEF_LARCH_FTYPE (1, (UV4SI, V16QI))
 DEF_LARCH_FTYPE (1, (UV2DI, V16QI))
+DEF_LARCH_FTYPE (1, (UV8SI, V16HI))
+DEF_LARCH_FTYPE (1, (UV4DI, V16HI))
 DEF_LARCH_FTYPE (1, (UV4SI, V8HI))
 DEF_LARCH_FTYPE (1, (UV2DI, V8HI))
 DEF_LARCH_FTYPE (1, (UV2DI, V4SI))
+DEF_LARCH_FTYPE (1, (UV4DI, V8SI))
 DEF_LARCH_FTYPE (1, (UV8HI, UV16QI))
 DEF_LARCH_FTYPE (1, (UV4SI, UV16QI))
 DEF_LARCH_FTYPE (1, (UV2DI, UV16QI))
+DEF_LARCH_FTYPE (1, (UV4DI, UV32QI))
 DEF_LARCH_FTYPE (1, (UV4SI, UV8HI))
 DEF_LARCH_FTYPE (1, (UV2DI, UV8HI))
 DEF_LARCH_FTYPE (1, (UV2DI, UV4SI))
 DEF_LARCH_FTYPE (2, (UV8HI, V16QI, V16QI))
 DEF_LARCH_FTYPE (2, (UV4SI, V8HI, V8HI))
 DEF_LARCH_FTYPE (2, (UV2DI, V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V16HI, V32QI, UQI))
+DEF_LARCH_FTYPE (2, (V8SI, V16HI, UQI))
+DEF_LARCH_FTYPE (2, (V4DI, V8SI, UQI))
 DEF_LARCH_FTYPE (2, (V8HI, V16QI, UQI))
 DEF_LARCH_FTYPE (2, (V4SI, V8HI, UQI))
 DEF_LARCH_FTYPE (2, (V2DI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (UV16HI, UV32QI, UQI))
+DEF_LARCH_FTYPE (2, (UV8SI, UV16HI, UQI))
+DEF_LARCH_FTYPE (2, (UV4DI, UV8SI, UQI))
 DEF_LARCH_FTYPE (2, (UV8HI, UV16QI, UQI))
 DEF_LARCH_FTYPE (2, (UV4SI, UV8HI, UQI))
 DEF_LARCH_FTYPE (2, (UV2DI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (V32QI, V16HI, V16HI))
+DEF_LARCH_FTYPE (2, (V16HI, V8SI, V8SI))
+DEF_LARCH_FTYPE (2, (V8SI, V4DI, V4DI))
 DEF_LARCH_FTYPE (2, (V16QI, V8HI, V8HI))
 DEF_LARCH_FTYPE (2, (V8HI, V4SI, V4SI))
 DEF_LARCH_FTYPE (2, (V4SI, V2DI, V2DI))
+DEF_LARCH_FTYPE (2, (UV32QI, UV16HI, UV16HI))
+DEF_LARCH_FTYPE (2, (UV16HI, UV8SI, UV8SI))
+DEF_LARCH_FTYPE (2, (UV8SI, UV4DI, UV4DI))
 DEF_LARCH_FTYPE (2, (UV16QI, UV8HI, UV8HI))
 DEF_LARCH_FTYPE (2, (UV8HI, UV4SI, UV4SI))
 DEF_LARCH_FTYPE (2, (UV4SI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (V32QI, V16HI, UQI))
+DEF_LARCH_FTYPE (2, (V16HI, V8SI, UQI))
+DEF_LARCH_FTYPE (2, (V8SI, V4DI, UQI))
 DEF_LARCH_FTYPE (2, (V16QI, V8HI, UQI))
 DEF_LARCH_FTYPE (2, (V8HI, V4SI, UQI))
 DEF_LARCH_FTYPE (2, (V4SI, V2DI, UQI))
+DEF_LARCH_FTYPE (2, (UV32QI, UV16HI, UQI))
+DEF_LARCH_FTYPE (2, (UV16HI, UV8SI, UQI))
+DEF_LARCH_FTYPE (2, (UV8SI, UV4DI, UQI))
 DEF_LARCH_FTYPE (2, (UV16QI, UV8HI, UQI))
 DEF_LARCH_FTYPE (2, (UV8HI, UV4SI, UQI))
 DEF_LARCH_FTYPE (2, (UV4SI, UV2DI, UQI))
+DEF_LARCH_FTYPE (2, (V32QI, V32QI, DI))
 DEF_LARCH_FTYPE (2, (V16QI, V16QI, DI))
+DEF_LARCH_FTYPE (2, (V32QI, UQI, UQI))
 DEF_LARCH_FTYPE (2, (V16QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V32QI, V32QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V16HI, V16HI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V8SI, V8SI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V4DI, V4DI, UQI, UQI))
 DEF_LARCH_FTYPE (3, (V16QI, V16QI, UQI, UQI))
 DEF_LARCH_FTYPE (3, (V8HI, V8HI, UQI, UQI))
 DEF_LARCH_FTYPE (3, (V4SI, V4SI, UQI, UQI))
 DEF_LARCH_FTYPE (3, (V2DI, V2DI, UQI, UQI))
+DEF_LARCH_FTYPE (2, (V8SF, V4DI, V4DI))
 DEF_LARCH_FTYPE (2, (V4SF, V2DI, V2DI))
+DEF_LARCH_FTYPE (1, (V4DI, V8SF))
 DEF_LARCH_FTYPE (1, (V2DI, V4SF))
+DEF_LARCH_FTYPE (2, (V4DI, UQI, USI))
 DEF_LARCH_FTYPE (2, (V2DI, UQI, USI))
+DEF_LARCH_FTYPE (2, (V4DI, UQI, UQI))
 DEF_LARCH_FTYPE (2, (V2DI, UQI, UQI))
 DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V16QI, CVPOINTER))
 DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V8HI, CVPOINTER))
@@ -402,6 +615,17 @@ DEF_LARCH_FTYPE (2, (V16QI, SI, CVPOINTER))
 DEF_LARCH_FTYPE (2, (V8HI, SI, CVPOINTER))
 DEF_LARCH_FTYPE (2, (V4SI, SI, CVPOINTER))
 DEF_LARCH_FTYPE (2, (V2DI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, V32QI, UQI, SI,  CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, V16HI, UQI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, V8SI, UQI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, V4DI, UQI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (3, (VOID, V32QI, SI,  CVPOINTER))
+DEF_LARCH_FTYPE (2, (V32QI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V16HI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V8SI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V4DI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (1, (V32QI, POINTER))
+DEF_LARCH_FTYPE (2, (VOID, V32QI, POINTER))
 DEF_LARCH_FTYPE (2, (V8HI, UV16QI, V16QI))
 DEF_LARCH_FTYPE (2, (V16QI, V16QI, UV16QI))
 DEF_LARCH_FTYPE (2, (UV16QI, V16QI, UV16QI))
@@ -431,6 +655,33 @@ DEF_LARCH_FTYPE (3, (V4SI, V4SI, V16QI, V16QI))
 DEF_LARCH_FTYPE (3, (V4SI, V4SI, UV16QI, V16QI))
 DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV16QI, UV16QI))
 
+
+DEF_LARCH_FTYPE(2,(V4DI,V16HI,V16HI))
+DEF_LARCH_FTYPE(2,(V4DI,UV4SI,V4SI))
+DEF_LARCH_FTYPE(2,(V8SI,UV16HI,V16HI))
+DEF_LARCH_FTYPE(2,(V16HI,UV32QI,V32QI))
+DEF_LARCH_FTYPE(2,(V4DI,UV8SI,V8SI))
+DEF_LARCH_FTYPE(3,(V4DI,V4DI,V16HI,V16HI))
+DEF_LARCH_FTYPE(2,(UV32QI,V32QI,UV32QI))
+DEF_LARCH_FTYPE(2,(UV16HI,V16HI,UV16HI))
+DEF_LARCH_FTYPE(2,(UV8SI,V8SI,UV8SI))
+DEF_LARCH_FTYPE(2,(UV4DI,V4DI,UV4DI))
+DEF_LARCH_FTYPE(3,(V4DI,V4DI,UV4DI,V4DI))
+DEF_LARCH_FTYPE(3,(V4DI,V4DI,UV8SI,V8SI))
+DEF_LARCH_FTYPE(3,(V8SI,V8SI,UV16HI,V16HI))
+DEF_LARCH_FTYPE(3,(V16HI,V16HI,UV32QI,V32QI))
+DEF_LARCH_FTYPE(2,(V4DI,UV4DI,V4DI))
+DEF_LARCH_FTYPE(2,(V8SI,V32QI,V32QI))
+DEF_LARCH_FTYPE(2,(UV4DI,UV16HI,UV16HI))
+DEF_LARCH_FTYPE(2,(V4DI,UV16HI,V16HI))
+DEF_LARCH_FTYPE(3,(V8SI,V8SI,V32QI,V32QI))
+DEF_LARCH_FTYPE(3,(UV8SI,UV8SI,UV32QI,UV32QI))
+DEF_LARCH_FTYPE(3,(UV4DI,UV4DI,UV16HI,UV16HI))
+DEF_LARCH_FTYPE(3,(V8SI,V8SI,UV32QI,V32QI))
+DEF_LARCH_FTYPE(3,(V4DI,V4DI,UV16HI,V16HI))
+DEF_LARCH_FTYPE(2,(UV8SI,UV32QI,UV32QI))
+DEF_LARCH_FTYPE(2,(V8SI,UV32QI,V32QI))
+
 DEF_LARCH_FTYPE(4,(VOID,V16QI,CVPOINTER,SI,UQI))
 DEF_LARCH_FTYPE(4,(VOID,V8HI,CVPOINTER,SI,UQI))
 DEF_LARCH_FTYPE(4,(VOID,V4SI,CVPOINTER,SI,UQI))
@@ -448,11 +699,29 @@ DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, V8HI, USI))
 DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, V4SI, USI))
 DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, V2DI, USI))
 
+DEF_LARCH_FTYPE (2, (DI, V8SI, UQI))
+DEF_LARCH_FTYPE (2, (UDI, V8SI, UQI))
+
+DEF_LARCH_FTYPE (3, (UV32QI, UV32QI, V32QI, USI))
+DEF_LARCH_FTYPE (3, (UV16HI, UV16HI, V16HI, USI))
+DEF_LARCH_FTYPE (3, (UV8SI, UV8SI, V8SI, USI))
+DEF_LARCH_FTYPE (3, (UV4DI, UV4DI, V4DI, USI))
+
+DEF_LARCH_FTYPE(4,(VOID,V32QI,CVPOINTER,SI,UQI))
+DEF_LARCH_FTYPE(4,(VOID,V16HI,CVPOINTER,SI,UQI))
+DEF_LARCH_FTYPE(4,(VOID,V8SI,CVPOINTER,SI,UQI))
+DEF_LARCH_FTYPE(4,(VOID,V4DI,CVPOINTER,SI,UQI))
+
 DEF_LARCH_FTYPE (1, (BOOLEAN,V16QI))
 DEF_LARCH_FTYPE(2,(V16QI,CVPOINTER,CVPOINTER))
 DEF_LARCH_FTYPE(3,(VOID,V16QI,CVPOINTER,CVPOINTER))
+DEF_LARCH_FTYPE(2,(V32QI,CVPOINTER,CVPOINTER))
+DEF_LARCH_FTYPE(3,(VOID,V32QI,CVPOINTER,CVPOINTER))
 
 DEF_LARCH_FTYPE (3, (V16QI, V16QI, SI, UQI))
 DEF_LARCH_FTYPE (3, (V2DI, V2DI, SI, UQI))
 DEF_LARCH_FTYPE (3, (V2DI, V2DI, DI, UQI))
 DEF_LARCH_FTYPE (3, (V4SI, V4SI, SI, UQI))
+
+DEF_LARCH_FTYPE (2, (V8SF, V8SF, UQI))
+DEF_LARCH_FTYPE (2, (V4DF, V4DF, UQI))
-- 
2.36.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target.
  2023-08-31  9:08 [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
                   ` (3 preceding siblings ...)
  2023-08-31  9:08 ` [PATCH v6 4/4] LoongArch: Add Loongson ASX directive builtin function support Chenghui Pan
@ 2023-08-31  9:41 ` Xi Ruoyao
  2023-08-31 10:11   ` Chenghui Pan
  2023-09-05 11:34 ` chenglulu
  5 siblings, 1 reply; 8+ messages in thread
From: Xi Ruoyao @ 2023-08-31  9:41 UTC (permalink / raw)
  To: Chenghui Pan, gcc-patches; +Cc: i, chenglulu, xuchenghua

On Thu, 2023-08-31 at 17:08 +0800, Chenghui Pan wrote:
> This is an update of:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628303.html
> 
> Changes since last version of patch set:
> - "dg-skip-if"-related Changes of the g++.dg/torture/vshuf* testcases are reverted.
>   (Replaced by __builtin_shuffle fix)
> - Add fix of __builtin_shuffle() for Loongson SX/ASX (Implemeted by adding
>   vand/xvand insn in front of shuffle operation). There's no significant performance
>   impact in current state.

I think it's the correct fix, thanks!
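
For anyone following along, here is a minimal sketch of the semantics
the expansion has to honour, written with the generic vector extension
rather than the LSX/LASX intrinsics; the selector values below are made
up purely for illustration:

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  shuffle_example (v4si a, v4si b)
  {
    /* __builtin_shuffle reduces selector elements modulo 2*N in the
       two-input form: 9 % 8 == 1 selects a[1] and 15 % 8 == 7 selects
       b[3], even though 9 and 15 are outside [0, 7].  */
    v4si sel = { 0, 9, 2, 15 };
    return __builtin_shuffle (a, b, sel);  /* { a[0], a[1], a[2], b[3] } */
  }

If vshuf/xvshuf handle selector elements outside [0, 2*N) differently
(which is my understanding of why the fix is needed), masking the
selector with vand/xvand first restores the documented modulo
behaviour.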

I'm still unsure about the "partly saved register" issue (I'll need to
resolve similar issues for "ILP32 ABI on loongarch64"), but it seems GCC
just doesn't attempt to preserve any vectors in registers across a
function call.
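
A rough sketch of what I mean, assuming the upper bits of the vector
registers are effectively call-clobbered (which is my reading of the
"partly saved" situation); the type and function names here are only
for illustration:

  typedef int v8si __attribute__ ((vector_size (32)));

  extern void unrelated_call (void);

  v8si
  keep_live_across_call (v8si x)
  {
    /* x is live across the call; with no fully call-saved vector
       registers, GCC has to spill it to the stack around the call
       instead of keeping it in an LASX register.  */
    unrelated_call ();
    return x + x;
  }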

After the patches are committed, I (and maybe Xuerui) will perform a
full system rebuild with LASX enabled to see if there are subtle
issues.  IMO we still have plenty of time to fix them (if there are
any) before the GCC 14 release.

> - Rebased on the top of Yang Yujie's latest target configuration interface patch set
>   (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628772.html).
> 
> Brief history of patch set:
> v1 -> v2:
> - Reduce usage of "unspec" in RTL template.
> - Append Support of ADDR_REG_REG in LSX and LASX.
> - Constraint docs are appended in gcc/doc/md.texi and ccomment block.
> - Codes related to vecarg are removed.
> - Testsuite of LSX and LASX is added in v2. (Because of the size limitation of
>   mail list, these patches are not shown)
> - Adjust the loongarch_expand_vector_init() function to reduce instruction 
> output amount.
> - Some minor implementation changes of RTL templates.
> 
> v2 -> v3:
> - Revert vabsd/xvabsd RTL templates to unspec impl.
> - Resolve warning in gcc/config/loongarch/loongarch.cc when bootstrapping 
>   with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
> - Remove redundant definitions in lasxintrin.h.
> - Refine commit info.
> 
> v3 -> v4:
> - Code simplification.
> - Testsuite patches are splited from this patch set again and will be
>   submitted independently in the future.
> 
> v4 -> v5:
> - Regression test fix (pr54346.c)
> - Combine vilvh/xvilvh insn's RTL template impl.
> - Add dg-skip-if for loongarch*-*-* in vshuf test inside g++.dg/torture
>   (reverted in this version)
> 
> Lulu Cheng (4):
>   LoongArch: Add Loongson SX base instruction support.
>   LoongArch: Add Loongson SX directive builtin function support.
>   LoongArch: Add Loongson ASX base instruction support.
>   LoongArch: Add Loongson ASX directive builtin function support.
> 
>  gcc/config.gcc                                |    2 +-
>  gcc/config/loongarch/constraints.md           |  131 +-
>  gcc/config/loongarch/genopts/loongarch.opt.in |    4 +
>  gcc/config/loongarch/lasx.md                  | 5104 ++++++++++++++++
>  gcc/config/loongarch/lasxintrin.h             | 5338 +++++++++++++++++
>  gcc/config/loongarch/loongarch-builtins.cc    | 2686 ++++++++-
>  gcc/config/loongarch/loongarch-ftypes.def     |  666 +-
>  gcc/config/loongarch/loongarch-modes.def      |   39 +
>  gcc/config/loongarch/loongarch-protos.h       |   35 +
>  gcc/config/loongarch/loongarch.cc             | 4751 ++++++++++++++-
>  gcc/config/loongarch/loongarch.h              |  117 +-
>  gcc/config/loongarch/loongarch.md             |   56 +-
>  gcc/config/loongarch/loongarch.opt            |    4 +
>  gcc/config/loongarch/lsx.md                   | 4467 ++++++++++++++
>  gcc/config/loongarch/lsxintrin.h              | 5181 ++++++++++++++++
>  gcc/config/loongarch/predicates.md            |  333 +-
>  gcc/doc/md.texi                               |   11 +
>  17 files changed, 28645 insertions(+), 280 deletions(-)
>  create mode 100644 gcc/config/loongarch/lasx.md
>  create mode 100644 gcc/config/loongarch/lasxintrin.h
>  create mode 100644 gcc/config/loongarch/lsx.md
>  create mode 100644 gcc/config/loongarch/lsxintrin.h
> 

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target.
  2023-08-31  9:41 ` [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Xi Ruoyao
@ 2023-08-31 10:11   ` Chenghui Pan
  0 siblings, 0 replies; 8+ messages in thread
From: Chenghui Pan @ 2023-08-31 10:11 UTC (permalink / raw)
  To: Xi Ruoyao, gcc-patches; +Cc: i, chenglulu, xuchenghua

Thanks for the testing work! I will keep trying to find and resolve
subtle issues as well (for example, by using the compiler to build some
large projects). I'm also curious about the partly saved register
problem and will look into it further in the future.

On Thu, 2023-08-31 at 17:41 +0800, Xi Ruoyao wrote:
> On Thu, 2023-08-31 at 17:08 +0800, Chenghui Pan wrote:
> > This is an update of:
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628303.html
> > 
> > Changes since last version of patch set:
> > - "dg-skip-if"-related Changes of the g++.dg/torture/vshuf*
> > testcases are reverted.
> >   (Replaced by __builtin_shuffle fix)
> > - Add fix of __builtin_shuffle() for Loongson SX/ASX (Implemeted by
> > adding
> >   vand/xvand insn in front of shuffle operation). There's no
> > significant performance
> >   impact in current state.
> 
> I think it's the correct fix, thanks!
> 
> I'm still unsure about the "partly saved register" issue (I'll need
> to
> resolve similar issues for "ILP32 ABI on loongarch64") but it seems
> GCC
> just don't attempt to preserve any vectors in register across
> function
> call.
> 
> After the patches are committed I (and Xuerui, maybe) will perform
> full
> system rebuild with LASX enabled to see if there are subtle issues. 
> IMO
> we still have plenty of time to fix them (if there are any) before
> GCC
> 14 release.
> 
> > - Rebased on the top of Yang Yujie's latest target configuration
> > interface patch set
> >  
> > (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628772.html)
> > .
> > 
> > Brief history of patch set:
> > v1 -> v2:
> > - Reduce usage of "unspec" in RTL template.
> > - Append Support of ADDR_REG_REG in LSX and LASX.
> > - Constraint docs are appended in gcc/doc/md.texi and ccomment
> > block.
> > - Codes related to vecarg are removed.
> > - Testsuite of LSX and LASX is added in v2. (Because of the size
> > limitation of
> >   mail list, these patches are not shown)
> > - Adjust the loongarch_expand_vector_init() function to reduce
> > instruction 
> > output amount.
> > - Some minor implementation changes of RTL templates.
> > 
> > v2 -> v3:
> > - Revert vabsd/xvabsd RTL templates to unspec impl.
> > - Resolve warning in gcc/config/loongarch/loongarch.cc when
> > bootstrapping 
> >   with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -
> > mlasx".
> > - Remove redundant definitions in lasxintrin.h.
> > - Refine commit info.
> > 
> > v3 -> v4:
> > - Code simplification.
> > - Testsuite patches are splited from this patch set again and will
> > be
> >   submitted independently in the future.
> > 
> > v4 -> v5:
> > - Regression test fix (pr54346.c)
> > - Combine vilvh/xvilvh insn's RTL template impl.
> > - Add dg-skip-if for loongarch*-*-* in vshuf test inside
> > g++.dg/torture
> >   (reverted in this version)
> > 
> > Lulu Cheng (4):
> >   LoongArch: Add Loongson SX base instruction support.
> >   LoongArch: Add Loongson SX directive builtin function support.
> >   LoongArch: Add Loongson ASX base instruction support.
> >   LoongArch: Add Loongson ASX directive builtin function support.
> > 
> >  gcc/config.gcc                                |    2 +-
> >  gcc/config/loongarch/constraints.md           |  131 +-
> >  gcc/config/loongarch/genopts/loongarch.opt.in |    4 +
> >  gcc/config/loongarch/lasx.md                  | 5104
> > ++++++++++++++++
> >  gcc/config/loongarch/lasxintrin.h             | 5338
> > +++++++++++++++++
> >  gcc/config/loongarch/loongarch-builtins.cc    | 2686 ++++++++-
> >  gcc/config/loongarch/loongarch-ftypes.def     |  666 +-
> >  gcc/config/loongarch/loongarch-modes.def      |   39 +
> >  gcc/config/loongarch/loongarch-protos.h       |   35 +
> >  gcc/config/loongarch/loongarch.cc             | 4751
> > ++++++++++++++-
> >  gcc/config/loongarch/loongarch.h              |  117 +-
> >  gcc/config/loongarch/loongarch.md             |   56 +-
> >  gcc/config/loongarch/loongarch.opt            |    4 +
> >  gcc/config/loongarch/lsx.md                   | 4467
> > ++++++++++++++
> >  gcc/config/loongarch/lsxintrin.h              | 5181
> > ++++++++++++++++
> >  gcc/config/loongarch/predicates.md            |  333 +-
> >  gcc/doc/md.texi                               |   11 +
> >  17 files changed, 28645 insertions(+), 280 deletions(-)
> >  create mode 100644 gcc/config/loongarch/lasx.md
> >  create mode 100644 gcc/config/loongarch/lasxintrin.h
> >  create mode 100644 gcc/config/loongarch/lsx.md
> >  create mode 100644 gcc/config/loongarch/lsxintrin.h
> > 
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pushed] [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target.
  2023-08-31  9:08 [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
                   ` (4 preceding siblings ...)
  2023-08-31  9:41 ` [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Xi Ruoyao
@ 2023-09-05 11:34 ` chenglulu
  5 siblings, 0 replies; 8+ messages in thread
From: chenglulu @ 2023-09-05 11:34 UTC (permalink / raw)
  To: Chenghui Pan, gcc-patches; +Cc: xry111, i, xuchenghua

Pushed to r14-3700.

On 2023/8/31 at 5:08 PM, Chenghui Pan wrote:
> This is an update of:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628303.html
>
> Changes since last version of patch set:
> - "dg-skip-if"-related Changes of the g++.dg/torture/vshuf* testcases are reverted.
>    (Replaced by __builtin_shuffle fix)
> - Add fix of __builtin_shuffle() for Loongson SX/ASX (Implemeted by adding
>    vand/xvand insn in front of shuffle operation). There's no significant performance
>    impact in current state.
> - Rebased on the top of Yang Yujie's latest target configuration interface patch set
>    (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628772.html).
>
> Brief history of patch set:
> v1 -> v2:
> - Reduce usage of "unspec" in RTL template.
> - Append Support of ADDR_REG_REG in LSX and LASX.
> - Constraint docs are appended in gcc/doc/md.texi and ccomment block.
> - Codes related to vecarg are removed.
> - Testsuite of LSX and LASX is added in v2. (Because of the size limitation of
>    mail list, these patches are not shown)
> - Adjust the loongarch_expand_vector_init() function to reduce instruction
> output amount.
> - Some minor implementation changes of RTL templates.
>
> v2 -> v3:
> - Revert vabsd/xvabsd RTL templates to unspec impl.
> - Resolve warning in gcc/config/loongarch/loongarch.cc when bootstrapping
>    with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
> - Remove redundant definitions in lasxintrin.h.
> - Refine commit info.
>
> v3 -> v4:
> - Code simplification.
> - Testsuite patches are splited from this patch set again and will be
>    submitted independently in the future.
>
> v4 -> v5:
> - Regression test fix (pr54346.c)
> - Combine vilvh/xvilvh insn's RTL template impl.
> - Add dg-skip-if for loongarch*-*-* in vshuf test inside g++.dg/torture
>    (reverted in this version)
>
> Lulu Cheng (4):
>    LoongArch: Add Loongson SX base instruction support.
>    LoongArch: Add Loongson SX directive builtin function support.
>    LoongArch: Add Loongson ASX base instruction support.
>    LoongArch: Add Loongson ASX directive builtin function support.
>
>   gcc/config.gcc                                |    2 +-
>   gcc/config/loongarch/constraints.md           |  131 +-
>   gcc/config/loongarch/genopts/loongarch.opt.in |    4 +
>   gcc/config/loongarch/lasx.md                  | 5104 ++++++++++++++++
>   gcc/config/loongarch/lasxintrin.h             | 5338 +++++++++++++++++
>   gcc/config/loongarch/loongarch-builtins.cc    | 2686 ++++++++-
>   gcc/config/loongarch/loongarch-ftypes.def     |  666 +-
>   gcc/config/loongarch/loongarch-modes.def      |   39 +
>   gcc/config/loongarch/loongarch-protos.h       |   35 +
>   gcc/config/loongarch/loongarch.cc             | 4751 ++++++++++++++-
>   gcc/config/loongarch/loongarch.h              |  117 +-
>   gcc/config/loongarch/loongarch.md             |   56 +-
>   gcc/config/loongarch/loongarch.opt            |    4 +
>   gcc/config/loongarch/lsx.md                   | 4467 ++++++++++++++
>   gcc/config/loongarch/lsxintrin.h              | 5181 ++++++++++++++++
>   gcc/config/loongarch/predicates.md            |  333 +-
>   gcc/doc/md.texi                               |   11 +
>   17 files changed, 28645 insertions(+), 280 deletions(-)
>   create mode 100644 gcc/config/loongarch/lasx.md
>   create mode 100644 gcc/config/loongarch/lasxintrin.h
>   create mode 100644 gcc/config/loongarch/lsx.md
>   create mode 100644 gcc/config/loongarch/lsxintrin.h
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-09-05 11:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-31  9:08 [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
2023-08-31  9:08 ` [PATCH v6 1/4] LoongArch: Add Loongson SX base instruction support Chenghui Pan
2023-08-31  9:08 ` [PATCH v6 2/4] LoongArch: Add Loongson SX directive builtin function support Chenghui Pan
2023-08-31  9:08 ` [PATCH v6 3/4] LoongArch: Add Loongson ASX base instruction support Chenghui Pan
2023-08-31  9:08 ` [PATCH v6 4/4] LoongArch: Add Loongson ASX directive builtin function support Chenghui Pan
2023-08-31  9:41 ` [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target Xi Ruoyao
2023-08-31 10:11   ` Chenghui Pan
2023-09-05 11:34 ` chenglulu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).