From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id 9C3383857710 for ; Fri, 30 Jun 2023 02:17:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9C3383857710 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=loongson.cn Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qF3hD-0004re-F6 for gcc-patches@gcc.gnu.org; Thu, 29 Jun 2023 22:17:23 -0400 Received: from loongson.cn (unknown [10.20.4.10]) by gateway (Coremail) with SMTP id _____8BxlMQhO55kEB4EAA--.6617S3; Fri, 30 Jun 2023 10:17:05 +0800 (CST) Received: from loongson-pc.loongson.cn (unknown [10.20.4.10]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxfSMTO55kbnsSAA--.14825S6; Fri, 30 Jun 2023 10:17:02 +0800 (CST) From: Chenghui Pan To: gcc-patches@gcc.gnu.org Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn Subject: [PATCH v1 2/6] LoongArch: Added Loongson SX base instruction support. 
Date: Fri, 30 Jun 2023 10:16:10 +0800 Message-Id: <20230630021614.57201-3-panchenghui@loongson.cn> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230630021614.57201-1-panchenghui@loongson.cn> References: <20230630021614.57201-1-panchenghui@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=114.242.206.163; envelope-from=panchenghui@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9,SPF_HELO_NONE=0.001,SPF_PASS=-0.001,T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-14.4 required=5.0 
tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_STATUS,KAM_STOCKGEN,SPF_FAIL,SPF_HELO_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: From: Lulu Cheng gcc/ChangeLog: * config/loongarch/constraints.md (M): Added Loongson LSX base instruction support. (N): Ditto. (O): Ditto. (P): Ditto. (R): Ditto. (S): Ditto. (YG): Ditto. (YA): Ditto. (YB): Ditto. (Yb): Ditto. (Yh): Ditto. (Yw): Ditto. (YI): Ditto. (YC): Ditto. (YZ): Ditto. (Unv5): Ditto. (Uuv5): Ditto. (Usv5): Ditto. (Uuv6): Ditto. (Urv8): Ditto. * config/loongarch/loongarch-builtins.cc (loongarch_gen_const_int_vector): Ditto. * config/loongarch/loongarch-modes.def (VECTOR_MODES): Ditto. (VECTOR_MODE): Ditto. (INT_MODE): Ditto. * config/loongarch/loongarch-protos.h (loongarch_split_move_insn_p): Ditto. (loongarch_split_move_insn): Ditto. (loongarch_split_128bit_move): Ditto. (loongarch_split_128bit_move_p): Ditto. (loongarch_split_lsx_copy_d): Ditto. (loongarch_split_lsx_insert_d): Ditto. (loongarch_split_lsx_fill_d): Ditto. (loongarch_expand_vec_cmp): Ditto. (loongarch_const_vector_same_val_p): Ditto. (loongarch_const_vector_same_bytes_p): Ditto. (loongarch_const_vector_same_int_p): Ditto. (loongarch_const_vector_shuffle_set_p): Ditto. (loongarch_const_vector_bitimm_set_p): Ditto. (loongarch_const_vector_bitimm_clr_p): Ditto. (loongarch_lsx_vec_parallel_const_half): Ditto. (loongarch_gen_const_int_vector): Ditto. (loongarch_lsx_output_division): Ditto. (loongarch_expand_vector_init): Ditto. (loongarch_expand_vec_unpack): Ditto. (loongarch_expand_vec_perm): Ditto. (loongarch_expand_vector_extract): Ditto. (loongarch_expand_vector_reduc): Ditto. (loongarch_ldst_scaled_shift): Ditto. (loongarch_expand_vec_cond_expr): Ditto. (loongarch_expand_vec_cond_mask_expr): Ditto. (loongarch_builtin_vectorized_function): Ditto. (loongarch_gen_const_int_vector_shuffle): Ditto. (loongarch_build_signbit_mask): Ditto. 
* config/loongarch/loongarch.cc (loongarch_flatten_aggregate_field): Ditto. (loongarch_flatten_aggregate_argument): Ditto. (loongarch_pass_aggregate_num_fpr): Ditto. (loongarch_pass_aggregate_in_fpr_and_gpr_p): Ditto. (loongarch_get_arg_info): Ditto. (loongarch_setup_incoming_varargs): Ditto. (loongarch_emit_move): Ditto. (loongarch_const_vector_bitimm_set_p): Ditto. (loongarch_const_vector_bitimm_clr_p): Ditto. (loongarch_const_vector_same_val_p): Ditto. (loongarch_const_vector_same_bytes_p): Ditto. (loongarch_const_vector_same_int_p): Ditto. (loongarch_const_vector_shuffle_set_p): Ditto. (loongarch_symbol_insns): Ditto. (loongarch_cannot_force_const_mem): Ditto. (loongarch_valid_offset_p): Ditto. (loongarch_valid_index_p): Ditto. (loongarch_classify_address): Ditto. (loongarch_address_insns): Ditto. (loongarch_ldst_scaled_shift): Ditto. (loongarch_const_insns): Ditto. (loongarch_split_move_insn_p): Ditto. (loongarch_subword_at_byte): Ditto. (loongarch_legitimize_move): Ditto. (loongarch_builtin_vectorization_cost): Ditto. (loongarch_split_move_p): Ditto. (loongarch_split_move): Ditto. (loongarch_split_move_insn): Ditto. (loongarch_output_move_index): Ditto. (loongarch_output_move_index_float): Ditto. (loongarch_split_128bit_move_p): Ditto. (loongarch_split_128bit_move): Ditto. (loongarch_split_lsx_copy_d): Ditto. (loongarch_split_lsx_insert_d): Ditto. (loongarch_split_lsx_fill_d): Ditto. (loongarch_output_move): Ditto. (loongarch_extend_comparands): Ditto. (loongarch_print_operand_reloc): Ditto. (loongarch_print_operand): Ditto. (loongarch_hard_regno_mode_ok_uncached): Ditto. (loongarch_hard_regno_call_part_clobbered): Ditto. (loongarch_hard_regno_nregs): Ditto. (loongarch_class_max_nregs): Ditto. (loongarch_can_change_mode_class): Ditto. (loongarch_mode_ok_for_mov_fmt_p): Ditto. (loongarch_secondary_reload): Ditto. (loongarch_vector_mode_supported_p): Ditto. (loongarch_preferred_simd_mode): Ditto. (loongarch_autovectorize_vector_modes): Ditto. 
(loongarch_lsx_output_division): Ditto. (loongarch_option_override_internal): Ditto. (loongarch_hard_regno_caller_save_mode): Ditto. (MAX_VECT_LEN): Ditto. (loongarch_spill_class): Ditto. (struct expand_vec_perm_d): Ditto. (loongarch_promote_function_mode): Ditto. (loongarch_expand_vselect): Ditto. (loongarch_starting_frame_offset): Ditto. (loongarch_expand_vselect_vconcat): Ditto. (TARGET_ASM_ALIGNED_DI_OP): Ditto. (TARGET_OPTION_OVERRIDE): Ditto. (TARGET_LEGITIMIZE_ADDRESS): Ditto. (loongarch_expand_lsx_shuffle): Ditto. (TARGET_ASM_SELECT_RTX_SECTION): Ditto. (TARGET_ASM_FUNCTION_RODATA_SECTION): Ditto. (TARGET_SCHED_INIT): Ditto. (TARGET_SCHED_REORDER): Ditto. (TARGET_SCHED_REORDER2): Ditto. (TARGET_SCHED_VARIABLE_ISSUE): Ditto. (TARGET_SCHED_ADJUST_COST): Ditto. (TARGET_SCHED_ISSUE_RATE): Ditto. (TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Ditto. (TARGET_FUNCTION_OK_FOR_SIBCALL): Ditto. (TARGET_VALID_POINTER_MODE): Ditto. (TARGET_REGISTER_MOVE_COST): Ditto. (TARGET_MEMORY_MOVE_COST): Ditto. (TARGET_RTX_COSTS): Ditto. (TARGET_ADDRESS_COST): Ditto. (TARGET_IN_SMALL_DATA_P): Ditto. (TARGET_PREFERRED_RELOAD_CLASS): Ditto. (TARGET_ASM_FILE_START_FILE_DIRECTIVE): Ditto. (loongarch_expand_vec_perm): Ditto. (TARGET_EXPAND_BUILTIN_VA_START): Ditto. (TARGET_PROMOTE_FUNCTION_MODE): Ditto. (TARGET_RETURN_IN_MEMORY): Ditto. (TARGET_FUNCTION_VALUE): Ditto. (TARGET_LIBCALL_VALUE): Ditto. (loongarch_try_expand_lsx_vshuf_const): Ditto. (TARGET_ASM_OUTPUT_MI_THUNK): Ditto. (TARGET_ASM_CAN_OUTPUT_MI_THUNK): Ditto. (TARGET_PRINT_OPERAND): Ditto. (TARGET_PRINT_OPERAND_ADDRESS): Ditto. (TARGET_PRINT_OPERAND_PUNCT_VALID_P): Ditto. (TARGET_SETUP_INCOMING_VARARGS): Ditto. (TARGET_STRICT_ARGUMENT_NAMING): Ditto. (TARGET_MUST_PASS_IN_STACK): Ditto. (TARGET_PASS_BY_REFERENCE): Ditto. (TARGET_ARG_PARTIAL_BYTES): Ditto. (TARGET_FUNCTION_ARG): Ditto. (TARGET_FUNCTION_ARG_ADVANCE): Ditto. (TARGET_FUNCTION_ARG_BOUNDARY): Ditto. (TARGET_SCALAR_MODE_SUPPORTED_P): Ditto. 
(TARGET_INIT_BUILTINS): Ditto. (TARGET_BUILTIN_DECL): Ditto. (TARGET_EXPAND_BUILTIN): Ditto. (loongarch_expand_vec_perm_const_1): Ditto. (loongarch_expand_vec_perm_const_2): Ditto. (loongarch_vectorize_vec_perm_const): Ditto. (loongarch_sched_reassociation_width): Ditto. (loongarch_expand_vector_extract): Ditto. (emit_reduc_half): Ditto. (loongarch_expand_vector_reduc): Ditto. (loongarch_expand_vec_unpack): Ditto. (loongarch_lsx_vec_parallel_const_half): Ditto. (loongarch_constant_elt_p): Ditto. (loongarch_gen_const_int_vector_shuffle): Ditto. (loongarch_expand_vector_init): Ditto. (loongarch_expand_lsx_cmp): Ditto. (loongarch_expand_vec_cond_expr): Ditto. (loongarch_expand_vec_cond_mask_expr): Ditto. (loongarch_expand_vec_cmp): Ditto. (loongarch_case_values_threshold): Ditto. (loongarch_build_const_vector): Ditto. (loongarch_build_signbit_mask): Ditto. (loongarch_builtin_support_vector_misalignment): Ditto. (TARGET_ASM_ALIGNED_HI_OP): Ditto. (TARGET_ASM_ALIGNED_SI_OP): Ditto. (TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Ditto. (TARGET_VECTOR_MODE_SUPPORTED_P): Ditto. (TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto. (TARGET_VECTORIZE_VEC_PERM_CONST): Ditto. (TARGET_SCHED_REASSOCIATION_WIDTH): Ditto. (TARGET_CASE_VALUES_THRESHOLD): Ditto. (TARGET_HARD_REGNO_CALL_PART_CLOBBERED): Ditto. (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): Ditto. * config/loongarch/loongarch.h (TARGET_SUPPORTS_WIDE_INT): Ditto. (UNITS_PER_LSX_REG): Ditto. (BITS_PER_LSX_REG): Ditto. (BIGGEST_ALIGNMENT): Ditto. (LSX_REG_FIRST): Ditto. (LSX_REG_LAST): Ditto. (LSX_REG_NUM): Ditto. (LSX_REG_P): Ditto. (LSX_REG_RTX_P): Ditto. (IMM13_OPERAND): Ditto. (LSX_SUPPORTED_MODE_P): Ditto. * config/loongarch/loongarch.md (unknown,add,sub,not,nor,and,or,xor): Ditto. (unknown,add,sub,not,nor,and,or,xor,simd_add): Ditto. (unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC): Ditto. (mode" ): Ditto. (DF): Ditto. (SF): Ditto. (sf): Ditto. (DI): Ditto. (SI): Ditto. 
* config/loongarch/predicates.md (const_lsx_branch_operand): Ditto. (const_uimm3_operand): Ditto. (const_8_to_11_operand): Ditto. (const_12_to_15_operand): Ditto. (const_uimm4_operand): Ditto. (const_uimm6_operand): Ditto. (const_uimm7_operand): Ditto. (const_uimm8_operand): Ditto. (const_imm5_operand): Ditto. (const_imm10_operand): Ditto. (const_imm13_operand): Ditto. (reg_imm10_operand): Ditto. (aq8b_operand): Ditto. (aq8h_operand): Ditto. (aq8w_operand): Ditto. (aq8d_operand): Ditto. (aq10b_operand): Ditto. (aq10h_operand): Ditto. (aq10w_operand): Ditto. (aq10d_operand): Ditto. (aq12b_operand): Ditto. (aq12h_operand): Ditto. (aq12w_operand): Ditto. (aq12d_operand): Ditto. (const_m1_operand): Ditto. (reg_or_m1_operand): Ditto. (const_exp_2_operand): Ditto. (const_exp_4_operand): Ditto. (const_exp_8_operand): Ditto. (const_exp_16_operand): Ditto. (const_exp_32_operand): Ditto. (const_0_or_1_operand): Ditto. (const_0_to_3_operand): Ditto. (const_0_to_7_operand): Ditto. (const_2_or_3_operand): Ditto. (const_4_to_7_operand): Ditto. (const_8_to_15_operand): Ditto. (const_16_to_31_operand): Ditto. (qi_mask_operand): Ditto. (hi_mask_operand): Ditto. (si_mask_operand): Ditto. (d_operand): Ditto. (db4_operand): Ditto. (db7_operand): Ditto. (db8_operand): Ditto. (ib3_operand): Ditto. (sb4_operand): Ditto. (sb5_operand): Ditto. (sb8_operand): Ditto. (sd8_operand): Ditto. (ub4_operand): Ditto. (ub8_operand): Ditto. (uh4_operand): Ditto. (uw4_operand): Ditto. (uw5_operand): Ditto. (uw6_operand): Ditto. (uw8_operand): Ditto. (addiur2_operand): Ditto. (addiusp_operand): Ditto. (andi16_operand): Ditto. (movep_src_register): Ditto. (movep_src_operand): Ditto. (fcc_reload_operand): Ditto. (muldiv_target_operand): Ditto. (const_vector_same_val_operand): Ditto. (const_vector_same_simm5_operand): Ditto. (const_vector_same_uimm5_operand): Ditto. (const_vector_same_ximm5_operand): Ditto. (const_vector_same_uimm6_operand): Ditto. (par_const_vector_shf_set_operand): Ditto. 
(reg_or_vector_same_val_operand): Ditto. (reg_or_vector_same_simm5_operand): Ditto. (reg_or_vector_same_uimm5_operand): Ditto. (reg_or_vector_same_ximm5_operand): Ditto. (reg_or_vector_same_uimm6_operand): Ditto. * config/loongarch/lsx.md: New file. --- gcc/config/loongarch/constraints.md | 128 +- gcc/config/loongarch/loongarch-builtins.cc | 10 + gcc/config/loongarch/loongarch-modes.def | 38 + gcc/config/loongarch/loongarch-protos.h | 31 + gcc/config/loongarch/loongarch.cc | 2235 +++++++++- gcc/config/loongarch/loongarch.h | 65 +- gcc/config/loongarch/loongarch.md | 44 +- gcc/config/loongarch/lsx.md | 4490 ++++++++++++++++++++ gcc/config/loongarch/predicates.md | 333 +- 9 files changed, 7184 insertions(+), 190 deletions(-) create mode 100644 gcc/config/loongarch/lsx.md diff --git a/gcc/config/loongarch/constraints.md b/gcc/config/loongarch/constraints.md index 7a38cd07ae9..1dd56af07c4 100644 --- a/gcc/config/loongarch/constraints.md +++ b/gcc/config/loongarch/constraints.md @@ -30,8 +30,7 @@ ;; "h" <-----unused ;; "i" "Matches a general integer constant." (Global non-architectural) ;; "j" SIBCALL_REGS -;; "k" "A memory operand whose address is formed by a base register and -;; (optionally scaled) index register." +;; "k" <-----unused ;; "l" "A signed 16-bit constant." ;; "m" "A memory operand whose address is formed by a base register and offset ;; that is suitable for use in instructions with the same addressing mode @@ -80,13 +79,14 @@ ;; "N" <-----unused ;; "O" <-----unused ;; "P" <-----unused -;; "Q" <-----unused +;; "Q" "A signed 12-bit constant" ;; "R" <-----unused ;; "S" <-----unused ;; "T" <-----unused ;; "U" <-----unused ;; "V" "Matches a non-offsettable memory reference." (Global non-architectural) -;; "W" <-----unused +;; "W" "A memory address based on a member of @code{BASE_REG_CLASS}. This is +;; true for all references." ;; "X" "Matches anything." 
(Global non-architectural) ;; "Y" - ;; "Yd" @@ -214,6 +214,63 @@ (define_constraint "Le" (and (match_code "const_int") (match_test "loongarch_addu16i_imm12_operand_p (ival, SImode)"))) +(define_constraint "M" + "A constant that cannot be loaded using @code{lui}, @code{addiu} + or @code{ori}." + (and (match_code "const_int") + (not (match_test "IMM12_OPERAND (ival)")) + (not (match_test "IMM12_OPERAND_UNSIGNED (ival)")) + (not (match_test "LU12I_OPERAND (ival)")))) + +(define_constraint "N" + "A constant in the range -65535 to -1 (inclusive)." + (and (match_code "const_int") + (match_test "ival >= -0xffff && ival < 0"))) + +(define_constraint "O" + "A signed 15-bit constant." + (and (match_code "const_int") + (match_test "ival >= -0x4000 && ival < 0x4000"))) + +(define_constraint "P" + "A constant in the range 1 to 65535 (inclusive)." + (and (match_code "const_int") + (match_test "ival > 0 && ival < 0x10000"))) + +;; General constraints + +(define_memory_constraint "R" + "An address that can be used in a non-macro load or store." + (and (match_code "mem") + (match_test "loongarch_address_insns (XEXP (op, 0), mode, false) == 1"))) +(define_constraint "S" + "@internal + A constant call address." + (and (match_operand 0 "call_insn_operand") + (match_test "CONSTANT_P (op)"))) + +(define_constraint "YG" + "@internal + A vector zero." + (and (match_code "const_vector") + (match_test "op == CONST0_RTX (mode)"))) + +(define_constraint "YA" + "@internal + An unsigned 6-bit constant." + (and (match_code "const_int") + (match_test "UIMM6_OPERAND (ival)"))) + +(define_constraint "YB" + "@internal + A signed 10-bit constant." 
+ (and (match_code "const_int") + (match_test "IMM10_OPERAND (ival)"))) + +(define_constraint "Yb" + "@internal" + (match_operand 0 "qi_mask_operand")) + (define_constraint "Yd" "@internal A constant @code{move_operand} that can be safely loaded using @@ -221,10 +278,73 @@ (define_constraint "Yd" (and (match_operand 0 "move_operand") (match_test "CONSTANT_P (op)"))) +(define_constraint "Yh" + "@internal" + (match_operand 0 "hi_mask_operand")) + +(define_constraint "Yw" + "@internal" + (match_operand 0 "si_mask_operand")) + (define_constraint "Yx" "@internal" (match_operand 0 "low_bitmask_operand")) +(define_constraint "YI" + "@internal + A replicated vector const in which the replicated value is in the range + [-512,511]." + (and (match_code "const_vector") + (match_test "loongarch_const_vector_same_int_p (op, mode, -512, 511)"))) + +(define_constraint "YC" + "@internal + A replicated vector const in which the replicated value has a single + bit set." + (and (match_code "const_vector") + (match_test "loongarch_const_vector_bitimm_set_p (op, mode)"))) + +(define_constraint "YZ" + "@internal + A replicated vector const in which the replicated value has a single + bit clear." + (and (match_code "const_vector") + (match_test "loongarch_const_vector_bitimm_clr_p (op, mode)"))) + +(define_constraint "Unv5" + "@internal + A replicated vector const in which the replicated value is in the range + [-31,0]." + (and (match_code "const_vector") + (match_test "loongarch_const_vector_same_int_p (op, mode, -31, 0)"))) + +(define_constraint "Uuv5" + "@internal + A replicated vector const in which the replicated value is in the range + [0,31]." + (and (match_code "const_vector") + (match_test "loongarch_const_vector_same_int_p (op, mode, 0, 31)"))) + +(define_constraint "Usv5" + "@internal + A replicated vector const in which the replicated value is in the range + [-16,15]." 
+ (and (match_code "const_vector") + (match_test "loongarch_const_vector_same_int_p (op, mode, -16, 15)"))) + +(define_constraint "Uuv6" + "@internal + A replicated vector const in which the replicated value is in the range + [0,63]." + (and (match_code "const_vector") + (match_test "loongarch_const_vector_same_int_p (op, mode, 0, 63)"))) + +(define_constraint "Urv8" + "@internal + A replicated vector const with replicated byte values as well as elements" + (and (match_code "const_vector") + (match_test "loongarch_const_vector_same_bytes_p (op, mode)"))) + (define_memory_constraint "ZC" "A memory operand whose address is formed by a base register and offset that is suitable for use in instructions with the same addressing mode diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc index b929f224dfa..ebe70a986c3 100644 --- a/gcc/config/loongarch/loongarch-builtins.cc +++ b/gcc/config/loongarch/loongarch-builtins.cc @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. If not see #include "fold-const.h" #include "expr.h" #include "langhooks.h" +#include "emit-rtl.h" /* Macros to create an enumeration identifier for a function prototype. */ #define LARCH_FTYPE_NAME1(A, B) LARCH_##A##_FTYPE_##B @@ -297,6 +298,15 @@ loongarch_prepare_builtin_arg (struct expand_operand *op, tree exp, create_input_operand (op, value, TYPE_MODE (TREE_TYPE (arg))); } +/* Return a const_int vector of VAL with mode MODE. */ + +rtx +loongarch_gen_const_int_vector (machine_mode mode, HOST_WIDE_INT val) +{ + rtx c = gen_int_mode (val, GET_MODE_INNER (mode)); + return gen_const_vec_duplicate (mode, c); +} + /* Expand instruction ICODE as part of a built-in function sequence. Use the first NOPS elements of OPS as the instruction's operands. 
HAS_TARGET_P is true if operand 0 is a target; it is false if the diff --git a/gcc/config/loongarch/loongarch-modes.def b/gcc/config/loongarch/loongarch-modes.def index 8082ce993a5..6f57b60525d 100644 --- a/gcc/config/loongarch/loongarch-modes.def +++ b/gcc/config/loongarch/loongarch-modes.def @@ -23,3 +23,41 @@ FLOAT_MODE (TF, 16, ieee_quad_format); /* For floating point conditions in FCC registers. */ CC_MODE (FCC); + +/* Vector modes. */ +VECTOR_MODES (INT, 4); /* V4QI V2HI */ +VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI */ +VECTOR_MODES (FLOAT, 8); /* V4HF V2SF */ + +/* For LARCH LSX 128 bits. */ +VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI */ +VECTOR_MODES (FLOAT, 16); /* V4SF V2DF */ + +VECTOR_MODES (INT, 32); /* V32QI V16HI V8SI V4DI */ +VECTOR_MODES (FLOAT, 32); /* V8SF V4DF */ + +/* Double-sized vector modes for vec_concat. */ +/* VECTOR_MODE (INT, QI, 32); V32QI */ +/* VECTOR_MODE (INT, HI, 16); V16HI */ +/* VECTOR_MODE (INT, SI, 8); V8SI */ +/* VECTOR_MODE (INT, DI, 4); V4DI */ +/* VECTOR_MODE (FLOAT, SF, 8); V8SF */ +/* VECTOR_MODE (FLOAT, DF, 4); V4DF */ + +VECTOR_MODE (INT, QI, 64); /* V64QI */ +VECTOR_MODE (INT, HI, 32); /* V32HI */ +VECTOR_MODE (INT, SI, 16); /* V16SI */ +VECTOR_MODE (INT, DI, 8); /* V8DI */ +VECTOR_MODE (FLOAT, SF, 16); /* V16SF */ +VECTOR_MODE (FLOAT, DF, 8); /* V8DF */ + +VECTOR_MODES (FRACT, 4); /* V4QQ V2HQ */ +VECTOR_MODES (UFRACT, 4); /* V4UQQ V2UHQ */ +VECTOR_MODES (ACCUM, 4); /* V2HA */ +VECTOR_MODES (UACCUM, 4); /* V2UHA */ + +INT_MODE (OI, 32); + +/* Keep the OI modes from confusing the compiler into thinking + that these modes could actually be used for computation. They are + only holders for vectors during data movement. 
*/ diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h index b71b188507a..fc33527cdcf 100644 --- a/gcc/config/loongarch/loongarch-protos.h +++ b/gcc/config/loongarch/loongarch-protos.h @@ -85,10 +85,18 @@ extern bool loongarch_split_move_p (rtx, rtx); extern void loongarch_split_move (rtx, rtx, rtx); extern bool loongarch_addu16i_imm12_operand_p (HOST_WIDE_INT, machine_mode); extern void loongarch_split_plus_constant (rtx *, machine_mode); +extern bool loongarch_split_move_insn_p (rtx, rtx); +extern void loongarch_split_move_insn (rtx, rtx, rtx); +extern void loongarch_split_128bit_move (rtx, rtx); +extern bool loongarch_split_128bit_move_p (rtx, rtx); +extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx (*)(rtx, rtx, rtx)); +extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx); +extern void loongarch_split_lsx_fill_d (rtx, rtx); extern const char *loongarch_output_move (rtx, rtx); extern bool loongarch_cfun_has_cprestore_slot_p (void); #ifdef RTX_CODE extern void loongarch_expand_scc (rtx *); +extern bool loongarch_expand_vec_cmp (rtx *); extern void loongarch_expand_conditional_branch (rtx *); extern void loongarch_expand_conditional_move (rtx *); extern void loongarch_expand_conditional_trap (rtx); @@ -110,6 +118,15 @@ extern bool loongarch_small_data_pattern_p (rtx); extern rtx loongarch_rewrite_small_data (rtx); extern rtx loongarch_return_addr (int, rtx); +extern bool loongarch_const_vector_same_val_p (rtx, machine_mode); +extern bool loongarch_const_vector_same_bytes_p (rtx, machine_mode); +extern bool loongarch_const_vector_same_int_p (rtx, machine_mode, HOST_WIDE_INT, + HOST_WIDE_INT); +extern bool loongarch_const_vector_shuffle_set_p (rtx, machine_mode); +extern bool loongarch_const_vector_bitimm_set_p (rtx, machine_mode); +extern bool loongarch_const_vector_bitimm_clr_p (rtx, machine_mode); +extern rtx loongarch_lsx_vec_parallel_const_half (machine_mode, bool); +extern rtx 
loongarch_gen_const_int_vector (machine_mode, HOST_WIDE_INT); extern enum reg_class loongarch_secondary_reload_class (enum reg_class, machine_mode, rtx, bool); @@ -129,6 +146,7 @@ extern const char *loongarch_output_equal_conditional_branch (rtx_insn *, rtx *, bool); extern const char *loongarch_output_division (const char *, rtx *); +extern const char *loongarch_lsx_output_division (const char *, rtx *); extern const char *loongarch_output_probe_stack_range (rtx, rtx, rtx); extern bool loongarch_hard_regno_rename_ok (unsigned int, unsigned int); extern int loongarch_dspalu_bypass_p (rtx, rtx); @@ -156,6 +174,13 @@ union loongarch_gen_fn_ptrs extern void loongarch_expand_atomic_qihi (union loongarch_gen_fn_ptrs, rtx, rtx, rtx, rtx, rtx); +extern void loongarch_expand_vector_init (rtx, rtx); +extern void loongarch_expand_vec_unpack (rtx op[2], bool, bool); +extern void loongarch_expand_vec_perm (rtx, rtx, rtx, rtx); +extern void loongarch_expand_vector_extract (rtx, rtx, int); +extern void loongarch_expand_vector_reduc (rtx (*)(rtx, rtx, rtx), rtx, rtx); + +extern int loongarch_ldst_scaled_shift (machine_mode); extern bool loongarch_signed_immediate_p (unsigned HOST_WIDE_INT, int, int); extern bool loongarch_unsigned_immediate_p (unsigned HOST_WIDE_INT, int, int); extern bool loongarch_12bit_offset_address_p (rtx, machine_mode); @@ -171,6 +196,9 @@ extern bool loongarch_split_symbol_type (enum loongarch_symbol_type); typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx); extern void loongarch_register_frame_header_opt (void); +extern void loongarch_expand_vec_cond_expr (machine_mode, machine_mode, rtx *); +extern void loongarch_expand_vec_cond_mask_expr (machine_mode, machine_mode, + rtx *); /* Routines implemented in loongarch-c.c. 
*/ void loongarch_cpu_cpp_builtins (cpp_reader *); @@ -180,6 +208,9 @@ extern void loongarch_atomic_assign_expand_fenv (tree *, tree *, tree *); extern tree loongarch_builtin_decl (unsigned int, bool); extern rtx loongarch_expand_builtin (tree, rtx, rtx subtarget ATTRIBUTE_UNUSED, machine_mode, int); +extern tree loongarch_builtin_vectorized_function (unsigned int, tree, tree); +extern rtx loongarch_gen_const_int_vector_shuffle (machine_mode, int); extern tree loongarch_build_builtin_va_list (void); +extern rtx loongarch_build_signbit_mask (machine_mode, bool, bool); #endif /* ! GCC_LOONGARCH_PROTOS_H */ diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 5b8b93eb24b..11cbb33e3ad 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -310,7 +310,8 @@ typedef struct static int loongarch_flatten_aggregate_field (const_tree type, loongarch_aggregate_field fields[2], int n, - HOST_WIDE_INT offset) + HOST_WIDE_INT offset, + const int use_vecarg_p) { switch (TREE_CODE (type)) { @@ -332,7 +333,7 @@ loongarch_flatten_aggregate_field (const_tree type, HOST_WIDE_INT pos = offset + int_byte_position (f); n = loongarch_flatten_aggregate_field (TREE_TYPE (f), fields, n, - pos); + pos, 0); if (n < 0) return -1; } @@ -346,7 +347,7 @@ loongarch_flatten_aggregate_field (const_tree type, tree elt_size = TYPE_SIZE_UNIT (TREE_TYPE (type)); int n_subfields = loongarch_flatten_aggregate_field (TREE_TYPE (type), subfields, 0, - offset); + offset, 0); /* Can't handle incomplete types nor sizes that are not fixed. 
*/ if (n_subfields <= 0 @@ -399,11 +400,14 @@ loongarch_flatten_aggregate_field (const_tree type, } default: - if (n < 2 + if ((n < 2 && ((SCALAR_FLOAT_TYPE_P (type) && GET_MODE_SIZE (TYPE_MODE (type)) <= UNITS_PER_FP_ARG) || (INTEGRAL_TYPE_P (type) && GET_MODE_SIZE (TYPE_MODE (type)) <= UNITS_PER_WORD))) + || (use_vecarg_p && VECTOR_TYPE_P (type) + && (ISA_HAS_LSX && GET_MODE_SIZE (TYPE_MODE (type)) + <= UNITS_PER_LSX_REG))) { fields[n].type = type; fields[n].offset = offset; @@ -419,12 +423,14 @@ loongarch_flatten_aggregate_field (const_tree type, static int loongarch_flatten_aggregate_argument (const_tree type, - loongarch_aggregate_field fields[2]) + loongarch_aggregate_field fields[2], + const int use_vecarg_p) { - if (!type || TREE_CODE (type) != RECORD_TYPE) + if (!type || !(TREE_CODE (type) == RECORD_TYPE + || (use_vecarg_p && TREE_CODE (type) == VECTOR_TYPE))) return -1; - return loongarch_flatten_aggregate_field (type, fields, 0, 0); + return loongarch_flatten_aggregate_field (type, fields, 0, 0, use_vecarg_p); } /* See whether TYPE is a record whose fields should be returned in one or @@ -432,12 +438,14 @@ loongarch_flatten_aggregate_argument (const_tree type, static unsigned loongarch_pass_aggregate_num_fpr (const_tree type, - loongarch_aggregate_field fields[2]) + loongarch_aggregate_field fields[2], + const int use_vecarg_p) { - int n = loongarch_flatten_aggregate_argument (type, fields); + int n = loongarch_flatten_aggregate_argument (type, fields, use_vecarg_p); for (int i = 0; i < n; i++) - if (!SCALAR_FLOAT_TYPE_P (fields[i].type)) + if (!SCALAR_FLOAT_TYPE_P (fields[i].type) + && !VECTOR_TYPE_P (fields[i].type)) return 0; return n > 0 ? 
n : 0; @@ -452,7 +460,7 @@ loongarch_pass_aggregate_in_fpr_and_gpr_p (const_tree type, loongarch_aggregate_field fields[2]) { unsigned num_int = 0, num_float = 0; - int n = loongarch_flatten_aggregate_argument (type, fields); + int n = loongarch_flatten_aggregate_argument (type, fields, 0); for (int i = 0; i < n; i++) { @@ -523,6 +531,9 @@ loongarch_get_arg_info (struct loongarch_arg_info *info, unsigned gpr_base = return_p ? GP_RETURN : GP_ARG_FIRST; unsigned alignment = loongarch_function_arg_boundary (mode, type); + int use_vecarg_p = TARGET_VECARG + && LSX_SUPPORTED_MODE_P (mode); + memset (info, 0, sizeof (*info)); info->gpr_offset = cum->num_gprs; info->fpr_offset = cum->num_fprs; @@ -535,7 +546,7 @@ loongarch_get_arg_info (struct loongarch_arg_info *info, /* Pass one- or two-element floating-point aggregates in FPRs. */ if ((info->num_fprs - = loongarch_pass_aggregate_num_fpr (type, fields)) + = loongarch_pass_aggregate_num_fpr (type, fields, use_vecarg_p)) && info->fpr_offset + info->num_fprs <= MAX_ARGS_IN_REGISTERS) switch (info->num_fprs) { @@ -773,7 +784,7 @@ loongarch_setup_incoming_varargs (cumulative_args_t cum, { rtx ptr = plus_constant (Pmode, virtual_incoming_args_rtx, REG_PARM_STACK_SPACE (cfun->decl) - - gp_saved * UNITS_PER_WORD); + - gp_saved * UNITS_PER_WORD); rtx mem = gen_frame_mem (BLKmode, ptr); set_mem_alias_set (mem, get_varargs_alias_set ()); @@ -1049,7 +1060,7 @@ rtx loongarch_emit_move (rtx dest, rtx src) { return (can_create_pseudo_p () ? emit_move_insn (dest, src) - : emit_move_insn_1 (dest, src)); + : emit_move_insn_1 (dest, src)); } /* Save register REG to MEM. Make the instruction frame-related. */ @@ -1675,6 +1686,140 @@ loongarch_symbol_binds_local_p (const_rtx x) return false; } +/* Return true if OP is a constant vector with the number of units in MODE, + and each unit has the same bit set. 
*/ + +bool +loongarch_const_vector_bitimm_set_p (rtx op, machine_mode mode) +{ + if (GET_CODE (op) == CONST_VECTOR && op != CONST0_RTX (mode)) + { + unsigned HOST_WIDE_INT val = UINTVAL (CONST_VECTOR_ELT (op, 0)); + int vlog2 = exact_log2 (val & GET_MODE_MASK (GET_MODE_INNER (mode))); + + if (vlog2 != -1) + { + gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT); + gcc_assert (vlog2 >= 0 && vlog2 <= GET_MODE_UNIT_BITSIZE (mode) - 1); + return loongarch_const_vector_same_val_p (op, mode); + } + } + + return false; +} + +/* Return true if OP is a constant vector with the number of units in MODE, + and each unit has the same bit clear. */ + +bool +loongarch_const_vector_bitimm_clr_p (rtx op, machine_mode mode) +{ + if (GET_CODE (op) == CONST_VECTOR && op != CONSTM1_RTX (mode)) + { + unsigned HOST_WIDE_INT val = ~UINTVAL (CONST_VECTOR_ELT (op, 0)); + int vlog2 = exact_log2 (val & GET_MODE_MASK (GET_MODE_INNER (mode))); + + if (vlog2 != -1) + { + gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT); + gcc_assert (vlog2 >= 0 && vlog2 <= GET_MODE_UNIT_BITSIZE (mode) - 1); + return loongarch_const_vector_same_val_p (op, mode); + } + } + + return false; +} + +/* Return true if OP is a constant vector with the number of units in MODE, + and each unit has the same value. */ + +bool +loongarch_const_vector_same_val_p (rtx op, machine_mode mode) +{ + int i, nunits = GET_MODE_NUNITS (mode); + rtx first; + + if (GET_CODE (op) != CONST_VECTOR || GET_MODE (op) != mode) + return false; + + first = CONST_VECTOR_ELT (op, 0); + for (i = 1; i < nunits; i++) + if (!rtx_equal_p (first, CONST_VECTOR_ELT (op, i))) + return false; + + return true; +} + +/* Return true if OP is a constant vector with the number of units in MODE, + and each unit has the same value as well as replicated bytes in the value. 
+*/ + +bool +loongarch_const_vector_same_bytes_p (rtx op, machine_mode mode) +{ + int i, bytes; + HOST_WIDE_INT val, first_byte; + rtx first; + + if (!loongarch_const_vector_same_val_p (op, mode)) + return false; + + first = CONST_VECTOR_ELT (op, 0); + bytes = GET_MODE_UNIT_SIZE (mode); + val = INTVAL (first); + first_byte = val & 0xff; + for (i = 1; i < bytes; i++) + { + val >>= 8; + if ((val & 0xff) != first_byte) + return false; + } + + return true; +} + +/* Return true if OP is a constant vector with the number of units in MODE, + and each unit has the same integer value in the range [LOW, HIGH]. */ + +bool +loongarch_const_vector_same_int_p (rtx op, machine_mode mode, HOST_WIDE_INT low, + HOST_WIDE_INT high) +{ + HOST_WIDE_INT value; + rtx elem0; + + if (!loongarch_const_vector_same_val_p (op, mode)) + return false; + + elem0 = CONST_VECTOR_ELT (op, 0); + if (!CONST_INT_P (elem0)) + return false; + + value = INTVAL (elem0); + return (value >= low && value <= high); +} + +/* Return true if OP is a constant vector with repeated 4-element sets + in mode MODE. */ + +bool +loongarch_const_vector_shuffle_set_p (rtx op, machine_mode mode) +{ + int nunits = GET_MODE_NUNITS (mode); + int nsets = nunits / 4; + int set = 0; + int i, j; + + /* Check if we have the same 4-element sets. */ + for (j = 0; j < nsets; j++, set = 4 * j) + for (i = 0; i < 4; i++) + if ((INTVAL (XVECEXP (op, 0, i)) + != (INTVAL (XVECEXP (op, 0, set + i)) - set)) + || !IN_RANGE (INTVAL (XVECEXP (op, 0, set + i)), 0, set + 3)) + return false; + return true; +} + /* Return true if rtx constants of mode MODE should be put into a small data section. */ @@ -1792,6 +1937,11 @@ loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_type *symbol_type) static int loongarch_symbol_insns (enum loongarch_symbol_type type, machine_mode mode) { + /* LSX LD.* and ST.* cannot support loading symbols via an immediate + operand. 
*/ + if (LSX_SUPPORTED_MODE_P (mode)) + return 0; + switch (type) { case SYMBOL_GOT_DISP: @@ -1838,7 +1988,8 @@ loongarch_cannot_force_const_mem (machine_mode mode, rtx x) references, reload will consider forcing C into memory and using one of the instruction's memory alternatives. Returning false here will force it to use an input reload instead. */ - if (CONST_INT_P (x) && loongarch_legitimate_constant_p (mode, x)) + if ((CONST_INT_P (x) || GET_CODE (x) == CONST_VECTOR) + && loongarch_legitimate_constant_p (mode, x)) return true; split_const (x, &base, &offset); @@ -1915,6 +2066,12 @@ loongarch_valid_offset_p (rtx x, machine_mode mode) && !IMM12_OPERAND (INTVAL (x) + GET_MODE_SIZE (mode) - UNITS_PER_WORD)) return false; + /* LSX LD.* and ST.* supports 10-bit signed offsets. */ + if (LSX_SUPPORTED_MODE_P (mode) + && !loongarch_signed_immediate_p (INTVAL (x), 10, + loongarch_ldst_scaled_shift (mode))) + return false; + return true; } @@ -1999,9 +2156,10 @@ loongarch_valid_lo_sum_p (enum loongarch_symbol_type symbol_type, static bool loongarch_valid_index_p (struct loongarch_address_info *info, rtx x, - machine_mode mode, bool strict_p) + machine_mode mode, bool strict_p) { rtx index; + bool vector_p = LSX_SUPPORTED_MODE_P (mode); if ((REG_P (x) || SUBREG_P (x)) && GET_MODE (x) == Pmode) @@ -2016,7 +2174,7 @@ loongarch_valid_index_p (struct loongarch_address_info *info, rtx x, && contains_reg_of_mode[GENERAL_REGS][GET_MODE (SUBREG_REG (index))]) index = SUBREG_REG (index); - if (loongarch_valid_base_register_p (index, mode, strict_p)) + if (loongarch_valid_base_register_p (index, mode, strict_p) && !vector_p) { info->type = ADDRESS_REG_REG; info->offset = index; @@ -2052,7 +2210,7 @@ loongarch_classify_address (struct loongarch_address_info *info, rtx x, } if (loongarch_valid_base_register_p (XEXP (x, 1), mode, strict_p) - && loongarch_valid_index_p (info, XEXP (x, 0), mode, strict_p)) + && loongarch_valid_index_p (info, XEXP (x, 0), mode, strict_p)) { info->reg = 
XEXP (x, 1); return true; @@ -2127,6 +2285,7 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p) { struct loongarch_address_info addr; int factor; + bool lsx_p = !might_split_p && LSX_SUPPORTED_MODE_P (mode); if (!loongarch_classify_address (&addr, x, mode, false)) return 0; @@ -2144,15 +2303,27 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p) switch (addr.type) { case ADDRESS_REG: + if (lsx_p) + { + /* LSX LD.* and ST.* supports 10-bit signed offsets. */ + if (loongarch_signed_immediate_p (INTVAL (addr.offset), 10, + loongarch_ldst_scaled_shift (mode))) + return 1; + else + return 0; + } + return factor; + case ADDRESS_REG_REG: case ADDRESS_CONST_INT: - return factor; + return lsx_p ? 0 : factor; case ADDRESS_LO_SUM: return factor + 1; case ADDRESS_SYMBOLIC: - return factor * loongarch_symbol_insns (addr.symbol_type, mode); + return lsx_p ? 0 + : factor * loongarch_symbol_insns (addr.symbol_type, mode); } return 0; } @@ -2178,6 +2349,19 @@ loongarch_signed_immediate_p (unsigned HOST_WIDE_INT x, int bits, return loongarch_unsigned_immediate_p (x, bits, shift); } +/* Return the scale shift that applied to LSX LD/ST address offset. */ + +int +loongarch_ldst_scaled_shift (machine_mode mode) +{ + int shift = exact_log2 (GET_MODE_UNIT_SIZE (mode)); + + if (shift < 0 || shift > 8) + gcc_unreachable (); + + return shift; +} + /* Return true if X is a legitimate address with a 12-bit offset or addr.type is ADDRESS_LO_SUM. MODE is the mode of the value being accessed. */ @@ -2245,6 +2429,9 @@ loongarch_const_insns (rtx x) return loongarch_integer_cost (INTVAL (x)); case CONST_VECTOR: + if (LSX_SUPPORTED_MODE_P (GET_MODE (x)) + && loongarch_const_vector_same_int_p (x, GET_MODE (x), -512, 511)) + return 1; /* Fall through. */ case CONST_DOUBLE: return x == CONST0_RTX (GET_MODE (x)) ? 
1 : 0; @@ -2279,7 +2466,7 @@ loongarch_const_insns (rtx x) case SYMBOL_REF: case LABEL_REF: return loongarch_symbol_insns ( - loongarch_classify_symbol (x), MAX_MACHINE_MODE); + loongarch_classify_symbol (x), MAX_MACHINE_MODE); default: return 0; @@ -2301,7 +2488,26 @@ loongarch_split_const_insns (rtx x) return low + high; } -static bool loongarch_split_move_insn_p (rtx dest, rtx src); +bool loongarch_split_move_insn_p (rtx dest, rtx src); +/* Return one word of 128-bit value OP, taking into account the fixed + endianness of certain registers. BYTE selects from the byte address. */ + +rtx +loongarch_subword_at_byte (rtx op, unsigned int byte) +{ + machine_mode mode; + + mode = GET_MODE (op); + if (mode == VOIDmode) + mode = TImode; + + gcc_assert (!FP_REG_RTX_P (op)); + + if (MEM_P (op)) + return loongarch_rewrite_small_data (adjust_address (op, word_mode, byte)); + + return simplify_gen_subreg (word_mode, op, mode, byte); +} /* Return the number of instructions needed to implement INSN, given that it loads from or stores to MEM. */ @@ -3062,9 +3268,10 @@ loongarch_legitimize_move (machine_mode mode, rtx dest, rtx src) /* Both src and dest are non-registers; one special case is supported where the source is (const_int 0) and the store can source the zero register. - */ + LSX is never able to source the zero register directly in + memory operations. */ if (!register_operand (dest, mode) && !register_operand (src, mode) - && !const_0_operand (src, mode)) + && (!const_0_operand (src, mode) || LSX_SUPPORTED_MODE_P (mode))) { loongarch_emit_move (dest, force_reg (mode, src)); return true; @@ -3636,6 +3843,54 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code, } } +/* Vectorizer cost model implementation. */ + +/* Implement targetm.vectorize.builtin_vectorization_cost. 
*/ + +static int +loongarch_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost, + tree vectype, + int misalign ATTRIBUTE_UNUSED) +{ + unsigned elements; + + switch (type_of_cost) + { + case scalar_stmt: + case scalar_load: + case vector_stmt: + case vector_load: + case vec_to_scalar: + case scalar_to_vec: + case cond_branch_not_taken: + case vec_promote_demote: + case scalar_store: + case vector_store: + return 1; + + case vec_perm: + return 1; + + case unaligned_load: + case vector_gather_load: + return 2; + + case unaligned_store: + case vector_scatter_store: + return 10; + + case cond_branch_taken: + return 3; + + case vec_construct: + elements = TYPE_VECTOR_SUBPARTS (vectype); + return elements / 2 + 1; + + default: + gcc_unreachable (); + } +} + /* Implement TARGET_ADDRESS_COST. */ static int @@ -3690,6 +3945,11 @@ loongarch_split_move_p (rtx dest, rtx src) if (FP_REG_RTX_P (src) && MEM_P (dest)) return false; } + + /* Check if LSX moves need splitting. */ + if (LSX_SUPPORTED_MODE_P (GET_MODE (dest))) + return loongarch_split_128bit_move_p (dest, src); + /* Otherwise split all multiword moves. */ return size > UNITS_PER_WORD; } @@ -3703,7 +3963,9 @@ loongarch_split_move (rtx dest, rtx src, rtx insn_) rtx low_dest; gcc_checking_assert (loongarch_split_move_p (dest, src)); - if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src)) + if (LSX_SUPPORTED_MODE_P (GET_MODE (dest))) + loongarch_split_128bit_move (dest, src); + else if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src)) { if (!TARGET_64BIT && GET_MODE (dest) == DImode) emit_insn (gen_move_doubleword_fprdi (dest, src)); @@ -3807,12 +4069,21 @@ loongarch_split_plus_constant (rtx *op, machine_mode mode) /* Return true if a move from SRC to DEST in INSN should be split. */ -static bool +bool loongarch_split_move_insn_p (rtx dest, rtx src) { return loongarch_split_move_p (dest, src); } +/* Split a move from SRC to DEST in INSN, given that + loongarch_split_move_insn_p holds. 
*/ + +void +loongarch_split_move_insn (rtx dest, rtx src, rtx insn) +{ + loongarch_split_move (dest, src, insn); +} + /* Implement TARGET_CONSTANT_ALIGNMENT. */ static HOST_WIDE_INT @@ -3826,6 +4097,9 @@ loongarch_constant_alignment (const_tree exp, HOST_WIDE_INT align) const char * loongarch_output_move_index (rtx x, machine_mode mode, bool ldr) { + if (LSX_SUPPORTED_MODE_P (mode)) + return NULL; + int index = exact_log2 (GET_MODE_SIZE (mode)); if (!IN_RANGE (index, 0, 3)) return NULL; @@ -3858,6 +4132,9 @@ loongarch_output_move_index (rtx x, machine_mode mode, bool ldr) const char * loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr) { + if (LSX_SUPPORTED_MODE_P (mode)) + return NULL; + int index = exact_log2 (GET_MODE_SIZE (mode)); if (!IN_RANGE (index, 2, 3)) return NULL; @@ -3877,11 +4154,205 @@ loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr) { "fldx.s\t%0,%1", "fldx.d\t%0,%1" - }, + } }; return insn[ldr][index-2]; } +/* Return true if a 128-bit move from SRC to DEST should be split. */ + +bool +loongarch_split_128bit_move_p (rtx dest, rtx src) +{ + /* LSX-to-LSX moves can be done in a single instruction. */ + if (FP_REG_RTX_P (src) && FP_REG_RTX_P (dest)) + return false; + + /* Check for LSX loads and stores. */ + if (FP_REG_RTX_P (dest) && MEM_P (src)) + return false; + if (FP_REG_RTX_P (src) && MEM_P (dest)) + return false; + + /* Check for LSX set to an immediate const vector with valid replicated + element. */ + if (FP_REG_RTX_P (dest) + && loongarch_const_vector_same_int_p (src, GET_MODE (src), -512, 511)) + return false; + + /* Check for LSX load zero immediate. */ + if (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src))) + return false; + + return true; +} + +/* Split a 128-bit move from SRC to DEST. 
*/ + +void +loongarch_split_128bit_move (rtx dest, rtx src) +{ + int byte, index; + rtx low_dest, low_src, d, s; + + if (FP_REG_RTX_P (dest)) + { + gcc_assert (!MEM_P (src)); + + rtx new_dest = dest; + if (!TARGET_64BIT) + { + if (GET_MODE (dest) != V4SImode) + new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0); + } + else + { + if (GET_MODE (dest) != V2DImode) + new_dest = simplify_gen_subreg (V2DImode, dest, GET_MODE (dest), 0); + } + + for (byte = 0, index = 0; byte < GET_MODE_SIZE (TImode); + byte += UNITS_PER_WORD, index++) + { + s = loongarch_subword_at_byte (src, byte); + if (!TARGET_64BIT) + emit_insn (gen_lsx_vinsgr2vr_w (new_dest, s, new_dest, + GEN_INT (1 << index))); + else + emit_insn (gen_lsx_vinsgr2vr_d (new_dest, s, new_dest, + GEN_INT (1 << index))); + } + } + else if (FP_REG_RTX_P (src)) + { + gcc_assert (!MEM_P (dest)); + + rtx new_src = src; + if (!TARGET_64BIT) + { + if (GET_MODE (src) != V4SImode) + new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0); + } + else + { + if (GET_MODE (src) != V2DImode) + new_src = simplify_gen_subreg (V2DImode, src, GET_MODE (src), 0); + } + + for (byte = 0, index = 0; byte < GET_MODE_SIZE (TImode); + byte += UNITS_PER_WORD, index++) + { + d = loongarch_subword_at_byte (dest, byte); + if (!TARGET_64BIT) + emit_insn (gen_lsx_vpickve2gr_w (d, new_src, GEN_INT (index))); + else + emit_insn (gen_lsx_vpickve2gr_d (d, new_src, GEN_INT (index))); + } + } + else + { + low_dest = loongarch_subword_at_byte (dest, 0); + low_src = loongarch_subword_at_byte (src, 0); + gcc_assert (REG_P (low_dest) && REG_P (low_src)); + /* Make sure the source register is not written before reading. 
*/ + if (REGNO (low_dest) <= REGNO (low_src)) + { + for (byte = 0; byte < GET_MODE_SIZE (TImode); + byte += UNITS_PER_WORD) + { + d = loongarch_subword_at_byte (dest, byte); + s = loongarch_subword_at_byte (src, byte); + loongarch_emit_move (d, s); + } + } + else + { + for (byte = GET_MODE_SIZE (TImode) - UNITS_PER_WORD; byte >= 0; + byte -= UNITS_PER_WORD) + { + d = loongarch_subword_at_byte (dest, byte); + s = loongarch_subword_at_byte (src, byte); + loongarch_emit_move (d, s); + } + } + } +} + + +/* Split a COPY_S.D with operands DEST, SRC and INDEX. GEN is a function + used to generate subregs. */ + +void +loongarch_split_lsx_copy_d (rtx dest, rtx src, rtx index, + rtx (*gen_fn)(rtx, rtx, rtx)) +{ + gcc_assert ((GET_MODE (src) == V2DImode && GET_MODE (dest) == DImode) + || (GET_MODE (src) == V2DFmode && GET_MODE (dest) == DFmode)); + + /* Note that low is always from the lower index, and high is always + from the higher index. */ + rtx low = loongarch_subword (dest, false); + rtx high = loongarch_subword (dest, true); + rtx new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0); + + emit_insn (gen_fn (low, new_src, GEN_INT (INTVAL (index) * 2))); + emit_insn (gen_fn (high, new_src, GEN_INT (INTVAL (index) * 2 + 1))); +} + +/* Split a INSERT.D with operand DEST, SRC1.INDEX and SRC2. */ + +void +loongarch_split_lsx_insert_d (rtx dest, rtx src1, rtx index, rtx src2) +{ + int i; + gcc_assert (GET_MODE (dest) == GET_MODE (src1)); + gcc_assert ((GET_MODE (dest) == V2DImode + && (GET_MODE (src2) == DImode || src2 == const0_rtx)) + || (GET_MODE (dest) == V2DFmode && GET_MODE (src2) == DFmode)); + + /* Note that low is always from the lower index, and high is always + from the higher index. 
*/ + rtx low = loongarch_subword (src2, false); + rtx high = loongarch_subword (src2, true); + rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0); + rtx new_src1 = simplify_gen_subreg (V4SImode, src1, GET_MODE (src1), 0); + i = exact_log2 (INTVAL (index)); + gcc_assert (i != -1); + + emit_insn (gen_lsx_vinsgr2vr_w (new_dest, low, new_src1, + GEN_INT (1 << (i * 2)))); + emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest, + GEN_INT (1 << (i * 2 + 1)))); +} + +/* Split FILL.D. */ + +void +loongarch_split_lsx_fill_d (rtx dest, rtx src) +{ + gcc_assert ((GET_MODE (dest) == V2DImode + && (GET_MODE (src) == DImode || src == const0_rtx)) + || (GET_MODE (dest) == V2DFmode && GET_MODE (src) == DFmode)); + + /* Note that low is always from the lower index, and high is always + from the higher index. */ + rtx low, high; + if (src == const0_rtx) + { + low = src; + high = src; + } + else + { + low = loongarch_subword (src, false); + high = loongarch_subword (src, true); + } + rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0); + emit_insn (gen_lsx_vreplgr2vr_w (new_dest, low)); + emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest, GEN_INT (1 << 1))); + emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest, GEN_INT (1 << 3))); +} + /* Return the appropriate instructions to move SRC into DEST. Assume that SRC is operand 1 and DEST is operand 0. 
*/ @@ -3893,10 +4364,25 @@ loongarch_output_move (rtx dest, rtx src) enum rtx_code src_code = GET_CODE (src); machine_mode mode = GET_MODE (dest); bool dbl_p = (GET_MODE_SIZE (mode) == 8); + bool lsx_p = LSX_SUPPORTED_MODE_P (mode); if (loongarch_split_move_p (dest, src)) return "#"; + if ((lsx_p) + && dest_code == REG && FP_REG_P (REGNO (dest)) + && src_code == CONST_VECTOR + && CONST_INT_P (CONST_VECTOR_ELT (src, 0))) + { + gcc_assert (loongarch_const_vector_same_int_p (src, mode, -512, 511)); + switch (GET_MODE_SIZE (mode)) + { + case 16: + return "vrepli.%v0\t%w0,%E1"; + default: gcc_unreachable (); + } + } + if ((src_code == REG && GP_REG_P (REGNO (src))) || (src == CONST0_RTX (mode))) { @@ -3906,7 +4392,21 @@ loongarch_output_move (rtx dest, rtx src) return "or\t%0,%z1,$r0"; if (FP_REG_P (REGNO (dest))) - return dbl_p ? "movgr2fr.d\t%0,%z1" : "movgr2fr.w\t%0,%z1"; + { + if (lsx_p) + { + gcc_assert (src == CONST0_RTX (GET_MODE (src))); + switch (GET_MODE_SIZE (mode)) + { + case 16: + return "vrepli.b\t%w0,0"; + default: + gcc_unreachable (); + } + } + + return dbl_p ? "movgr2fr.d\t%0,%z1" : "movgr2fr.w\t%0,%z1"; + } } if (dest_code == MEM) { @@ -3948,7 +4448,10 @@ loongarch_output_move (rtx dest, rtx src) { if (src_code == REG) if (FP_REG_P (REGNO (src))) - return dbl_p ? "movfr2gr.d\t%0,%1" : "movfr2gr.s\t%0,%1"; + { + gcc_assert (!lsx_p); + return dbl_p ? "movfr2gr.d\t%0,%1" : "movfr2gr.s\t%0,%1"; + } if (src_code == MEM) { @@ -3993,7 +4496,7 @@ loongarch_output_move (rtx dest, rtx src) enum loongarch_symbol_type type = SYMBOL_PCREL; if (UNSPEC_ADDRESS_P (x)) - type = UNSPEC_ADDRESS_TYPE (x); + type = UNSPEC_ADDRESS_TYPE (x); if (type == SYMBOL_TLS_LE) return "lu12i.w\t%0,%h1"; @@ -4028,7 +4531,20 @@ loongarch_output_move (rtx dest, rtx src) if (src_code == REG && FP_REG_P (REGNO (src))) { if (dest_code == REG && FP_REG_P (REGNO (dest))) - return dbl_p ? 
"fmov.d\t%0,%1" : "fmov.s\t%0,%1"; + { + if (lsx_p) + { + switch (GET_MODE_SIZE (mode)) + { + case 16: + return "vori.b\t%w0,%w1,0"; + default: + gcc_unreachable (); + } + } + + return dbl_p ? "fmov.d\t%0,%1" : "fmov.s\t%0,%1"; + } if (dest_code == MEM) { @@ -4039,6 +4555,17 @@ loongarch_output_move (rtx dest, rtx src) if (insn) return insn; + if (lsx_p) + { + switch (GET_MODE_SIZE (mode)) + { + case 16: + return "vst\t%w1,%0"; + default: + gcc_unreachable (); + } + } + return dbl_p ? "fst.d\t%1,%0" : "fst.s\t%1,%0"; } } @@ -4054,6 +4581,16 @@ loongarch_output_move (rtx dest, rtx src) if (insn) return insn; + if (lsx_p) + { + switch (GET_MODE_SIZE (mode)) + { + case 16: + return "vld\t%w0,%1"; + default: + gcc_unreachable (); + } + } return dbl_p ? "fld.d\t%0,%1" : "fld.s\t%0,%1"; } } @@ -4243,6 +4780,7 @@ loongarch_extend_comparands (rtx_code code, rtx *op0, rtx *op1) } } + /* Convert a comparison into something that can be used in a branch. On entry, *OP0 and *OP1 are the values being compared and *CODE is the code used to compare them. Update them to describe the final comparison. */ @@ -5002,9 +5540,12 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part, 'A' Print a _DB suffix if the memory model requires a release. 'b' Print the address of a memory operand, without offset. + 'B' Print CONST_INT OP element 0 of a replicated CONST_VECTOR + as an unsigned byte [0..255]. 'c' Print an integer. 'C' Print the integer branch condition for comparison OP. 'd' Print CONST_INT OP in decimal. + 'E' Print CONST_INT OP element 0 of a replicated CONST_VECTOR in decimal. 'F' Print the FPU branch condition for comparison OP. 'G' Print a DBAR insn if the memory model requires a release. 'H' Print address 52-61bit relocation associated with OP. 
@@ -5020,13 +5561,16 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part, 't' Like 'T', but with the EQ/NE cases reversed 'V' Print exact log2 of CONST_INT OP element 0 of a replicated CONST_VECTOR in decimal. + 'v' Print the insn size suffix b, h, w or d for vector modes V16QI, V8HI, + V4SI, V2SI, and w, d for vector modes V4SF, V2DF respectively. 'W' Print the inverse of the FPU branch condition for comparison OP. + 'w' Print a LSX register. 'X' Print CONST_INT OP in hexadecimal format. 'x' Print the low 16 bits of CONST_INT OP in hexadecimal format. 'Y' Print loongarch_fp_conditions[INTVAL (OP)] 'y' Print exact log2 of CONST_INT OP in decimal. 'Z' Print OP and a comma for 8CC, otherwise print nothing. - 'z' Print $0 if OP is zero, otherwise print OP normally. */ + 'z' Print $r0 if OP is zero, otherwise print OP normally. */ static void loongarch_print_operand (FILE *file, rtx op, int letter) @@ -5048,6 +5592,18 @@ loongarch_print_operand (FILE *file, rtx op, int letter) if (loongarch_memmodel_needs_rel_acq_fence ((enum memmodel) INTVAL (op))) fputs ("_db", file); break; + case 'E': + if (GET_CODE (op) == CONST_VECTOR) + { + gcc_assert (loongarch_const_vector_same_val_p (op, GET_MODE (op))); + op = CONST_VECTOR_ELT (op, 0); + gcc_assert (CONST_INT_P (op)); + fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op)); + } + else + output_operand_lossage ("invalid use of '%%%c'", letter); + break; + case 'c': if (CONST_INT_P (op)) @@ -5098,6 +5654,18 @@ loongarch_print_operand (FILE *file, rtx op, int letter) loongarch_print_operand_reloc (file, op, false /* hi64_part*/, false /* lo_reloc */); break; + case 'B': + if (GET_CODE (op) == CONST_VECTOR) + { + gcc_assert (loongarch_const_vector_same_val_p (op, GET_MODE (op))); + op = CONST_VECTOR_ELT (op, 0); + gcc_assert (CONST_INT_P (op)); + unsigned HOST_WIDE_INT val8 = UINTVAL (op) & GET_MODE_MASK (QImode); + fprintf (file, HOST_WIDE_INT_PRINT_UNSIGNED, val8); + } + else + output_operand_lossage ("invalid 
use of '%%%c'", letter); + break; case 'm': if (CONST_INT_P (op)) @@ -5144,11 +5712,46 @@ loongarch_print_operand (FILE *file, rtx op, int letter) output_operand_lossage ("invalid use of '%%%c'", letter); break; - case 'W': + case 'v': + switch (GET_MODE (op)) + { + case E_V16QImode: + case E_V32QImode: + fprintf (file, "b"); + break; + case E_V8HImode: + case E_V16HImode: + fprintf (file, "h"); + break; + case E_V4SImode: + case E_V4SFmode: + case E_V8SImode: + case E_V8SFmode: + fprintf (file, "w"); + break; + case E_V2DImode: + case E_V2DFmode: + case E_V4DImode: + case E_V4DFmode: + fprintf (file, "d"); + break; + default: + output_operand_lossage ("invalid use of '%%%c'", letter); + } + break; + + case 'W': loongarch_print_float_branch_condition (file, reverse_condition (code), letter); break; + case 'w': + if (code == REG && LSX_REG_P (REGNO (op))) + fprintf (file, "$vr%s", &reg_names[REGNO (op)][2]); + else + output_operand_lossage ("invalid use of '%%%c'", letter); + break; + case 'x': if (CONST_INT_P (op)) fprintf (file, HOST_WIDE_INT_PRINT_HEX, INTVAL (op) & 0xffff); @@ -5520,9 +6123,13 @@ loongarch_hard_regno_mode_ok_uncached (unsigned int regno, machine_mode mode) size = GET_MODE_SIZE (mode); mclass = GET_MODE_CLASS (mode); - if (GP_REG_P (regno)) + if (GP_REG_P (regno) && !LSX_SUPPORTED_MODE_P (mode)) return ((regno - GP_REG_FIRST) & 1) == 0 || size <= UNITS_PER_WORD; + /* For LSX, allow TImode and 128-bit vector modes in all FPR. */ + if (FP_REG_P (regno) && LSX_SUPPORTED_MODE_P (mode)) + return true; + if (FP_REG_P (regno)) { if (mclass == MODE_FLOAT @@ -5549,6 +6156,17 @@ loongarch_hard_regno_mode_ok (unsigned int regno, machine_mode mode) return loongarch_hard_regno_mode_ok_p[mode][regno]; } + +static bool +loongarch_hard_regno_call_part_clobbered (unsigned int, + unsigned int regno, machine_mode mode) +{ + if (ISA_HAS_LSX && FP_REG_P (regno) && GET_MODE_SIZE (mode) > 8) + return true; + + return false; +} + /* Implement TARGET_HARD_REGNO_NREGS. 
*/ static unsigned int @@ -5560,7 +6178,12 @@ loongarch_hard_regno_nregs (unsigned int regno, machine_mode mode) return (GET_MODE_SIZE (mode) + 3) / 4; if (FP_REG_P (regno)) - return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG; + { + if (LSX_SUPPORTED_MODE_P (mode)) + return 1; + + return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG; + } /* All other registers are word-sized. */ return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD; @@ -5587,8 +6210,12 @@ loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode) if (hard_reg_set_intersect_p (left, reg_class_contents[(int) FP_REGS])) { if (loongarch_hard_regno_mode_ok (FP_REG_FIRST, mode)) - size = MIN (size, UNITS_PER_FPREG); - + { + if (LSX_SUPPORTED_MODE_P (mode)) + size = MIN (size, UNITS_PER_LSX_REG); + else + size = MIN (size, UNITS_PER_FPREG); + } left &= ~reg_class_contents[FP_REGS]; } if (!hard_reg_set_empty_p (left)) @@ -5599,9 +6226,13 @@ loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode) /* Implement TARGET_CAN_CHANGE_MODE_CLASS. */ static bool -loongarch_can_change_mode_class (machine_mode, machine_mode, +loongarch_can_change_mode_class (machine_mode from, machine_mode to, reg_class_t rclass) { + /* Allow conversions between different LSX vector modes. */ + if (LSX_SUPPORTED_MODE_P (from) && LSX_SUPPORTED_MODE_P (to)) + return true; + return !reg_classes_intersect_p (FP_REGS, rclass); } @@ -5621,7 +6252,7 @@ loongarch_mode_ok_for_mov_fmt_p (machine_mode mode) return TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT; default: - return 0; + return LSX_SUPPORTED_MODE_P (mode); } } @@ -5778,7 +6409,12 @@ loongarch_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x, if (regno < 0 || (MEM_P (x) && (GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8))) - /* In this case we can use fld.s, fst.s, fld.d or fst.d. */ + /* In this case we can use lwc1, swc1, ldc1 or sdc1. 
We'll use + pairs of lwc1s and swc1s if ldc1 and sdc1 are not supported. */ + return NO_REGS; + + if (MEM_P (x) && LSX_SUPPORTED_MODE_P (mode)) + /* In this case we can use LSX LD.* and ST.*. */ return NO_REGS; if (GP_REG_P (regno) || x == CONST0_RTX (mode)) @@ -5813,6 +6449,14 @@ loongarch_valid_pointer_mode (scalar_int_mode mode) return mode == SImode || (TARGET_64BIT && mode == DImode); } +/* Implement TARGET_VECTOR_MODE_SUPPORTED_P. */ + +static bool +loongarch_vector_mode_supported_p (machine_mode mode) +{ + return LSX_SUPPORTED_MODE_P (mode); +} + /* Implement TARGET_SCALAR_MODE_SUPPORTED_P. */ static bool @@ -5825,6 +6469,48 @@ loongarch_scalar_mode_supported_p (scalar_mode mode) return default_scalar_mode_supported_p (mode); } +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE. */ + +static machine_mode +loongarch_preferred_simd_mode (scalar_mode mode) +{ + if (!ISA_HAS_LSX) + return word_mode; + + switch (mode) + { + case E_QImode: + return E_V16QImode; + case E_HImode: + return E_V8HImode; + case E_SImode: + return E_V4SImode; + case E_DImode: + return E_V2DImode; + + case E_SFmode: + return E_V4SFmode; + + case E_DFmode: + return E_V2DFmode; + + default: + break; + } + return word_mode; +} + +static unsigned int +loongarch_autovectorize_vector_modes (vector_modes *modes, bool) +{ + if (ISA_HAS_LSX) + { + modes->safe_push (V16QImode); + } + + return 0; +} + /* Return the assembly code for INSN, which has the operands given by OPERANDS, and which branches to OPERANDS[0] if some condition is true. BRANCH_IF_TRUE is the asm template that should be used if OPERANDS[0] @@ -5989,6 +6675,29 @@ loongarch_output_division (const char *division, rtx *operands) return s; } +/* Return the assembly code for LSX DIV_{S,U}.DF or MOD_{S,U}.DF instructions, + which has the operands given by OPERANDS. Add in a divide-by-zero check + if needed. 
*/ + +const char * +loongarch_lsx_output_division (const char *division, rtx *operands) +{ + const char *s; + + s = division; + if (TARGET_CHECK_ZERO_DIV) + { + if (ISA_HAS_LSX) + { + output_asm_insn ("vsetallnez.%v0\t$fcc7,%w2",operands); + output_asm_insn (s, operands); + output_asm_insn ("bcnez\t$fcc7,1f", operands); + } + s = "break\t7\n1:"; + } + return s; +} + /* Implement TARGET_SCHED_ADJUST_COST. We assume that anti and output dependencies have no cost. */ @@ -6258,6 +6967,9 @@ loongarch_option_override_internal (struct gcc_options *opts) if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib) error ("%qs cannot be used for compiling a shared library", "-mdirect-extern-access"); + if (loongarch_vector_access_cost == 0) + loongarch_vector_access_cost = 5; + switch (la_target.cmodel) { @@ -6476,64 +7188,60 @@ loongarch_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value) emit_insn (gen_clear_cache (addr, end_addr)); } -/* Implement HARD_REGNO_CALLER_SAVE_MODE. */ - -machine_mode -loongarch_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs, - machine_mode mode) -{ - /* For performance, avoid saving/restoring upper parts of a register - by returning MODE as save mode when the mode is known. */ - if (mode == VOIDmode) - return choose_hard_reg_mode (regno, nregs, NULL); - else - return mode; -} +/* Generate or test for an insn that supports a constant permutation. */ -/* Implement TARGET_SPILL_CLASS. */ +#define MAX_VECT_LEN 32 -static reg_class_t -loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED, - machine_mode mode ATTRIBUTE_UNUSED) +struct expand_vec_perm_d { - return NO_REGS; -} - -/* Implement TARGET_PROMOTE_FUNCTION_MODE. */ + rtx target, op0, op1; + unsigned char perm[MAX_VECT_LEN]; + machine_mode vmode; + unsigned char nelt; + bool one_vector_p; + bool testing_p; +}; -/* This function is equivalent to default_promote_function_mode_always_promote - except that it returns a promoted mode even if type is NULL_TREE. 
This is - needed by libcalls which have no type (only a mode) such as fixed conversion - routines that take a signed or unsigned char/short argument and convert it - to a fixed type. */ +/* Construct (set target (vec_select op0 (parallel perm))) and + return true if that's a valid instruction in the active ISA. */ -static machine_mode -loongarch_promote_function_mode (const_tree type ATTRIBUTE_UNUSED, - machine_mode mode, - int *punsignedp ATTRIBUTE_UNUSED, - const_tree fntype ATTRIBUTE_UNUSED, - int for_return ATTRIBUTE_UNUSED) +static bool +loongarch_expand_vselect (rtx target, rtx op0, + const unsigned char *perm, unsigned nelt) { - int unsignedp; + rtx rperm[MAX_VECT_LEN], x; + rtx_insn *insn; + unsigned i; - if (type != NULL_TREE) - return promote_mode (type, mode, punsignedp); + for (i = 0; i < nelt; ++i) + rperm[i] = GEN_INT (perm[i]); - unsignedp = *punsignedp; - PROMOTE_MODE (mode, unsignedp, type); - *punsignedp = unsignedp; - return mode; + x = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (nelt, rperm)); + x = gen_rtx_VEC_SELECT (GET_MODE (target), op0, x); + x = gen_rtx_SET (target, x); + + insn = emit_insn (x); + if (recog_memoized (insn) < 0) + { + remove_insn (insn); + return false; + } + return true; } -/* Implement TARGET_STARTING_FRAME_OFFSET. See loongarch_compute_frame_info - for details about the frame layout. */ +/* Similar, but generate a vec_concat from op0 and op1 as well. 
*/ -static HOST_WIDE_INT -loongarch_starting_frame_offset (void) +static bool +loongarch_expand_vselect_vconcat (rtx target, rtx op0, rtx op1, + const unsigned char *perm, unsigned nelt) { - if (FRAME_GROWS_DOWNWARD) - return 0; - return crtl->outgoing_args_size; + machine_mode v2mode; + rtx x; + + if (!GET_MODE_2XWIDER_MODE (GET_MODE (op0)).exists (&v2mode)) + return false; + x = gen_rtx_VEC_CONCAT (v2mode, op0, op1); + return loongarch_expand_vselect (target, x, perm, nelt); } static tree @@ -6796,109 +7504,1274 @@ loongarch_set_handled_components (sbitmap components) #define TARGET_ASM_ALIGNED_SI_OP "\t.word\t" #undef TARGET_ASM_ALIGNED_DI_OP #define TARGET_ASM_ALIGNED_DI_OP "\t.dword\t" +/* Construct (set target (vec_select op0 (parallel selector))) and + return true if that's a valid instruction in the active ISA. */ -#undef TARGET_OPTION_OVERRIDE -#define TARGET_OPTION_OVERRIDE loongarch_option_override - -#undef TARGET_LEGITIMIZE_ADDRESS -#define TARGET_LEGITIMIZE_ADDRESS loongarch_legitimize_address +static bool +loongarch_expand_lsx_shuffle (struct expand_vec_perm_d *d) +{ + rtx x, elts[MAX_VECT_LEN]; + rtvec v; + rtx_insn *insn; + unsigned i; -#undef TARGET_ASM_SELECT_RTX_SECTION -#define TARGET_ASM_SELECT_RTX_SECTION loongarch_select_rtx_section -#undef TARGET_ASM_FUNCTION_RODATA_SECTION -#define TARGET_ASM_FUNCTION_RODATA_SECTION loongarch_function_rodata_section + if (!ISA_HAS_LSX) + return false; -#undef TARGET_SCHED_INIT -#define TARGET_SCHED_INIT loongarch_sched_init -#undef TARGET_SCHED_REORDER -#define TARGET_SCHED_REORDER loongarch_sched_reorder -#undef TARGET_SCHED_REORDER2 -#define TARGET_SCHED_REORDER2 loongarch_sched_reorder2 -#undef TARGET_SCHED_VARIABLE_ISSUE -#define TARGET_SCHED_VARIABLE_ISSUE loongarch_variable_issue -#undef TARGET_SCHED_ADJUST_COST -#define TARGET_SCHED_ADJUST_COST loongarch_adjust_cost -#undef TARGET_SCHED_ISSUE_RATE -#define TARGET_SCHED_ISSUE_RATE loongarch_issue_rate -#undef 
TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD -#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \ - loongarch_multipass_dfa_lookahead + for (i = 0; i < d->nelt; i++) + elts[i] = GEN_INT (d->perm[i]); -#undef TARGET_FUNCTION_OK_FOR_SIBCALL -#define TARGET_FUNCTION_OK_FOR_SIBCALL loongarch_function_ok_for_sibcall + v = gen_rtvec_v (d->nelt, elts); + x = gen_rtx_PARALLEL (VOIDmode, v); -#undef TARGET_VALID_POINTER_MODE -#define TARGET_VALID_POINTER_MODE loongarch_valid_pointer_mode -#undef TARGET_REGISTER_MOVE_COST -#define TARGET_REGISTER_MOVE_COST loongarch_register_move_cost -#undef TARGET_MEMORY_MOVE_COST -#define TARGET_MEMORY_MOVE_COST loongarch_memory_move_cost -#undef TARGET_RTX_COSTS -#define TARGET_RTX_COSTS loongarch_rtx_costs -#undef TARGET_ADDRESS_COST -#define TARGET_ADDRESS_COST loongarch_address_cost + if (!loongarch_const_vector_shuffle_set_p (x, d->vmode)) + return false; -#undef TARGET_IN_SMALL_DATA_P -#define TARGET_IN_SMALL_DATA_P loongarch_in_small_data_p + x = gen_rtx_VEC_SELECT (d->vmode, d->op0, x); + x = gen_rtx_SET (d->target, x); -#undef TARGET_PREFERRED_RELOAD_CLASS -#define TARGET_PREFERRED_RELOAD_CLASS loongarch_preferred_reload_class + insn = emit_insn (x); + if (recog_memoized (insn) < 0) + { + remove_insn (insn); + return false; + } + return true; +} -#undef TARGET_ASM_FILE_START_FILE_DIRECTIVE -#define TARGET_ASM_FILE_START_FILE_DIRECTIVE true +void +loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel) +{ + machine_mode vmode = GET_MODE (target); -#undef TARGET_EXPAND_BUILTIN_VA_START -#define TARGET_EXPAND_BUILTIN_VA_START loongarch_va_start + gcc_checking_assert (vmode == E_V16QImode + || vmode == E_V2DImode || vmode == E_V2DFmode + || vmode == E_V4SImode || vmode == E_V4SFmode + || vmode == E_V8HImode); + gcc_checking_assert (GET_MODE (op0) == vmode); + gcc_checking_assert (GET_MODE (op1) == vmode); + gcc_checking_assert (GET_MODE (sel) == vmode); + gcc_checking_assert (ISA_HAS_LSX); -#undef 
TARGET_PROMOTE_FUNCTION_MODE -#define TARGET_PROMOTE_FUNCTION_MODE loongarch_promote_function_mode -#undef TARGET_RETURN_IN_MEMORY -#define TARGET_RETURN_IN_MEMORY loongarch_return_in_memory + switch (vmode) + { + case E_V16QImode: + emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel)); + break; + case E_V2DFmode: + emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0)); + break; + case E_V2DImode: + emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0)); + break; + case E_V4SFmode: + emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0)); + break; + case E_V4SImode: + emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0)); + break; + case E_V8HImode: + emit_insn (gen_lsx_vshuf_h (target, sel, op1, op0)); + break; + default: + break; + } +} -#undef TARGET_FUNCTION_VALUE -#define TARGET_FUNCTION_VALUE loongarch_function_value -#undef TARGET_LIBCALL_VALUE -#define TARGET_LIBCALL_VALUE loongarch_libcall_value +static bool +loongarch_try_expand_lsx_vshuf_const (struct expand_vec_perm_d *d) +{ + int i; + rtx target, op0, op1, sel, tmp; + rtx rperm[MAX_VECT_LEN]; -#undef TARGET_ASM_OUTPUT_MI_THUNK -#define TARGET_ASM_OUTPUT_MI_THUNK loongarch_output_mi_thunk -#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK -#define TARGET_ASM_CAN_OUTPUT_MI_THUNK \ - hook_bool_const_tree_hwi_hwi_const_tree_true + if (d->vmode == E_V2DImode || d->vmode == E_V2DFmode + || d->vmode == E_V4SImode || d->vmode == E_V4SFmode + || d->vmode == E_V8HImode || d->vmode == E_V16QImode) + { + target = d->target; + op0 = d->op0; + op1 = d->one_vector_p ? 
d->op0 : d->op1; -#undef TARGET_PRINT_OPERAND -#define TARGET_PRINT_OPERAND loongarch_print_operand -#undef TARGET_PRINT_OPERAND_ADDRESS -#define TARGET_PRINT_OPERAND_ADDRESS loongarch_print_operand_address -#undef TARGET_PRINT_OPERAND_PUNCT_VALID_P -#define TARGET_PRINT_OPERAND_PUNCT_VALID_P \ - loongarch_print_operand_punct_valid_p + if (GET_MODE (op0) != GET_MODE (op1) + || GET_MODE (op0) != GET_MODE (target)) + return false; -#undef TARGET_SETUP_INCOMING_VARARGS -#define TARGET_SETUP_INCOMING_VARARGS loongarch_setup_incoming_varargs -#undef TARGET_STRICT_ARGUMENT_NAMING -#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true -#undef TARGET_MUST_PASS_IN_STACK -#define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size -#undef TARGET_PASS_BY_REFERENCE -#define TARGET_PASS_BY_REFERENCE loongarch_pass_by_reference -#undef TARGET_ARG_PARTIAL_BYTES -#define TARGET_ARG_PARTIAL_BYTES loongarch_arg_partial_bytes -#undef TARGET_FUNCTION_ARG -#define TARGET_FUNCTION_ARG loongarch_function_arg -#undef TARGET_FUNCTION_ARG_ADVANCE -#define TARGET_FUNCTION_ARG_ADVANCE loongarch_function_arg_advance -#undef TARGET_FUNCTION_ARG_BOUNDARY -#define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary + if (d->testing_p) + return true; -#undef TARGET_SCALAR_MODE_SUPPORTED_P -#define TARGET_SCALAR_MODE_SUPPORTED_P loongarch_scalar_mode_supported_p + for (i = 0; i < d->nelt; i += 1) + { + rperm[i] = GEN_INT (d->perm[i]); + } -#undef TARGET_INIT_BUILTINS -#define TARGET_INIT_BUILTINS loongarch_init_builtins -#undef TARGET_BUILTIN_DECL -#define TARGET_BUILTIN_DECL loongarch_builtin_decl -#undef TARGET_EXPAND_BUILTIN + if (d->vmode == E_V2DFmode) + { + sel = gen_rtx_CONST_VECTOR (E_V2DImode, gen_rtvec_v (d->nelt, rperm)); + tmp = gen_rtx_SUBREG (E_V2DImode, d->target, 0); + emit_move_insn (tmp, sel); + } + else if (d->vmode == E_V4SFmode) + { + sel = gen_rtx_CONST_VECTOR (E_V4SImode, gen_rtvec_v (d->nelt, rperm)); + tmp = gen_rtx_SUBREG (E_V4SImode, d->target, 
0); + emit_move_insn (tmp, sel); + } + else + { + sel = gen_rtx_CONST_VECTOR (d->vmode, gen_rtvec_v (d->nelt, rperm)); + emit_move_insn (d->target, sel); + } + + switch (d->vmode) + { + case E_V2DFmode: + emit_insn (gen_lsx_vshuf_d_f (target, target, op1, op0)); + break; + case E_V2DImode: + emit_insn (gen_lsx_vshuf_d (target, target, op1, op0)); + break; + case E_V4SFmode: + emit_insn (gen_lsx_vshuf_w_f (target, target, op1, op0)); + break; + case E_V4SImode: + emit_insn (gen_lsx_vshuf_w (target, target, op1, op0)); + break; + case E_V8HImode: + emit_insn (gen_lsx_vshuf_h (target, target, op1, op0)); + break; + case E_V16QImode: + emit_insn (gen_lsx_vshuf_b (target, op1, op0, target)); + break; + default: + break; + } + + return true; + } + return false; +} + +static bool +loongarch_expand_vec_perm_const_1 (struct expand_vec_perm_d *d) +{ + unsigned int i, nelt = d->nelt; + unsigned char perm2[MAX_VECT_LEN]; + + if (d->one_vector_p) + { + /* Try interleave with alternating operands. */ + memcpy (perm2, d->perm, sizeof (perm2)); + for (i = 1; i < nelt; i += 2) + perm2[i] += nelt; + if (loongarch_expand_vselect_vconcat (d->target, d->op0, d->op1, perm2, + nelt)) + return true; + } + else + { + if (loongarch_expand_vselect_vconcat (d->target, d->op0, d->op1, + d->perm, nelt)) + return true; + + /* Try again with swapped operands. */ + for (i = 0; i < nelt; ++i) + perm2[i] = (d->perm[i] + nelt) & (2 * nelt - 1); + if (loongarch_expand_vselect_vconcat (d->target, d->op1, d->op0, perm2, + nelt)) + return true; + } + + if (loongarch_expand_lsx_shuffle (d)) + return true; + return false; +} + +/* Implementation of constant vector permutation. This function identifies + * recognized patterns of the permutation selector argument, and uses one or more + * instruction(s) to finish the permutation job correctly. For unsupported + * patterns, it will return false.
*/ + +static bool +loongarch_expand_vec_perm_const_2 (struct expand_vec_perm_d *d) +{ + /* Although we have the LSX vec_perm template, some 128-bit vector + permutation operations are still sent to vectorize_vec_perm_const. + In this case, we simply wrap them in a single vshuf.* instruction, + because the LSX vshuf.* instructions behave exactly as GCC + expects. */ + return loongarch_try_expand_lsx_vshuf_const (d); +} + +/* Implement TARGET_VECTORIZE_VEC_PERM_CONST. */ + +static bool +loongarch_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode, + rtx target, rtx op0, rtx op1, + const vec_perm_indices &sel) +{ + if (vmode != op_mode) + return false; + + struct expand_vec_perm_d d; + int i, nelt, which; + unsigned char orig_perm[MAX_VECT_LEN]; + bool ok; + + d.target = target; + if (op0) + { + rtx nop0 = force_reg (vmode, op0); + if (op0 == op1) + op1 = nop0; + op0 = nop0; + } + if (op1) + op1 = force_reg (vmode, op1); + d.op0 = op0; + d.op1 = op1; + + d.vmode = vmode; + gcc_assert (VECTOR_MODE_P (vmode)); + d.nelt = nelt = GET_MODE_NUNITS (vmode); + d.testing_p = !target; + + /* This is overly conservative, but ensures we don't get an + uninitialized warning on ORIG_PERM. */ + memset (orig_perm, 0, MAX_VECT_LEN); + for (i = which = 0; i < nelt; ++i) + { + int ei = sel[i] & (2 * nelt - 1); + which |= (ei < nelt ?
1 : 2); + orig_perm[i] = ei; + } + memcpy (d.perm, orig_perm, MAX_VECT_LEN); + + switch (which) + { + default: + gcc_unreachable (); + + case 3: + d.one_vector_p = false; + if (d.testing_p || !rtx_equal_p (d.op0, d.op1)) + break; + /* FALLTHRU */ + + case 2: + for (i = 0; i < nelt; ++i) + d.perm[i] &= nelt - 1; + d.op0 = d.op1; + d.one_vector_p = true; + break; + + case 1: + d.op1 = d.op0; + d.one_vector_p = true; + break; + } + + if (d.testing_p) + { + d.target = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 1); + d.op1 = d.op0 = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 2); + if (!d.one_vector_p) + d.op1 = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 3); + + ok = loongarch_expand_vec_perm_const_2 (&d); + if (ok) + return ok; + + start_sequence (); + ok = loongarch_expand_vec_perm_const_1 (&d); + end_sequence (); + return ok; + } + + ok = loongarch_expand_vec_perm_const_2 (&d); + if (!ok) + ok = loongarch_expand_vec_perm_const_1 (&d); + + /* If we were given a two-vector permutation which just happened to + have both input vectors equal, we folded this into a one-vector + permutation. There are several loongson patterns that are matched + via direct vec_select+vec_concat expansion, but we do not have + support in loongarch_expand_vec_perm_const_1 to guess the adjustment + that should be made for a single operand. Just try again with + the original permutation. */ + if (!ok && which == 3) + { + d.op0 = op0; + d.op1 = op1; + d.one_vector_p = false; + memcpy (d.perm, orig_perm, MAX_VECT_LEN); + ok = loongarch_expand_vec_perm_const_1 (&d); + } + + return ok; +} + +/* Implement TARGET_SCHED_REASSOCIATION_WIDTH. */ + +static int +loongarch_sched_reassociation_width (unsigned int opc, machine_mode mode) +{ + switch (LARCH_ACTUAL_TUNE) + { + case CPU_LOONGARCH64: + case CPU_LA464: + /* Vector part. */ + if (LSX_SUPPORTED_MODE_P (mode)) + { + /* Integer vector instructions execute in the FP unit. + The width of integer/floating-point vector instructions is 3.
*/ + return 3; + } + + /* Scalar part. */ + else if (INTEGRAL_MODE_P (mode)) + return 1; + else if (FLOAT_MODE_P (mode)) + { + if (opc == PLUS_EXPR) + { + return 2; + } + return 4; + } + break; + default: + break; + } + return 1; +} + +/* Extract a scalar element from a vector register. */ + +void +loongarch_expand_vector_extract (rtx target, rtx vec, int elt) +{ + machine_mode mode = GET_MODE (vec); + machine_mode inner_mode = GET_MODE_INNER (mode); + rtx tmp; + + switch (mode) + { + case E_V8HImode: + case E_V16QImode: + break; + + default: + break; + } + + tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, GEN_INT (elt))); + tmp = gen_rtx_VEC_SELECT (inner_mode, vec, tmp); + + /* Let the rtl optimizers know about the zero extension performed. */ + if (inner_mode == QImode || inner_mode == HImode) + { + tmp = gen_rtx_ZERO_EXTEND (SImode, tmp); + target = gen_lowpart (SImode, target); + } + if (inner_mode == SImode || inner_mode == DImode) + { + tmp = gen_rtx_SIGN_EXTEND (inner_mode, tmp); + } + + emit_insn (gen_rtx_SET (target, tmp)); +} + +/* Generate code to copy vector bits i / 2 ... i - 1 from vector SRC + to bits 0 ... i / 2 - 1 of vector DEST, which has the same mode. + The upper bits of DEST are undefined, though they shouldn't cause + exceptions (some bits from src or all zeros are ok). */ + +static void +emit_reduc_half (rtx dest, rtx src, int i) +{ + rtx tem, d = dest; + switch (GET_MODE (src)) + { + case E_V4SFmode: + tem = gen_lsx_vbsrl_w_f (dest, src, GEN_INT (i == 128 ? 8 : 4)); + break; + case E_V2DFmode: + tem = gen_lsx_vbsrl_d_f (dest, src, GEN_INT (8)); + break; + case E_V16QImode: + case E_V8HImode: + case E_V4SImode: + case E_V2DImode: + d = gen_reg_rtx (V2DImode); + tem = gen_lsx_vbsrl_d (d, gen_lowpart (V2DImode, src), GEN_INT (i/16)); + break; + default: + gcc_unreachable (); + } + emit_insn (tem); + if (d != dest) + emit_move_insn (dest, gen_lowpart (GET_MODE (dest), d)); +} + +/* Expand a vector reduction.
FN is the binary pattern to reduce; + DEST is the destination; IN is the input vector. */ + +void +loongarch_expand_vector_reduc (rtx (*fn) (rtx, rtx, rtx), rtx dest, rtx in) +{ + rtx half, dst, vec = in; + machine_mode mode = GET_MODE (in); + int i; + + for (i = GET_MODE_BITSIZE (mode); + i > GET_MODE_UNIT_BITSIZE (mode); + i >>= 1) + { + half = gen_reg_rtx (mode); + emit_reduc_half (half, vec, i); + if (i == GET_MODE_UNIT_BITSIZE (mode) * 2) + dst = dest; + else + dst = gen_reg_rtx (mode); + emit_insn (fn (dst, half, vec)); + vec = dst; + } +} + +/* Expand an integral vector unpack operation. */ + +void +loongarch_expand_vec_unpack (rtx operands[2], bool unsigned_p, bool high_p) +{ + machine_mode imode = GET_MODE (operands[1]); + rtx (*unpack) (rtx, rtx, rtx); + rtx (*cmpFunc) (rtx, rtx, rtx); + rtx tmp, dest; + + if (ISA_HAS_LSX) + { + switch (imode) + { + case E_V4SImode: + if (high_p != 0) + unpack = gen_lsx_vilvh_w; + else + unpack = gen_lsx_vilvl_w; + + cmpFunc = gen_lsx_vslt_w; + break; + + case E_V8HImode: + if (high_p != 0) + unpack = gen_lsx_vilvh_h; + else + unpack = gen_lsx_vilvl_h; + + cmpFunc = gen_lsx_vslt_h; + break; + + case E_V16QImode: + if (high_p != 0) + unpack = gen_lsx_vilvh_b; + else + unpack = gen_lsx_vilvl_b; + + cmpFunc = gen_lsx_vslt_b; + break; + + default: + gcc_unreachable (); + break; + } + + if (!unsigned_p) + { + /* Compute the sign-extension bits for each element by comparing it + with immediate zero. */ + tmp = gen_reg_rtx (imode); + emit_insn (cmpFunc (tmp, operands[1], CONST0_RTX (imode))); + } + else + tmp = force_reg (imode, CONST0_RTX (imode)); + + dest = gen_reg_rtx (imode); + + emit_insn (unpack (dest, operands[1], tmp)); + emit_move_insn (operands[0], gen_lowpart (GET_MODE (operands[0]), dest)); + return; + } + gcc_unreachable (); +} + +/* Construct and return PARALLEL RTX with CONST_INTs for HIGH (high_p == TRUE) + or LOW (high_p == FALSE) half of a vector for mode MODE.
*/ + +rtx +loongarch_lsx_vec_parallel_const_half (machine_mode mode, bool high_p) +{ + int nunits = GET_MODE_NUNITS (mode); + rtvec v = rtvec_alloc (nunits / 2); + int base; + int i; + + base = high_p ? nunits / 2 : 0; + + for (i = 0; i < nunits / 2; i++) + RTVEC_ELT (v, i) = GEN_INT (base + i); + + return gen_rtx_PARALLEL (VOIDmode, v); +} + +/* A subroutine of loongarch_expand_vec_init, match constant vector + elements. */ + +static inline bool +loongarch_constant_elt_p (rtx x) +{ + return CONST_INT_P (x) || GET_CODE (x) == CONST_DOUBLE; +} + +rtx +loongarch_gen_const_int_vector_shuffle (machine_mode mode, int val) +{ + int nunits = GET_MODE_NUNITS (mode); + int nsets = nunits / 4; + rtx elts[MAX_VECT_LEN]; + int set = 0; + int i, j; + + /* Generate a const_int vector replicating the same 4-element set + from an immediate. */ + for (j = 0; j < nsets; j++, set = 4 * j) + for (i = 0; i < 4; i++) + elts[set + i] = GEN_INT (set + ((val >> (2 * i)) & 0x3)); + + return gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (nunits, elts)); +} + +/* Expand a vector initialization. 
*/ + +void +loongarch_expand_vector_init (rtx target, rtx vals) +{ + machine_mode vmode = GET_MODE (target); + machine_mode imode = GET_MODE_INNER (vmode); + unsigned i, nelt = GET_MODE_NUNITS (vmode); + unsigned nvar = 0; + bool all_same = true; + rtx x; + + for (i = 0; i < nelt; ++i) + { + x = XVECEXP (vals, 0, i); + if (!loongarch_constant_elt_p (x)) + nvar++; + if (i > 0 && !rtx_equal_p (x, XVECEXP (vals, 0, 0))) + all_same = false; + } + + if (ISA_HAS_LSX) + { + if (all_same) + { + rtx same = XVECEXP (vals, 0, 0); + rtx temp, temp2; + + if (CONST_INT_P (same) && nvar == 0 + && loongarch_signed_immediate_p (INTVAL (same), 10, 0)) + { + switch (vmode) + { + case E_V16QImode: + case E_V8HImode: + case E_V4SImode: + case E_V2DImode: + temp = gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0)); + emit_move_insn (target, temp); + return; + + default: + gcc_unreachable (); + } + } + temp = gen_reg_rtx (imode); + if (imode == GET_MODE (same)) + temp2 = same; + else if (GET_MODE_SIZE (imode) >= UNITS_PER_WORD) + { + if (GET_CODE (same) == MEM) + { + rtx reg_tmp = gen_reg_rtx (GET_MODE (same)); + loongarch_emit_move (reg_tmp, same); + temp2 = simplify_gen_subreg (imode, reg_tmp, + GET_MODE (reg_tmp), 0); + } + else + temp2 = simplify_gen_subreg (imode, same, GET_MODE (same), 0); + } + else + { + if (GET_CODE (same) == MEM) + { + rtx reg_tmp = gen_reg_rtx (GET_MODE (same)); + loongarch_emit_move (reg_tmp, same); + temp2 = lowpart_subreg (imode, reg_tmp, GET_MODE (reg_tmp)); + } + else + temp2 = lowpart_subreg (imode, same, GET_MODE (same)); + } + emit_move_insn (temp, temp2); + + switch (vmode) + { + case E_V16QImode: + case E_V8HImode: + case E_V4SImode: + case E_V2DImode: + loongarch_emit_move (target, gen_rtx_VEC_DUPLICATE (vmode, temp)); + break; + + case E_V4SFmode: + emit_insn (gen_lsx_vreplvei_w_f_scalar (target, temp)); + break; + + case E_V2DFmode: + emit_insn (gen_lsx_vreplvei_d_f_scalar (target, temp)); + break; + + default: + gcc_unreachable (); + } + } + else + { + 
emit_move_insn (target, CONST0_RTX (vmode)); + + for (i = 0; i < nelt; ++i) + { + rtx temp = gen_reg_rtx (imode); + emit_move_insn (temp, XVECEXP (vals, 0, i)); + switch (vmode) + { + case E_V16QImode: + emit_insn (gen_vec_setv16qi (target, temp, GEN_INT (i))); + break; + + case E_V8HImode: + emit_insn (gen_vec_setv8hi (target, temp, GEN_INT (i))); + break; + + case E_V4SImode: + emit_insn (gen_vec_setv4si (target, temp, GEN_INT (i))); + break; + + case E_V2DImode: + emit_insn (gen_vec_setv2di (target, temp, GEN_INT (i))); + break; + + case E_V4SFmode: + emit_insn (gen_vec_setv4sf (target, temp, GEN_INT (i))); + break; + + case E_V2DFmode: + emit_insn (gen_vec_setv2df (target, temp, GEN_INT (i))); + break; + + default: + gcc_unreachable (); + } + } + } + return; + } + + /* Load constants from the pool, or whatever's handy. */ + if (nvar == 0) + { + emit_move_insn (target, gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0))); + return; + } + + /* For two-part initialization, always use CONCAT. */ + if (nelt == 2) + { + rtx op0 = force_reg (imode, XVECEXP (vals, 0, 0)); + rtx op1 = force_reg (imode, XVECEXP (vals, 0, 1)); + x = gen_rtx_VEC_CONCAT (vmode, op0, op1); + emit_insn (gen_rtx_SET (target, x)); + return; + } + + /* Loongson is the only cpu with vectors with more elements. */ + gcc_assert (0); +} + +/* Implement HARD_REGNO_CALLER_SAVE_MODE. */ + +machine_mode +loongarch_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs, + machine_mode mode) +{ + /* For performance, avoid saving/restoring upper parts of a register + by returning MODE as save mode when the mode is known. */ + if (mode == VOIDmode) + return choose_hard_reg_mode (regno, nregs, NULL); + else + return mode; +} + +/* Generate RTL for comparing CMP_OP0 and CMP_OP1 using condition COND and + store the result -1 or 0 in DEST. 
*/ + +static void +loongarch_expand_lsx_cmp (rtx dest, enum rtx_code cond, rtx op0, rtx op1) +{ + machine_mode cmp_mode = GET_MODE (op0); + int unspec = -1; + bool negate = false; + + switch (cmp_mode) + { + case E_V16QImode: + case E_V32QImode: + case E_V8HImode: + case E_V16HImode: + case E_V4SImode: + case E_V8SImode: + case E_V2DImode: + case E_V4DImode: + switch (cond) + { + case NE: + cond = reverse_condition (cond); + negate = true; + break; + case EQ: + case LT: + case LE: + case LTU: + case LEU: + break; + case GE: + case GT: + case GEU: + case GTU: + std::swap (op0, op1); + cond = swap_condition (cond); + break; + default: + gcc_unreachable (); + } + loongarch_emit_binary (cond, dest, op0, op1); + if (negate) + emit_move_insn (dest, gen_rtx_NOT (GET_MODE (dest), dest)); + break; + + case E_V4SFmode: + case E_V2DFmode: + switch (cond) + { + case UNORDERED: + case ORDERED: + case EQ: + case NE: + case UNEQ: + case UNLE: + case UNLT: + break; + case LTGT: cond = NE; break; + case UNGE: cond = UNLE; std::swap (op0, op1); break; + case UNGT: cond = UNLT; std::swap (op0, op1); break; + case LE: unspec = UNSPEC_LSX_VFCMP_SLE; break; + case LT: unspec = UNSPEC_LSX_VFCMP_SLT; break; + case GE: unspec = UNSPEC_LSX_VFCMP_SLE; std::swap (op0, op1); break; + case GT: unspec = UNSPEC_LSX_VFCMP_SLT; std::swap (op0, op1); break; + default: + gcc_unreachable (); + } + if (unspec < 0) + loongarch_emit_binary (cond, dest, op0, op1); + else + { + rtx x = gen_rtx_UNSPEC (GET_MODE (dest), + gen_rtvec (2, op0, op1), unspec); + emit_insn (gen_rtx_SET (dest, x)); + } + break; + + default: + gcc_unreachable (); + break; + } +} + +/* Expand VEC_COND_EXPR, where: + MODE is mode of the result + VIMODE equivalent integer mode + OPERANDS operands of VEC_COND_EXPR. 
*/ + +void +loongarch_expand_vec_cond_expr (machine_mode mode, machine_mode vimode, + rtx *operands) +{ + rtx cond = operands[3]; + rtx cmp_op0 = operands[4]; + rtx cmp_op1 = operands[5]; + rtx cmp_res = gen_reg_rtx (vimode); + + loongarch_expand_lsx_cmp (cmp_res, GET_CODE (cond), cmp_op0, cmp_op1); + + /* We handle the following cases: + 1) r = a CMP b ? -1 : 0 + 2) r = a CMP b ? -1 : v + 3) r = a CMP b ? v : 0 + 4) r = a CMP b ? v1 : v2 */ + + /* Case (1) above. We only move the results. */ + if (operands[1] == CONSTM1_RTX (vimode) + && operands[2] == CONST0_RTX (vimode)) + emit_move_insn (operands[0], cmp_res); + else + { + rtx src1 = gen_reg_rtx (vimode); + rtx src2 = gen_reg_rtx (vimode); + rtx mask = gen_reg_rtx (vimode); + rtx bsel; + + /* Move the vector result to use it as a mask. */ + emit_move_insn (mask, cmp_res); + + if (register_operand (operands[1], mode)) + { + rtx xop1 = operands[1]; + if (mode != vimode) + { + xop1 = gen_reg_rtx (vimode); + emit_move_insn (xop1, gen_rtx_SUBREG (vimode, operands[1], 0)); + } + emit_move_insn (src1, xop1); + } + else + { + gcc_assert (operands[1] == CONSTM1_RTX (vimode)); + /* Case (2) if the below doesn't move the mask to src2. */ + emit_move_insn (src1, mask); + } + + if (register_operand (operands[2], mode)) + { + rtx xop2 = operands[2]; + if (mode != vimode) + { + xop2 = gen_reg_rtx (vimode); + emit_move_insn (xop2, gen_rtx_SUBREG (vimode, operands[2], 0)); + } + emit_move_insn (src2, xop2); + } + else + { + gcc_assert (operands[2] == CONST0_RTX (mode)); + /* Case (3) if the above didn't move the mask to src1. */ + emit_move_insn (src2, mask); + } + + /* We deal with case (4) if the mask wasn't moved to either src1 or src2. + In any case, we eventually do vector mask-based copy. */ + bsel = gen_rtx_IOR (vimode, + gen_rtx_AND (vimode, + gen_rtx_NOT (vimode, mask), src2), + gen_rtx_AND (vimode, mask, src1)); + /* The result is placed back to a register with the mask. 
*/ + emit_insn (gen_rtx_SET (mask, bsel)); + emit_move_insn (operands[0], gen_rtx_SUBREG (mode, mask, 0)); + } +} + +void +loongarch_expand_vec_cond_mask_expr (machine_mode mode, machine_mode vimode, + rtx *operands) +{ + rtx cmp_res = operands[3]; + + /* We handle the following cases: + 1) r = a CMP b ? -1 : 0 + 2) r = a CMP b ? -1 : v + 3) r = a CMP b ? v : 0 + 4) r = a CMP b ? v1 : v2 */ + + /* Case (1) above. We only move the results. */ + if (operands[1] == CONSTM1_RTX (vimode) + && operands[2] == CONST0_RTX (vimode)) + emit_move_insn (operands[0], cmp_res); + else + { + rtx src1 = gen_reg_rtx (vimode); + rtx src2 = gen_reg_rtx (vimode); + rtx mask = gen_reg_rtx (vimode); + rtx bsel; + + /* Move the vector result to use it as a mask. */ + emit_move_insn (mask, cmp_res); + + if (register_operand (operands[1], mode)) + { + rtx xop1 = operands[1]; + if (mode != vimode) + { + xop1 = gen_reg_rtx (vimode); + emit_move_insn (xop1, gen_rtx_SUBREG (vimode, operands[1], 0)); + } + emit_move_insn (src1, xop1); + } + else + { + gcc_assert (operands[1] == CONSTM1_RTX (vimode)); + /* Case (2) if the below doesn't move the mask to src2. */ + emit_move_insn (src1, mask); + } + + if (register_operand (operands[2], mode)) + { + rtx xop2 = operands[2]; + if (mode != vimode) + { + xop2 = gen_reg_rtx (vimode); + emit_move_insn (xop2, gen_rtx_SUBREG (vimode, operands[2], 0)); + } + emit_move_insn (src2, xop2); + } + else + { + gcc_assert (operands[2] == CONST0_RTX (mode)); + /* Case (3) if the above didn't move the mask to src1. */ + emit_move_insn (src2, mask); + } + + /* We deal with case (4) if the mask wasn't moved to either src1 or src2. + In any case, we eventually do vector mask-based copy. */ + bsel = gen_rtx_IOR (vimode, + gen_rtx_AND (vimode, + gen_rtx_NOT (vimode, mask), src2), + gen_rtx_AND (vimode, mask, src1)); + /* The result is placed back to a register with the mask. 
*/ + emit_insn (gen_rtx_SET (mask, bsel)); + emit_move_insn (operands[0], gen_rtx_SUBREG (mode, mask, 0)); + } +} + +/* Expand integer vector comparison */ +bool +loongarch_expand_vec_cmp (rtx operands[]) +{ + + rtx_code code = GET_CODE (operands[1]); + loongarch_expand_lsx_cmp (operands[0], code, operands[2], operands[3]); + return true; +} + +/* Implement TARGET_CASE_VALUES_THRESHOLD. */ + +unsigned int +loongarch_case_values_threshold (void) +{ + return default_case_values_threshold (); +} + +/* Implement TARGET_SPILL_CLASS. */ + +static reg_class_t +loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED, + machine_mode mode ATTRIBUTE_UNUSED) +{ + return NO_REGS; +} + +/* Implement TARGET_PROMOTE_FUNCTION_MODE. */ + +/* This function is equivalent to default_promote_function_mode_always_promote + except that it returns a promoted mode even if type is NULL_TREE. This is + needed by libcalls which have no type (only a mode) such as fixed conversion + routines that take a signed or unsigned char/short argument and convert it + to a fixed type. */ + +static machine_mode +loongarch_promote_function_mode (const_tree type ATTRIBUTE_UNUSED, + machine_mode mode, + int *punsignedp ATTRIBUTE_UNUSED, + const_tree fntype ATTRIBUTE_UNUSED, + int for_return ATTRIBUTE_UNUSED) +{ + int unsignedp; + + if (type != NULL_TREE) + return promote_mode (type, mode, punsignedp); + + unsignedp = *punsignedp; + PROMOTE_MODE (mode, unsignedp, type); + *punsignedp = unsignedp; + return mode; +} + +/* Implement TARGET_STARTING_FRAME_OFFSET. See loongarch_compute_frame_info + for details about the frame layout. */ + +static HOST_WIDE_INT +loongarch_starting_frame_offset (void) +{ + if (FRAME_GROWS_DOWNWARD) + return 0; + return crtl->outgoing_args_size; +} + +/* A subroutine of loongarch_build_signbit_mask. If VECT is true, + then replicate the value for all elements of the vector + register. 
*/ + +rtx +loongarch_build_const_vector (machine_mode mode, bool vect, rtx value) +{ + int i, n_elt; + rtvec v; + machine_mode scalar_mode; + + switch (mode) + { + case E_V32QImode: + case E_V16QImode: + case E_V32HImode: + case E_V16HImode: + case E_V8HImode: + case E_V8SImode: + case E_V4SImode: + case E_V8DImode: + case E_V4DImode: + case E_V2DImode: + gcc_assert (vect); + /* FALLTHRU */ + case E_V8SFmode: + case E_V4SFmode: + case E_V8DFmode: + case E_V4DFmode: + case E_V2DFmode: + n_elt = GET_MODE_NUNITS (mode); + v = rtvec_alloc (n_elt); + scalar_mode = GET_MODE_INNER (mode); + + RTVEC_ELT (v, 0) = value; + + for (i = 1; i < n_elt; ++i) + RTVEC_ELT (v, i) = vect ? value : CONST0_RTX (scalar_mode); + + return gen_rtx_CONST_VECTOR (mode, v); + + default: + gcc_unreachable (); + } +} + +/* Create a mask for the sign bit in MODE + for a register. If VECT is true, then replicate the mask for + all elements of the vector register. If INVERT is true, then create + a mask excluding the sign bit. */ + +rtx +loongarch_build_signbit_mask (machine_mode mode, bool vect, bool invert) +{ + machine_mode vec_mode, imode; + wide_int w; + rtx mask, v; + + switch (mode) + { + case E_V16SImode: + case E_V16SFmode: + case E_V8SImode: + case E_V4SImode: + case E_V8SFmode: + case E_V4SFmode: + vec_mode = mode; + imode = SImode; + break; + + case E_V8DImode: + case E_V4DImode: + case E_V2DImode: + case E_V8DFmode: + case E_V4DFmode: + case E_V2DFmode: + vec_mode = mode; + imode = DImode; + break; + + case E_TImode: + case E_TFmode: + vec_mode = VOIDmode; + imode = TImode; + break; + + default: + gcc_unreachable (); + } + + machine_mode inner_mode = GET_MODE_INNER (mode); + w = wi::set_bit_in_zero (GET_MODE_BITSIZE (inner_mode) - 1, + GET_MODE_BITSIZE (inner_mode)); + if (invert) + w = wi::bit_not (w); + + /* Force this value into the low part of a fp vector constant.
*/ + mask = immed_wide_int_const (w, imode); + mask = gen_lowpart (inner_mode, mask); + + if (vec_mode == VOIDmode) + return force_reg (inner_mode, mask); + + v = loongarch_build_const_vector (vec_mode, vect, mask); + return force_reg (vec_mode, v); +} + +static bool +loongarch_builtin_support_vector_misalignment (machine_mode mode, + const_tree type, + int misalignment, + bool is_packed) +{ + if (ISA_HAS_LSX && STRICT_ALIGNMENT) + { + if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing) + return false; + if (misalignment == -1) + return false; + } + return default_builtin_support_vector_misalignment (mode, type, misalignment, + is_packed); +} + +/* Initialize the GCC target structure. */ +#undef TARGET_ASM_ALIGNED_HI_OP +#define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" +#undef TARGET_ASM_ALIGNED_SI_OP +#define TARGET_ASM_ALIGNED_SI_OP "\t.word\t" +#undef TARGET_ASM_ALIGNED_DI_OP +#define TARGET_ASM_ALIGNED_DI_OP "\t.dword\t" + +#undef TARGET_OPTION_OVERRIDE +#define TARGET_OPTION_OVERRIDE loongarch_option_override + +#undef TARGET_LEGITIMIZE_ADDRESS +#define TARGET_LEGITIMIZE_ADDRESS loongarch_legitimize_address + +#undef TARGET_ASM_SELECT_RTX_SECTION +#define TARGET_ASM_SELECT_RTX_SECTION loongarch_select_rtx_section +#undef TARGET_ASM_FUNCTION_RODATA_SECTION +#define TARGET_ASM_FUNCTION_RODATA_SECTION loongarch_function_rodata_section + +#undef TARGET_SCHED_INIT +#define TARGET_SCHED_INIT loongarch_sched_init +#undef TARGET_SCHED_REORDER +#define TARGET_SCHED_REORDER loongarch_sched_reorder +#undef TARGET_SCHED_REORDER2 +#define TARGET_SCHED_REORDER2 loongarch_sched_reorder2 +#undef TARGET_SCHED_VARIABLE_ISSUE +#define TARGET_SCHED_VARIABLE_ISSUE loongarch_variable_issue +#undef TARGET_SCHED_ADJUST_COST +#define TARGET_SCHED_ADJUST_COST loongarch_adjust_cost +#undef TARGET_SCHED_ISSUE_RATE +#define TARGET_SCHED_ISSUE_RATE loongarch_issue_rate +#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD +#define 
TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \ + loongarch_multipass_dfa_lookahead + +#undef TARGET_FUNCTION_OK_FOR_SIBCALL +#define TARGET_FUNCTION_OK_FOR_SIBCALL loongarch_function_ok_for_sibcall + +#undef TARGET_VALID_POINTER_MODE +#define TARGET_VALID_POINTER_MODE loongarch_valid_pointer_mode +#undef TARGET_REGISTER_MOVE_COST +#define TARGET_REGISTER_MOVE_COST loongarch_register_move_cost +#undef TARGET_MEMORY_MOVE_COST +#define TARGET_MEMORY_MOVE_COST loongarch_memory_move_cost +#undef TARGET_RTX_COSTS +#define TARGET_RTX_COSTS loongarch_rtx_costs +#undef TARGET_ADDRESS_COST +#define TARGET_ADDRESS_COST loongarch_address_cost +#undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST +#define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \ + loongarch_builtin_vectorization_cost + + +#undef TARGET_IN_SMALL_DATA_P +#define TARGET_IN_SMALL_DATA_P loongarch_in_small_data_p + +#undef TARGET_PREFERRED_RELOAD_CLASS +#define TARGET_PREFERRED_RELOAD_CLASS loongarch_preferred_reload_class + +#undef TARGET_ASM_FILE_START_FILE_DIRECTIVE +#define TARGET_ASM_FILE_START_FILE_DIRECTIVE true + +#undef TARGET_EXPAND_BUILTIN_VA_START +#define TARGET_EXPAND_BUILTIN_VA_START loongarch_va_start + +#undef TARGET_PROMOTE_FUNCTION_MODE +#define TARGET_PROMOTE_FUNCTION_MODE loongarch_promote_function_mode +#undef TARGET_RETURN_IN_MEMORY +#define TARGET_RETURN_IN_MEMORY loongarch_return_in_memory + +#undef TARGET_FUNCTION_VALUE +#define TARGET_FUNCTION_VALUE loongarch_function_value +#undef TARGET_LIBCALL_VALUE +#define TARGET_LIBCALL_VALUE loongarch_libcall_value + +#undef TARGET_ASM_OUTPUT_MI_THUNK +#define TARGET_ASM_OUTPUT_MI_THUNK loongarch_output_mi_thunk +#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK +#define TARGET_ASM_CAN_OUTPUT_MI_THUNK \ + hook_bool_const_tree_hwi_hwi_const_tree_true + +#undef TARGET_PRINT_OPERAND +#define TARGET_PRINT_OPERAND loongarch_print_operand +#undef TARGET_PRINT_OPERAND_ADDRESS +#define TARGET_PRINT_OPERAND_ADDRESS loongarch_print_operand_address +#undef 
TARGET_PRINT_OPERAND_PUNCT_VALID_P +#define TARGET_PRINT_OPERAND_PUNCT_VALID_P \ + loongarch_print_operand_punct_valid_p + +#undef TARGET_SETUP_INCOMING_VARARGS +#define TARGET_SETUP_INCOMING_VARARGS loongarch_setup_incoming_varargs +#undef TARGET_STRICT_ARGUMENT_NAMING +#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true +#undef TARGET_MUST_PASS_IN_STACK +#define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size +#undef TARGET_PASS_BY_REFERENCE +#define TARGET_PASS_BY_REFERENCE loongarch_pass_by_reference +#undef TARGET_ARG_PARTIAL_BYTES +#define TARGET_ARG_PARTIAL_BYTES loongarch_arg_partial_bytes +#undef TARGET_FUNCTION_ARG +#define TARGET_FUNCTION_ARG loongarch_function_arg +#undef TARGET_FUNCTION_ARG_ADVANCE +#define TARGET_FUNCTION_ARG_ADVANCE loongarch_function_arg_advance +#undef TARGET_FUNCTION_ARG_BOUNDARY +#define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary + +#undef TARGET_VECTOR_MODE_SUPPORTED_P +#define TARGET_VECTOR_MODE_SUPPORTED_P loongarch_vector_mode_supported_p + +#undef TARGET_SCALAR_MODE_SUPPORTED_P +#define TARGET_SCALAR_MODE_SUPPORTED_P loongarch_scalar_mode_supported_p + +#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE +#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE loongarch_preferred_simd_mode + +#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES +#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES \ + loongarch_autovectorize_vector_modes + +#undef TARGET_INIT_BUILTINS +#define TARGET_INIT_BUILTINS loongarch_init_builtins +#undef TARGET_BUILTIN_DECL +#define TARGET_BUILTIN_DECL loongarch_builtin_decl +#undef TARGET_EXPAND_BUILTIN #define TARGET_EXPAND_BUILTIN loongarch_expand_builtin /* The generic ELF target does not always have TLS support. 
*/ @@ -6941,6 +8814,14 @@ loongarch_set_handled_components (sbitmap components) #undef TARGET_MAX_ANCHOR_OFFSET #define TARGET_MAX_ANCHOR_OFFSET (IMM_REACH/2-1) +#undef TARGET_VECTORIZE_VEC_PERM_CONST +#define TARGET_VECTORIZE_VEC_PERM_CONST loongarch_vectorize_vec_perm_const + +#undef TARGET_SCHED_REASSOCIATION_WIDTH +#define TARGET_SCHED_REASSOCIATION_WIDTH loongarch_sched_reassociation_width + +#undef TARGET_CASE_VALUES_THRESHOLD +#define TARGET_CASE_VALUES_THRESHOLD loongarch_case_values_threshold #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV loongarch_atomic_assign_expand_fenv @@ -6959,6 +8840,10 @@ loongarch_set_handled_components (sbitmap components) #undef TARGET_MODES_TIEABLE_P #define TARGET_MODES_TIEABLE_P loongarch_modes_tieable_p +#undef TARGET_HARD_REGNO_CALL_PART_CLOBBERED +#define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \ + loongarch_hard_regno_call_part_clobbered + #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 2 @@ -7009,6 +8894,10 @@ loongarch_set_handled_components (sbitmap components) #define TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS \ loongarch_set_handled_components +#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT +#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \ + loongarch_builtin_support_vector_misalignment + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-loongarch.h" diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h index eca723293a1..e939dd826d1 100644 --- a/gcc/config/loongarch/loongarch.h +++ b/gcc/config/loongarch/loongarch.h @@ -23,6 +23,8 @@ along with GCC; see the file COPYING3. If not see #include "config/loongarch/loongarch-opts.h" +#define TARGET_SUPPORTS_WIDE_INT 1 + /* Macros to silence warnings about numbers being signed in traditional C and unsigned in ISO C when compiled on 32-bit hosts. */ @@ -179,6 +181,11 @@ along with GCC; see the file COPYING3. 
If not see #define MIN_UNITS_PER_WORD 4 #endif +/* Width of a LSX vector register in bytes. */ +#define UNITS_PER_LSX_REG 16 +/* Width of a LSX vector register in bits. */ +#define BITS_PER_LSX_REG (UNITS_PER_LSX_REG * BITS_PER_UNIT) + /* For LARCH, width of a floating point register. */ #define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4) @@ -241,8 +248,10 @@ along with GCC; see the file COPYING3. If not see #define STRUCTURE_SIZE_BOUNDARY 8 /* There is no point aligning anything to a rounder boundary than - LONG_DOUBLE_TYPE_SIZE. */ -#define BIGGEST_ALIGNMENT (LONG_DOUBLE_TYPE_SIZE) + LONG_DOUBLE_TYPE_SIZE, unless under LSX the biggest alignment is + BITS_PER_LSX_REG. */ +#define BIGGEST_ALIGNMENT \ + (ISA_HAS_LSX ? BITS_PER_LSX_REG : LONG_DOUBLE_TYPE_SIZE) /* All accesses must be aligned. */ #define STRICT_ALIGNMENT (TARGET_STRICT_ALIGN) @@ -378,6 +387,9 @@ along with GCC; see the file COPYING3. If not see #define FP_REG_FIRST 32 #define FP_REG_LAST 63 #define FP_REG_NUM (FP_REG_LAST - FP_REG_FIRST + 1) +#define LSX_REG_FIRST FP_REG_FIRST +#define LSX_REG_LAST FP_REG_LAST +#define LSX_REG_NUM FP_REG_NUM /* The DWARF 2 CFA column which tracks the return address from a signal handler context. This means that to maintain backwards @@ -395,8 +407,11 @@ along with GCC; see the file COPYING3. If not see ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM) #define FCC_REG_P(REGNO) \ ((unsigned int) ((int) (REGNO) - FCC_REG_FIRST) < FCC_REG_NUM) +#define LSX_REG_P(REGNO) \ + ((unsigned int) ((int) (REGNO) - LSX_REG_FIRST) < LSX_REG_NUM) #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X))) +#define LSX_REG_RTX_P(X) (REG_P (X) && LSX_REG_P (REGNO (X))) /* Select a register mode required for caller save of hard regno REGNO.
*/ #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \ @@ -577,6 +592,11 @@ enum reg_class #define IMM12_OPERAND(VALUE) \ ((unsigned HOST_WIDE_INT) (VALUE) + IMM_REACH / 2 < IMM_REACH) +/* True if VALUE is a signed 13-bit number. */ + +#define IMM13_OPERAND(VALUE) \ + ((unsigned HOST_WIDE_INT) (VALUE) + 0x1000 < 0x2000) + /* True if VALUE is a signed 16-bit number. */ #define IMM16_OPERAND(VALUE) \ @@ -706,6 +726,13 @@ enum reg_class #define FP_ARG_FIRST (FP_REG_FIRST + 0) #define FP_ARG_LAST (FP_ARG_FIRST + MAX_ARGS_IN_REGISTERS - 1) +/* True if MODE is vector and supported in a LSX vector register. */ +#define LSX_SUPPORTED_MODE_P(MODE) \ + (ISA_HAS_LSX \ + && GET_MODE_SIZE (MODE) == UNITS_PER_LSX_REG \ + && (GET_MODE_CLASS (MODE) == MODE_VECTOR_INT \ + || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)) + /* 1 if N is a possible register number for function argument passing. We have no FP argument registers when soft-float. */ @@ -926,7 +953,39 @@ typedef struct { { "s7", 30 + GP_REG_FIRST }, \ { "s8", 31 + GP_REG_FIRST }, \ { "v0", 4 + GP_REG_FIRST }, \ - { "v1", 5 + GP_REG_FIRST } \ + { "v1", 5 + GP_REG_FIRST }, \ + { "vr0", 0 + FP_REG_FIRST }, \ + { "vr1", 1 + FP_REG_FIRST }, \ + { "vr2", 2 + FP_REG_FIRST }, \ + { "vr3", 3 + FP_REG_FIRST }, \ + { "vr4", 4 + FP_REG_FIRST }, \ + { "vr5", 5 + FP_REG_FIRST }, \ + { "vr6", 6 + FP_REG_FIRST }, \ + { "vr7", 7 + FP_REG_FIRST }, \ + { "vr8", 8 + FP_REG_FIRST }, \ + { "vr9", 9 + FP_REG_FIRST }, \ + { "vr10", 10 + FP_REG_FIRST }, \ + { "vr11", 11 + FP_REG_FIRST }, \ + { "vr12", 12 + FP_REG_FIRST }, \ + { "vr13", 13 + FP_REG_FIRST }, \ + { "vr14", 14 + FP_REG_FIRST }, \ + { "vr15", 15 + FP_REG_FIRST }, \ + { "vr16", 16 + FP_REG_FIRST }, \ + { "vr17", 17 + FP_REG_FIRST }, \ + { "vr18", 18 + FP_REG_FIRST }, \ + { "vr19", 19 + FP_REG_FIRST }, \ + { "vr20", 20 + FP_REG_FIRST }, \ + { "vr21", 21 + FP_REG_FIRST }, \ + { "vr22", 22 + FP_REG_FIRST }, \ + { "vr23", 23 + FP_REG_FIRST }, \ + { "vr24", 24 + FP_REG_FIRST }, \ + { 
"vr25", 25 + FP_REG_FIRST }, \ + { "vr26", 26 + FP_REG_FIRST }, \ + { "vr27", 27 + FP_REG_FIRST }, \ + { "vr28", 28 + FP_REG_FIRST }, \ + { "vr29", 29 + FP_REG_FIRST }, \ + { "vr30", 30 + FP_REG_FIRST }, \ + { "vr31", 31 + FP_REG_FIRST } \ } /* Globalizing directive for a label. */ diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index b37e070660f..7b8978e2533 100644 --- a/gcc/config/loongarch/loongarch.md +++ b/gcc/config/loongarch/loongarch.md @@ -158,11 +158,12 @@ (define_attr "move_type" const,signext,pick_ins,logical,arith,sll0,andi,shift_shift" (const_string "unknown")) -(define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor" +(define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor,simd_add" (const_string "unknown")) ;; Main data type used by the insn -(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC" +(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC, + V2DI,V4SI,V8HI,V16QI,V2DF,V4SF" (const_string "unknown")) ;; True if the main data type is twice the size of a word. 
@@ -234,7 +235,12 @@ (define_attr "type" prefetch,prefetchx,condmove,mgtf,mftg,const,arith,logical, shift,slt,signext,clz,trap,imul,idiv,move, fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,flogb,fneg,fcmp,fcopysign,fcvt, - fscaleb,fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost" + fscaleb,fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost, + simd_div,simd_fclass,simd_flog2,simd_fadd,simd_fcvt,simd_fmul,simd_fmadd, + simd_fdiv,simd_bitins,simd_bitmov,simd_insert,simd_sld,simd_mul,simd_fcmp, + simd_fexp2,simd_int_arith,simd_bit,simd_shift,simd_splat,simd_fill, + simd_permute,simd_shf,simd_sat,simd_pcnt,simd_copy,simd_branch,simd_clsx, + simd_fminmax,simd_logic,simd_move,simd_load,simd_store" (cond [(eq_attr "jirl" "!unset") (const_string "call") (eq_attr "got" "load") (const_string "load") @@ -414,11 +420,20 @@ (define_mode_attr ifmt [(SI "w") (DI "l")]) ;; This attribute gives the upper-case mode name for one unit of a ;; floating-point mode or vector mode. -(define_mode_attr UNITMODE [(SF "SF") (DF "DF")]) +(define_mode_attr UNITMODE [(SF "SF") (DF "DF") (V2SF "SF") (V4SF "SF") + (V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI") + (V2DF "DF")]) + +;; As above, but in lower case. +(define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf") + (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi") + (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")]) ;; This attribute gives the integer mode that has half the size of ;; the controlling mode. -(define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")]) +(define_mode_attr HALFMODE [(DF "SI") (DI "SI") (V2SF "SI") + (V2SI "SI") (V4HI "SI") (V8QI "SI") + (TF "DI")]) ;; This attribute gives the integer mode that has the same size of a ;; floating-point mode. @@ -445,6 +460,18 @@ (define_code_iterator neg_bitwise [and ior]) ;; from the same template. (define_code_iterator any_div [div udiv mod umod]) +;; This code iterator allows addition and subtraction to be generated +;; from the same template. 
+(define_code_iterator addsub [plus minus]) + +;; This code iterator allows addition and multiplication to be generated +;; from the same template. +(define_code_iterator addmul [plus mult]) + +;; This code iterator allows addition, subtraction and multiplication to be +;; generated from the same template. +(define_code_iterator addsubmul [plus minus mult]) + ;; This code iterator allows all native floating-point comparisons to be ;; generated from the same template. (define_code_iterator fcond [unordered uneq unlt unle eq lt le @@ -684,7 +711,6 @@ (define_insn "sub3" [(set_attr "alu_type" "sub") (set_attr "mode" "")]) - (define_insn "*subsi3_extended" [(set (match_operand:DI 0 "register_operand" "= r") (sign_extend:DI @@ -1228,7 +1254,7 @@ (define_insn "smina3" "fmina.\t%0,%1,%2" [(set_attr "type" "fmove") (set_attr "mode" "")]) - + ;; ;; .................... ;; @@ -2541,7 +2567,6 @@ (define_insn "rotr3" [(set_attr "type" "shift,shift") (set_attr "mode" "")]) - ;; The following templates were added to generate "bstrpick.d + alsl.d" ;; instruction pairs. ;; It is required that the values of const_immalsl_operand and @@ -3606,6 +3631,9 @@ (define_insn "loongarch_crcc_w__w" (include "generic.md") (include "la464.md") +; The LoongArch SX Instructions. +(include "lsx.md") + (define_c_enum "unspec" [ UNSPEC_ADDRESS_FIRST ]) diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md new file mode 100644 index 00000000000..fadba779b6f --- /dev/null +++ b/gcc/config/loongarch/lsx.md @@ -0,0 +1,4490 @@ +;; Machine Description for LARCH Loongson SX ASE +;; +;; Copyright (C) 2018 Free Software Foundation, Inc. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version.
+;; +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; . +;; + +(define_c_enum "unspec" [ + UNSPEC_LSX_ASUB_S + UNSPEC_LSX_VABSD_U + UNSPEC_LSX_VAVG_S + UNSPEC_LSX_VAVG_U + UNSPEC_LSX_VAVGR_S + UNSPEC_LSX_VAVGR_U + UNSPEC_LSX_VBITCLR + UNSPEC_LSX_VBITCLRI + UNSPEC_LSX_VBITREV + UNSPEC_LSX_VBITREVI + UNSPEC_LSX_VBITSET + UNSPEC_LSX_VBITSETI + UNSPEC_LSX_BRANCH_V + UNSPEC_LSX_BRANCH + UNSPEC_LSX_VFCMP_CAF + UNSPEC_LSX_VFCLASS + UNSPEC_LSX_VFCMP_CUNE + UNSPEC_LSX_VFCVT + UNSPEC_LSX_VFCVTH + UNSPEC_LSX_VFCVTL + UNSPEC_LSX_VFLOGB + UNSPEC_LSX_VFRECIP + UNSPEC_LSX_VFRINT + UNSPEC_LSX_VFRSQRT + UNSPEC_LSX_VFCMP_SAF + UNSPEC_LSX_VFCMP_SEQ + UNSPEC_LSX_VFCMP_SLE + UNSPEC_LSX_VFCMP_SLT + UNSPEC_LSX_VFCMP_SNE + UNSPEC_LSX_VFCMP_SOR + UNSPEC_LSX_VFCMP_SUEQ + UNSPEC_LSX_VFCMP_SULE + UNSPEC_LSX_VFCMP_SULT + UNSPEC_LSX_VFCMP_SUN + UNSPEC_LSX_VFCMP_SUNE + UNSPEC_LSX_VFTINT_S + UNSPEC_LSX_VFTINT_U + UNSPEC_LSX_VCLO + UNSPEC_LSX_VSAT_S + UNSPEC_LSX_VSAT_U + UNSPEC_LSX_VREPLVE + UNSPEC_LSX_VREPLVEI + UNSPEC_LSX_VSRAR + UNSPEC_LSX_VSRARI + UNSPEC_LSX_VSRLR + UNSPEC_LSX_VSRLRI + UNSPEC_LSX_VSSUB_S + UNSPEC_LSX_VSSUB_U + UNSPEC_LSX_VSHUF + UNSPEC_LSX_VABS + UNSPEC_LSX_VMUH_S + UNSPEC_LSX_VMUH_U + UNSPEC_LSX_VEXTW_S + UNSPEC_LSX_VEXTW_U + UNSPEC_LSX_VSLLWIL_S + UNSPEC_LSX_VSLLWIL_U + UNSPEC_LSX_VSRAN + UNSPEC_LSX_VSSRAN_S + UNSPEC_LSX_VSSRAN_U + UNSPEC_LSX_VSRAIN + UNSPEC_LSX_VSRAINS_S + UNSPEC_LSX_VSRAINS_U + UNSPEC_LSX_VSRARN + UNSPEC_LSX_VSRLN + UNSPEC_LSX_VSRLRN + UNSPEC_LSX_VSSRLRN_U + UNSPEC_LSX_VFRSTPI + UNSPEC_LSX_VFRSTP + UNSPEC_LSX_VSHUF4I + UNSPEC_LSX_VBSRL_V + UNSPEC_LSX_VBSLL_V + UNSPEC_LSX_VEXTRINS + UNSPEC_LSX_VMSKLTZ + UNSPEC_LSX_VSIGNCOV + 
UNSPEC_LSX_VFTINTRNE + UNSPEC_LSX_VFTINTRP + UNSPEC_LSX_VFTINTRM + UNSPEC_LSX_VFTINT_W_D + UNSPEC_LSX_VFFINT_S_L + UNSPEC_LSX_VFTINTRZ_W_D + UNSPEC_LSX_VFTINTRP_W_D + UNSPEC_LSX_VFTINTRM_W_D + UNSPEC_LSX_VFTINTRNE_W_D + UNSPEC_LSX_VFTINTL_L_S + UNSPEC_LSX_VFFINTH_D_W + UNSPEC_LSX_VFFINTL_D_W + UNSPEC_LSX_VFTINTRZL_L_S + UNSPEC_LSX_VFTINTRZH_L_S + UNSPEC_LSX_VFTINTRPL_L_S + UNSPEC_LSX_VFTINTRPH_L_S + UNSPEC_LSX_VFTINTRMH_L_S + UNSPEC_LSX_VFTINTRML_L_S + UNSPEC_LSX_VFTINTRNEL_L_S + UNSPEC_LSX_VFTINTRNEH_L_S + UNSPEC_LSX_VFTINTH_L_H + UNSPEC_LSX_VFRINTRNE_S + UNSPEC_LSX_VFRINTRNE_D + UNSPEC_LSX_VFRINTRZ_S + UNSPEC_LSX_VFRINTRZ_D + UNSPEC_LSX_VFRINTRP_S + UNSPEC_LSX_VFRINTRP_D + UNSPEC_LSX_VFRINTRM_S + UNSPEC_LSX_VFRINTRM_D + UNSPEC_LSX_VSSRARN_S + UNSPEC_LSX_VSSRARN_U + UNSPEC_LSX_VSSRLN_U + UNSPEC_LSX_VSSRLN + UNSPEC_LSX_VSSRLRN + UNSPEC_LSX_VLDI + UNSPEC_LSX_VSHUF_B + UNSPEC_LSX_VLDX + UNSPEC_LSX_VSTX + UNSPEC_LSX_VEXTL_QU_DU + UNSPEC_LSX_VSETEQZ_V + UNSPEC_LSX_VADDWEV + UNSPEC_LSX_VADDWEV2 + UNSPEC_LSX_VADDWEV3 + UNSPEC_LSX_VADDWOD + UNSPEC_LSX_VADDWOD2 + UNSPEC_LSX_VADDWOD3 + UNSPEC_LSX_VSUBWEV + UNSPEC_LSX_VSUBWEV2 + UNSPEC_LSX_VSUBWOD + UNSPEC_LSX_VSUBWOD2 + UNSPEC_LSX_VMULWEV + UNSPEC_LSX_VMULWEV2 + UNSPEC_LSX_VMULWEV3 + UNSPEC_LSX_VMULWOD + UNSPEC_LSX_VMULWOD2 + UNSPEC_LSX_VMULWOD3 + UNSPEC_LSX_VHADDW_Q_D + UNSPEC_LSX_VHADDW_QU_DU + UNSPEC_LSX_VHSUBW_Q_D + UNSPEC_LSX_VHSUBW_QU_DU + UNSPEC_LSX_VMADDWEV + UNSPEC_LSX_VMADDWEV2 + UNSPEC_LSX_VMADDWEV3 + UNSPEC_LSX_VMADDWOD + UNSPEC_LSX_VMADDWOD2 + UNSPEC_LSX_VMADDWOD3 + UNSPEC_LSX_VROTR + UNSPEC_LSX_VADD_Q + UNSPEC_LSX_VSUB_Q + UNSPEC_LSX_VEXTH_Q_D + UNSPEC_LSX_VEXTH_QU_DU + UNSPEC_LSX_VMSKGEZ + UNSPEC_LSX_VMSKNZ + UNSPEC_LSX_VROTRI + UNSPEC_LSX_VEXTL_Q_D + UNSPEC_LSX_VSRLNI + UNSPEC_LSX_VSRLRNI + UNSPEC_LSX_VSSRLNI + UNSPEC_LSX_VSSRLNI2 + UNSPEC_LSX_VSSRLRNI + UNSPEC_LSX_VSSRLRNI2 + UNSPEC_LSX_VSRANI + UNSPEC_LSX_VSRARNI + UNSPEC_LSX_VSSRANI + UNSPEC_LSX_VSSRANI2 + UNSPEC_LSX_VSSRARNI + UNSPEC_LSX_VSSRARNI2 + 
UNSPEC_LSX_VPERMI +]) + +;; This attribute gives suffix for integers in VDMODE. +(define_mode_attr dlsxfmt + [(V2DI "q") + (V4SI "d") + (V8HI "w") + (V16QI "h")]) + +(define_mode_attr dlsxfmt_u + [(V2DI "qu") + (V4SI "du") + (V8HI "wu") + (V16QI "hu")]) + +(define_mode_attr d2lsxfmt + [(V4SI "q") + (V8HI "d") + (V16QI "w")]) + +(define_mode_attr d2lsxfmt_u + [(V4SI "qu") + (V8HI "du") + (V16QI "wu")]) + +;; The attribute gives two double modes for vector modes. +(define_mode_attr VD2MODE + [(V4SI "V2DI") + (V8HI "V2DI") + (V16QI "V4SI")]) + +;; All vector modes with 128 bits. +(define_mode_iterator LSX [V2DF V4SF V2DI V4SI V8HI V16QI]) + +;; Same as LSX. Used by vcond to iterate two modes. +(define_mode_iterator LSX_2 [V2DF V4SF V2DI V4SI V8HI V16QI]) + +;; Only used for splitting insert_d and copy_{u,s}.d. +(define_mode_iterator LSX_D [V2DI V2DF]) + +;; Only used for copy_{u,s}.w. +(define_mode_iterator LSX_W [V4SI V4SF]) + +;; Only integer modes. +(define_mode_iterator ILSX [V2DI V4SI V8HI V16QI]) + +;; As ILSX but excludes V16QI. +(define_mode_iterator ILSX_DWH [V2DI V4SI V8HI]) + +;; As LSX but excludes V16QI. +(define_mode_iterator LSX_DWH [V2DF V4SF V2DI V4SI V8HI]) + +;; As ILSX but excludes V2DI. +(define_mode_iterator ILSX_WHB [V4SI V8HI V16QI]) + +;; Only integer modes equal or larger than a word. +(define_mode_iterator ILSX_DW [V2DI V4SI]) + +;; Only integer modes smaller than a word. +(define_mode_iterator ILSX_HB [V8HI V16QI]) + +;;;; Only integer modes for fixed-point madd_q/maddr_q. +;;(define_mode_iterator ILSX_WH [V4SI V8HI]) + +;; Only floating-point modes. +(define_mode_iterator FLSX [V2DF V4SF]) + +;; Only used for immediate set shuffle elements instruction. +(define_mode_iterator LSX_WHB_W [V4SI V8HI V16QI V4SF]) + +;; The attribute gives the integer vector mode with same size.
+(define_mode_attr VIMODE + [(V2DF "V2DI") + (V4SF "V4SI") + (V2DI "V2DI") + (V4SI "V4SI") + (V8HI "V8HI") + (V16QI "V16QI")]) + +;; The attribute gives half modes for vector modes. +(define_mode_attr VHMODE + [(V8HI "V16QI") + (V4SI "V8HI") + (V2DI "V4SI")]) + +;; The attribute gives double modes for vector modes. +(define_mode_attr VDMODE + [(V2DI "V2DI") + (V4SI "V2DI") + (V8HI "V4SI") + (V16QI "V8HI")]) + +;; The attribute gives half modes with same number of elements for vector modes. +(define_mode_attr VTRUNCMODE + [(V8HI "V8QI") + (V4SI "V4HI") + (V2DI "V2SI")]) + +;; This attribute gives the mode of the result for "vpickve2gr_b, copy_u_b" etc. +(define_mode_attr VRES + [(V2DF "DF") + (V4SF "SF") + (V2DI "DI") + (V4SI "SI") + (V8HI "SI") + (V16QI "SI")]) + +;; Only used with LSX_D iterator. +(define_mode_attr lsx_d + [(V2DI "reg_or_0") + (V2DF "register")]) + +;; This attribute gives the integer vector mode with same size. +(define_mode_attr mode_i + [(V2DF "v2di") + (V4SF "v4si") + (V2DI "v2di") + (V4SI "v4si") + (V8HI "v8hi") + (V16QI "v16qi")]) + +;; This attribute gives suffix for LSX instructions. +(define_mode_attr lsxfmt + [(V2DF "d") + (V4SF "w") + (V2DI "d") + (V4SI "w") + (V8HI "h") + (V16QI "b")]) + +;; This attribute gives suffix for LSX instructions. +(define_mode_attr lsxfmt_u + [(V2DF "du") + (V4SF "wu") + (V2DI "du") + (V4SI "wu") + (V8HI "hu") + (V16QI "bu")]) + +;; This attribute gives suffix for integers in VHMODE. +(define_mode_attr hlsxfmt + [(V2DI "w") + (V4SI "h") + (V8HI "b")]) + +;; This attribute gives suffix for integers in VHMODE. +(define_mode_attr hlsxfmt_u + [(V2DI "wu") + (V4SI "hu") + (V8HI "bu")]) + +;; This attribute gives define_insn suffix for LSX instructions that need +;; distinction between integer and floating point. 
+(define_mode_attr lsxfmt_f + [(V2DF "d_f") + (V4SF "w_f") + (V2DI "d") + (V4SI "w") + (V8HI "h") + (V16QI "b")]) + +(define_mode_attr flsxfmt_f + [(V2DF "d_f") + (V4SF "s_f") + (V2DI "d") + (V4SI "w") + (V8HI "h") + (V16QI "b")]) + +(define_mode_attr flsxfmt + [(V2DF "d") + (V4SF "s") + (V2DI "d") + (V4SI "s")]) + +(define_mode_attr flsxfrint + [(V2DF "d") + (V4SF "s")]) + +(define_mode_attr ilsxfmt + [(V2DF "l") + (V4SF "w")]) + +(define_mode_attr ilsxfmt_u + [(V2DF "lu") + (V4SF "wu")]) + +;; This is used to form an immediate operand constraint using +;; "const__operand". +(define_mode_attr indeximm + [(V2DF "0_or_1") + (V4SF "0_to_3") + (V2DI "0_or_1") + (V4SI "0_to_3") + (V8HI "uimm3") + (V16QI "uimm4")]) + +;; This attribute represents bitmask needed for vec_merge using +;; "const__operand". +(define_mode_attr bitmask + [(V2DF "exp_2") + (V4SF "exp_4") + (V2DI "exp_2") + (V4SI "exp_4") + (V8HI "exp_8") + (V16QI "exp_16")]) + +;; This attribute is used to form an immediate operand constraint using +;; "const__operand". 
+(define_mode_attr bitimm + [(V16QI "uimm3") + (V8HI "uimm4") + (V4SI "uimm5") + (V2DI "uimm6")]) + + +(define_int_iterator FRINT_S [UNSPEC_LSX_VFRINTRP_S + UNSPEC_LSX_VFRINTRZ_S + UNSPEC_LSX_VFRINT + UNSPEC_LSX_VFRINTRM_S]) + +(define_int_iterator FRINT_D [UNSPEC_LSX_VFRINTRP_D + UNSPEC_LSX_VFRINTRZ_D + UNSPEC_LSX_VFRINT + UNSPEC_LSX_VFRINTRM_D]) + +(define_int_attr frint_pattern_s + [(UNSPEC_LSX_VFRINTRP_S "ceil") + (UNSPEC_LSX_VFRINTRZ_S "btrunc") + (UNSPEC_LSX_VFRINT "rint") + (UNSPEC_LSX_VFRINTRM_S "floor")]) + +(define_int_attr frint_pattern_d + [(UNSPEC_LSX_VFRINTRP_D "ceil") + (UNSPEC_LSX_VFRINTRZ_D "btrunc") + (UNSPEC_LSX_VFRINT "rint") + (UNSPEC_LSX_VFRINTRM_D "floor")]) + +(define_int_attr frint_suffix + [(UNSPEC_LSX_VFRINTRP_S "rp") + (UNSPEC_LSX_VFRINTRP_D "rp") + (UNSPEC_LSX_VFRINTRZ_S "rz") + (UNSPEC_LSX_VFRINTRZ_D "rz") + (UNSPEC_LSX_VFRINT "") + (UNSPEC_LSX_VFRINTRM_S "rm") + (UNSPEC_LSX_VFRINTRM_D "rm")]) + +(define_expand "vec_init" + [(match_operand:LSX 0 "register_operand") + (match_operand:LSX 1 "")] + "ISA_HAS_LSX" +{ + loongarch_expand_vector_init (operands[0], operands[1]); + DONE; +}) + +;; vpickev pattern with implicit type conversion. 
+(define_insn "vec_pack_trunc_" + [(set (match_operand: 0 "register_operand" "=f") + (vec_concat: + (truncate: + (match_operand:ILSX_DWH 1 "register_operand" "f")) + (truncate: + (match_operand:ILSX_DWH 2 "register_operand" "f"))))] + "ISA_HAS_LSX" + "vpickev.\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "")]) + +(define_expand "vec_unpacks_hi_v4sf" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (float_extend:V2DF + (vec_select:V2SF + (match_operand:V4SF 1 "register_operand" "f") + (match_dup 2))))] + "ISA_HAS_LSX" +{ + operands[2] = loongarch_lsx_vec_parallel_const_half (V4SFmode, + true/*high_p*/); +}) + +(define_expand "vec_unpacks_lo_v4sf" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (float_extend:V2DF + (vec_select:V2SF + (match_operand:V4SF 1 "register_operand" "f") + (match_dup 2))))] + "ISA_HAS_LSX" +{ + operands[2] = loongarch_lsx_vec_parallel_const_half (V4SFmode, + false/*high_p*/); +}) + +(define_expand "vec_unpacks_hi_" + [(match_operand: 0 "register_operand") + (match_operand:ILSX_WHB 1 "register_operand")] + "ISA_HAS_LSX" +{ + loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, true/*high_p*/); + DONE; +}) + +(define_expand "vec_unpacks_lo_" + [(match_operand: 0 "register_operand") + (match_operand:ILSX_WHB 1 "register_operand")] + "ISA_HAS_LSX" +{ + loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, false/*high_p*/); + DONE; +}) + +(define_expand "vec_unpacku_hi_" + [(match_operand: 0 "register_operand") + (match_operand:ILSX_WHB 1 "register_operand")] + "ISA_HAS_LSX" +{ + loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, true/*high_p*/); + DONE; +}) + +(define_expand "vec_unpacku_lo_" + [(match_operand: 0 "register_operand") + (match_operand:ILSX_WHB 1 "register_operand")] + "ISA_HAS_LSX" +{ + loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, false/*high_p*/); + DONE; +}) + +(define_expand "vec_extract" + [(match_operand: 0 "register_operand") + (match_operand:ILSX 1 
"register_operand") + (match_operand 2 "const__operand")] + "ISA_HAS_LSX" +{ + if (mode == QImode || mode == HImode) + { + rtx dest1 = gen_reg_rtx (SImode); + emit_insn (gen_lsx_vpickve2gr_ (dest1, operands[1], operands[2])); + emit_move_insn (operands[0], + gen_lowpart (mode, dest1)); + } + else + emit_insn (gen_lsx_vpickve2gr_ (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "vec_extract" + [(match_operand: 0 "register_operand") + (match_operand:FLSX 1 "register_operand") + (match_operand 2 "const__operand")] + "ISA_HAS_LSX" +{ + rtx temp; + HOST_WIDE_INT val = INTVAL (operands[2]); + + if (val == 0) + temp = operands[1]; + else + { + rtx n = GEN_INT (val * GET_MODE_SIZE (mode)); + temp = gen_reg_rtx (mode); + emit_insn (gen_lsx_vbsrl_ (temp, operands[1], n)); + } + emit_insn (gen_lsx_vec_extract_ (operands[0], temp)); + DONE; +}) + +(define_insn_and_split "lsx_vec_extract_" + [(set (match_operand: 0 "register_operand" "=f") + (vec_select: + (match_operand:FLSX 1 "register_operand" "f") + (parallel [(const_int 0)])))] + "ISA_HAS_LSX" + "#" + "&& reload_completed" + [(set (match_dup 0) (match_dup 1))] +{ + operands[1] = gen_rtx_REG (mode, REGNO (operands[1])); +} + [(set_attr "move_type" "fmove") + (set_attr "mode" "")]) + +(define_expand "vec_set" + [(match_operand:ILSX 0 "register_operand") + (match_operand: 1 "reg_or_0_operand") + (match_operand 2 "const__operand")] + "ISA_HAS_LSX" +{ + rtx index = GEN_INT (1 << INTVAL (operands[2])); + emit_insn (gen_lsx_vinsgr2vr_ (operands[0], operands[1], + operands[0], index)); + DONE; +}) + +(define_expand "vec_set" + [(match_operand:FLSX 0 "register_operand") + (match_operand: 1 "register_operand") + (match_operand 2 "const__operand")] + "ISA_HAS_LSX" +{ + rtx index = GEN_INT (1 << INTVAL (operands[2])); + emit_insn (gen_lsx_vextrins__scalar (operands[0], operands[1], + operands[0], index)); + DONE; +}) + +(define_expand "vec_cmp" + [(set (match_operand: 0 "register_operand") + (match_operator 1 "" 
+ [(match_operand:LSX 2 "register_operand") + (match_operand:LSX 3 "register_operand")]))] + "ISA_HAS_LSX" +{ + bool ok = loongarch_expand_vec_cmp (operands); + gcc_assert (ok); + DONE; +}) + +(define_expand "vec_cmpu" + [(set (match_operand: 0 "register_operand") + (match_operator 1 "" + [(match_operand:ILSX 2 "register_operand") + (match_operand:ILSX 3 "register_operand")]))] + "ISA_HAS_LSX" +{ + bool ok = loongarch_expand_vec_cmp (operands); + gcc_assert (ok); + DONE; +}) + +(define_expand "vcondu" + [(match_operand:LSX 0 "register_operand") + (match_operand:LSX 1 "reg_or_m1_operand") + (match_operand:LSX 2 "reg_or_0_operand") + (match_operator 3 "" + [(match_operand:ILSX 4 "register_operand") + (match_operand:ILSX 5 "register_operand")])] + "ISA_HAS_LSX + && (GET_MODE_NUNITS (mode) == GET_MODE_NUNITS (mode))" +{ + loongarch_expand_vec_cond_expr (mode, mode, operands); + DONE; +}) + +(define_expand "vcond" + [(match_operand:LSX 0 "register_operand") + (match_operand:LSX 1 "reg_or_m1_operand") + (match_operand:LSX 2 "reg_or_0_operand") + (match_operator 3 "" + [(match_operand:LSX_2 4 "register_operand") + (match_operand:LSX_2 5 "register_operand")])] + "ISA_HAS_LSX + && (GET_MODE_NUNITS (mode) == GET_MODE_NUNITS (mode))" +{ + loongarch_expand_vec_cond_expr (mode, mode, operands); + DONE; +}) + +(define_expand "vcond_mask_" + [(match_operand:ILSX 0 "register_operand") + (match_operand:ILSX 1 "reg_or_m1_operand") + (match_operand:ILSX 2 "reg_or_0_operand") + (match_operand:ILSX 3 "register_operand")] + "ISA_HAS_LSX" +{ + loongarch_expand_vec_cond_mask_expr (mode, + mode, operands); + DONE; +}) + +(define_insn "lsx_vinsgr2vr_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (vec_merge:ILSX + (vec_duplicate:ILSX + (match_operand: 1 "reg_or_0_operand" "rJ")) + (match_operand:ILSX 2 "register_operand" "0") + (match_operand 3 "const__operand" "")))] + "ISA_HAS_LSX" +{ + if (!TARGET_64BIT && (mode == V2DImode || mode == V2DFmode)) + return "#"; + else + return 
"vinsgr2vr.\t%w0,%z1,%y3"; +} + [(set_attr "type" "simd_insert") + (set_attr "mode" "")]) + +(define_split + [(set (match_operand:LSX_D 0 "register_operand") + (vec_merge:LSX_D + (vec_duplicate:LSX_D + (match_operand: 1 "_operand")) + (match_operand:LSX_D 2 "register_operand") + (match_operand 3 "const__operand")))] + "reload_completed && ISA_HAS_LSX && !TARGET_64BIT" + [(const_int 0)] +{ + loongarch_split_lsx_insert_d (operands[0], operands[2], operands[3], operands[1]); + DONE; +}) + +(define_insn "lsx_vextrins__internal" + [(set (match_operand:LSX 0 "register_operand" "=f") + (vec_merge:LSX + (vec_duplicate:LSX + (vec_select: + (match_operand:LSX 1 "register_operand" "f") + (parallel [(const_int 0)]))) + (match_operand:LSX 2 "register_operand" "0") + (match_operand 3 "const__operand" "")))] + "ISA_HAS_LSX" + "vextrins.\t%w0,%w1,%y3<<4" + [(set_attr "type" "simd_insert") + (set_attr "mode" "")]) + +;; Operand 3 is a scalar. +(define_insn "lsx_vextrins__scalar" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (vec_merge:FLSX + (vec_duplicate:FLSX + (match_operand: 1 "register_operand" "f")) + (match_operand:FLSX 2 "register_operand" "0") + (match_operand 3 "const__operand" "")))] + "ISA_HAS_LSX" + "vextrins.\t%w0,%w1,%y3<<4" + [(set_attr "type" "simd_insert") + (set_attr "mode" "")]) + +(define_insn "lsx_vpickve2gr_" + [(set (match_operand: 0 "register_operand" "=r") + (any_extend: + (vec_select: + (match_operand:ILSX_HB 1 "register_operand" "f") + (parallel [(match_operand 2 "const__operand" "")]))))] + "ISA_HAS_LSX" + "vpickve2gr.\t%0,%w1,%2" + [(set_attr "type" "simd_copy") + (set_attr "mode" "")]) + +(define_insn "lsx_vpickve2gr_" + [(set (match_operand: 0 "register_operand" "=r") + (any_extend: + (vec_select: + (match_operand:LSX_W 1 "register_operand" "f") + (parallel [(match_operand 2 "const__operand" "")]))))] + "ISA_HAS_LSX" + "vpickve2gr.\t%0,%w1,%2" + [(set_attr "type" "simd_copy") + (set_attr "mode" "")]) + +(define_insn_and_split 
"lsx_vpickve2gr_du" + [(set (match_operand:DI 0 "register_operand" "=r") + (vec_select:DI + (match_operand:V2DI 1 "register_operand" "f") + (parallel [(match_operand 2 "const_0_or_1_operand" "")])))] + "ISA_HAS_LSX" +{ + if (TARGET_64BIT) + return "vpickve2gr.du\t%0,%w1,%2"; + else + return "#"; +} + "reload_completed && ISA_HAS_LSX && !TARGET_64BIT" + [(const_int 0)] +{ + loongarch_split_lsx_copy_d (operands[0], operands[1], operands[2], + gen_lsx_vpickve2gr_wu); + DONE; +} + [(set_attr "type" "simd_copy") + (set_attr "mode" "V2DI")]) + +(define_insn_and_split "lsx_vpickve2gr_" + [(set (match_operand: 0 "register_operand" "=r") + (vec_select: + (match_operand:LSX_D 1 "register_operand" "f") + (parallel [(match_operand 2 "const__operand" "")])))] + "ISA_HAS_LSX" +{ + if (TARGET_64BIT) + return "vpickve2gr.\t%0,%w1,%2"; + else + return "#"; +} + "reload_completed && ISA_HAS_LSX && !TARGET_64BIT" + [(const_int 0)] +{ + loongarch_split_lsx_copy_d (operands[0], operands[1], operands[2], + gen_lsx_vpickve2gr_w); + DONE; +} + [(set_attr "type" "simd_copy") + (set_attr "mode" "")]) + + +(define_expand "abs2" + [(match_operand:ILSX 0 "register_operand" "=f") + (abs:ILSX (match_operand:ILSX 1 "register_operand" "f"))] + "ISA_HAS_LSX" +{ + if (ISA_HAS_LSX) + { + emit_insn (gen_vabs2 (operands[0], operands[1])); + DONE; + } + else + { + rtx reg = gen_reg_rtx (mode); + emit_move_insn (reg, CONST0_RTX (mode)); + emit_insn (gen_lsx_vadda_ (operands[0], operands[1], reg)); + DONE; + } +}) + +(define_expand "neg2" + [(set (match_operand:ILSX 0 "register_operand") + (neg:ILSX (match_operand:ILSX 1 "register_operand")))] + "ISA_HAS_LSX" +{ + emit_insn (gen_vneg2 (operands[0], operands[1])); + DONE; +}) + +(define_expand "neg2" + [(set (match_operand:FLSX 0 "register_operand") + (neg:FLSX (match_operand:FLSX 1 "register_operand")))] + "ISA_HAS_LSX" +{ + rtx reg = gen_reg_rtx (mode); + emit_move_insn (reg, CONST0_RTX (mode)); + emit_insn (gen_sub3 (operands[0], reg, operands[1])); + 
DONE; +}) + +(define_expand "lsx_vrepli" + [(match_operand:ILSX 0 "register_operand") + (match_operand 1 "const_imm10_operand")] + "ISA_HAS_LSX" +{ + if (mode == V16QImode) + operands[1] = GEN_INT (trunc_int_for_mode (INTVAL (operands[1]), + mode)); + emit_move_insn (operands[0], + loongarch_gen_const_int_vector (mode, INTVAL (operands[1]))); + DONE; +}) + +(define_expand "vec_perm" + [(match_operand:LSX 0 "register_operand") + (match_operand:LSX 1 "register_operand") + (match_operand:LSX 2 "register_operand") + (match_operand:LSX 3 "register_operand")] + "ISA_HAS_LSX" +{ + loongarch_expand_vec_perm (operands[0], operands[1], + operands[2], operands[3]); + DONE; +}) + +(define_insn "lsx_vshuf_" + [(set (match_operand:LSX_DWH 0 "register_operand" "=f") + (unspec:LSX_DWH [(match_operand:LSX_DWH 1 "register_operand" "0") + (match_operand:LSX_DWH 2 "register_operand" "f") + (match_operand:LSX_DWH 3 "register_operand" "f")] + UNSPEC_LSX_VSHUF))] + "ISA_HAS_LSX" + "vshuf.\t%w0,%w2,%w3" + [(set_attr "type" "simd_sld") + (set_attr "mode" "")]) + +(define_expand "mov" + [(set (match_operand:LSX 0) + (match_operand:LSX 1))] + "ISA_HAS_LSX" +{ + if (loongarch_legitimize_move (mode, operands[0], operands[1])) + DONE; +}) + +(define_expand "movmisalign" + [(set (match_operand:LSX 0) + (match_operand:LSX 1))] + "ISA_HAS_LSX" +{ + if (loongarch_legitimize_move (mode, operands[0], operands[1])) + DONE; +}) + +(define_insn "mov_lsx" + [(set (match_operand:LSX 0 "nonimmediate_operand" "=f,f,R,*r,*f") + (match_operand:LSX 1 "move_operand" "fYGYI,R,f,*f,*r"))] + "ISA_HAS_LSX" +{ return loongarch_output_move (operands[0], operands[1]); } + [(set_attr "type" "simd_move,simd_load,simd_store,simd_copy,simd_insert") + (set_attr "mode" "")]) + +(define_split + [(set (match_operand:LSX 0 "nonimmediate_operand") + (match_operand:LSX 1 "move_operand"))] + "reload_completed && ISA_HAS_LSX + && loongarch_split_move_insn_p (operands[0], operands[1])" + [(const_int 0)] +{ + 
loongarch_split_move_insn (operands[0], operands[1], curr_insn); + DONE; +}) + +;; Offset load +(define_expand "lsx_ld_" + [(match_operand:LSX 0 "register_operand") + (match_operand 1 "pmode_register_operand") + (match_operand 2 "aq10_operand")] + "ISA_HAS_LSX" +{ + rtx addr = plus_constant (GET_MODE (operands[1]), operands[1], + INTVAL (operands[2])); + loongarch_emit_move (operands[0], gen_rtx_MEM (mode, addr)); + DONE; +}) + +;; Offset store +(define_expand "lsx_st_" + [(match_operand:LSX 0 "register_operand") + (match_operand 1 "pmode_register_operand") + (match_operand 2 "aq10_operand")] + "ISA_HAS_LSX" +{ + rtx addr = plus_constant (GET_MODE (operands[1]), operands[1], + INTVAL (operands[2])); + loongarch_emit_move (gen_rtx_MEM (mode, addr), operands[0]); + DONE; +}) + +;; Integer operations +(define_insn "add3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f,f") + (plus:ILSX + (match_operand:ILSX 1 "register_operand" "f,f,f") + (match_operand:ILSX 2 "reg_or_vector_same_ximm5_operand" "f,Unv5,Uuv5")))] + "ISA_HAS_LSX" +{ + switch (which_alternative) + { + case 0: + return "vadd.\t%w0,%w1,%w2"; + case 1: + { + HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0)); + + operands[2] = GEN_INT (-val); + return "vsubi.\t%w0,%w1,%d2"; + } + case 2: + return "vaddi.\t%w0,%w1,%E2"; + default: + gcc_unreachable (); + } +} + [(set_attr "alu_type" "simd_add") + (set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "sub3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (minus:ILSX + (match_operand:ILSX 1 "register_operand" "f,f") + (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))] + "ISA_HAS_LSX" + "@ + vsub.\t%w0,%w1,%w2 + vsubi.\t%w0,%w1,%E2" + [(set_attr "alu_type" "simd_add") + (set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "mul3" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (mult:ILSX (match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 
"register_operand" "f")))] + "ISA_HAS_LSX" + "vmul.\t%w0,%w1,%w2" + [(set_attr "type" "simd_mul") + (set_attr "mode" "")]) + +(define_insn "lsx_vmadd_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (plus:ILSX (mult:ILSX (match_operand:ILSX 2 "register_operand" "f") + (match_operand:ILSX 3 "register_operand" "f")) + (match_operand:ILSX 1 "register_operand" "0")))] + "ISA_HAS_LSX" + "vmadd.\t%w0,%w2,%w3" + [(set_attr "type" "simd_mul") + (set_attr "mode" "")]) + +(define_insn "lsx_vmsub_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (minus:ILSX (match_operand:ILSX 1 "register_operand" "0") + (mult:ILSX (match_operand:ILSX 2 "register_operand" "f") + (match_operand:ILSX 3 "register_operand" "f"))))] + "ISA_HAS_LSX" + "vmsub.\t%w0,%w2,%w3" + [(set_attr "type" "simd_mul") + (set_attr "mode" "")]) + +(define_insn "div3" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (div:ILSX (match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" +{ return loongarch_lsx_output_division ("vdiv.\t%w0,%w1,%w2", operands); } + [(set_attr "type" "simd_div") + (set_attr "mode" "")]) + +(define_insn "udiv3" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (udiv:ILSX (match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" +{ return loongarch_lsx_output_division ("vdiv.\t%w0,%w1,%w2", operands); } + [(set_attr "type" "simd_div") + (set_attr "mode" "")]) + +(define_insn "mod3" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (mod:ILSX (match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" +{ return loongarch_lsx_output_division ("vmod.\t%w0,%w1,%w2", operands); } + [(set_attr "type" "simd_div") + (set_attr "mode" "")]) + +(define_insn "umod3" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (umod:ILSX (match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 
"register_operand" "f")))] + "ISA_HAS_LSX" +{ return loongarch_lsx_output_division ("vmod.\t%w0,%w1,%w2", operands); } + [(set_attr "type" "simd_div") + (set_attr "mode" "")]) + +(define_insn "xor3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f,f") + (xor:ILSX + (match_operand:ILSX 1 "register_operand" "f,f,f") + (match_operand:ILSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))] + "ISA_HAS_LSX" + "@ + vxor.v\t%w0,%w1,%w2 + vbitrevi.%v0\t%w0,%w1,%V2 + vxori.b\t%w0,%w1,%B2" + [(set_attr "type" "simd_logic,simd_bit,simd_logic") + (set_attr "mode" "")]) + +(define_insn "ior3" + [(set (match_operand:LSX 0 "register_operand" "=f,f,f") + (ior:LSX + (match_operand:LSX 1 "register_operand" "f,f,f") + (match_operand:LSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))] + "ISA_HAS_LSX" + "@ + vor.v\t%w0,%w1,%w2 + vbitseti.%v0\t%w0,%w1,%V2 + vori.b\t%w0,%w1,%B2" + [(set_attr "type" "simd_logic,simd_bit,simd_logic") + (set_attr "mode" "")]) + +(define_insn "and3" + [(set (match_operand:LSX 0 "register_operand" "=f,f,f") + (and:LSX + (match_operand:LSX 1 "register_operand" "f,f,f") + (match_operand:LSX 2 "reg_or_vector_same_val_operand" "f,YZ,Urv8")))] + "ISA_HAS_LSX" +{ + switch (which_alternative) + { + case 0: + return "vand.v\t%w0,%w1,%w2"; + case 1: + { + rtx elt0 = CONST_VECTOR_ELT (operands[2], 0); + unsigned HOST_WIDE_INT val = ~UINTVAL (elt0); + operands[2] = loongarch_gen_const_int_vector (mode, val & (-val)); + return "vbitclri.%v0\t%w0,%w1,%V2"; + } + case 2: + return "vandi.b\t%w0,%w1,%B2"; + default: + gcc_unreachable (); + } +} + [(set_attr "type" "simd_logic,simd_bit,simd_logic") + (set_attr "mode" "")]) + +(define_insn "one_cmpl2" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (not:ILSX (match_operand:ILSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vnor.v\t%w0,%w1,%w1" + [(set_attr "type" "simd_logic") + (set_attr "mode" "TI")]) + +(define_insn "vlshr3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (lshiftrt:ILSX + 
(match_operand:ILSX 1 "register_operand" "f,f") + (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] + "ISA_HAS_LSX" + "@ + vsrl.\t%w0,%w1,%w2 + vsrli.\t%w0,%w1,%E2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "vashr3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (ashiftrt:ILSX + (match_operand:ILSX 1 "register_operand" "f,f") + (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] + "ISA_HAS_LSX" + "@ + vsra.\t%w0,%w1,%w2 + vsrai.\t%w0,%w1,%E2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "vashl3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (ashift:ILSX + (match_operand:ILSX 1 "register_operand" "f,f") + (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] + "ISA_HAS_LSX" + "@ + vsll.\t%w0,%w1,%w2 + vslli.\t%w0,%w1,%E2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +;; Floating-point operations +(define_insn "add3" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (plus:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vfadd.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fadd") + (set_attr "mode" "")]) + +(define_insn "sub3" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (minus:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vfsub.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fadd") + (set_attr "mode" "")]) + +(define_insn "mul3" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (mult:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vfmul.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fmul") + (set_attr "mode" "")]) + +(define_insn "div3" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (div:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" 
"f")))] + "ISA_HAS_LSX" + "vfdiv.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fdiv") + (set_attr "mode" "")]) + +(define_insn "fma4" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (fma:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f") + (match_operand:FLSX 3 "register_operand" "0")))] + "ISA_HAS_LSX" + "vfmadd.\t%w0,%w1,%w2,%w0" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "")]) + +(define_insn "fnma4" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (fma:FLSX (neg:FLSX (match_operand:FLSX 1 "register_operand" "f")) + (match_operand:FLSX 2 "register_operand" "f") + (match_operand:FLSX 3 "register_operand" "0")))] + "ISA_HAS_LSX" + "vfnmsub.\t%w0,%w1,%w2,%w0" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "")]) + +(define_insn "sqrt2" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (sqrt:FLSX (match_operand:FLSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vfsqrt.\t%w0,%w1" + [(set_attr "type" "simd_fdiv") + (set_attr "mode" "")]) + +;; Built-in functions +(define_insn "lsx_vadda_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (plus:ILSX (abs:ILSX (match_operand:ILSX 1 "register_operand" "f")) + (abs:ILSX (match_operand:ILSX 2 "register_operand" "f"))))] + "ISA_HAS_LSX" + "vadda.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "ssadd3" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (ss_plus:ILSX (match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vsadd.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "usadd3" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (us_plus:ILSX (match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vsadd.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vabsd_s_" + 
[(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_ASUB_S))] + "ISA_HAS_LSX" + "vabsd.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vabsd_u_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VABSD_U))] + "ISA_HAS_LSX" + "vabsd.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vavg_s_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VAVG_S))] + "ISA_HAS_LSX" + "vavg.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vavg_u_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VAVG_U))] + "ISA_HAS_LSX" + "vavg.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vavgr_s_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VAVGR_S))] + "ISA_HAS_LSX" + "vavgr.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vavgr_u_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VAVGR_U))] + "ISA_HAS_LSX" + "vavgr.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vbitclr_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + 
(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VBITCLR))] + "ISA_HAS_LSX" + "vbitclr.\t%w0,%w1,%w2" + [(set_attr "type" "simd_bit") + (set_attr "mode" "")]) + +(define_insn "lsx_vbitclri_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VBITCLRI))] + "ISA_HAS_LSX" + "vbitclri.\t%w0,%w1,%2" + [(set_attr "type" "simd_bit") + (set_attr "mode" "")]) + +(define_insn "lsx_vbitrev_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VBITREV))] + "ISA_HAS_LSX" + "vbitrev.\t%w0,%w1,%w2" + [(set_attr "type" "simd_bit") + (set_attr "mode" "")]) + +(define_insn "lsx_vbitrevi_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand 2 "const_lsx_branch_operand" "")] + UNSPEC_LSX_VBITREVI))] + "ISA_HAS_LSX" + "vbitrevi.\t%w0,%w1,%2" + [(set_attr "type" "simd_bit") + (set_attr "mode" "")]) + +(define_insn "lsx_vbitsel_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (ior:ILSX (and:ILSX (not:ILSX + (match_operand:ILSX 3 "register_operand" "f")) + (match_operand:ILSX 1 "register_operand" "f")) + (and:ILSX (match_dup 3) + (match_operand:ILSX 2 "register_operand" "f"))))] + "ISA_HAS_LSX" + "vbitsel.v\t%w0,%w1,%w2,%w3" + [(set_attr "type" "simd_bitmov") + (set_attr "mode" "")]) + +(define_insn "lsx_vbitseli_b" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (ior:V16QI (and:V16QI (not:V16QI + (match_operand:V16QI 1 "register_operand" "0")) + (match_operand:V16QI 2 "register_operand" "f")) + (and:V16QI (match_dup 1) + (match_operand:V16QI 3 "const_vector_same_val_operand" "Urv8"))))] + "ISA_HAS_LSX" + "vbitseli.b\t%w0,%w2,%B3" + [(set_attr "type" "simd_bitmov") + 
(set_attr "mode" "V16QI")]) + +(define_insn "lsx_vbitset_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VBITSET))] + "ISA_HAS_LSX" + "vbitset.\t%w0,%w1,%w2" + [(set_attr "type" "simd_bit") + (set_attr "mode" "")]) + +(define_insn "lsx_vbitseti_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VBITSETI))] + "ISA_HAS_LSX" + "vbitseti.\t%w0,%w1,%2" + [(set_attr "type" "simd_bit") + (set_attr "mode" "")]) + +(define_code_iterator ICC [eq le leu lt ltu]) + +(define_code_attr icc + [(eq "eq") + (le "le") + (leu "le") + (lt "lt") + (ltu "lt")]) + +(define_code_attr icci + [(eq "eqi") + (le "lei") + (leu "lei") + (lt "lti") + (ltu "lti")]) + +(define_code_attr cmpi + [(eq "s") + (le "s") + (leu "u") + (lt "s") + (ltu "u")]) + +(define_code_attr cmpi_1 + [(eq "") + (le "") + (leu "u") + (lt "") + (ltu "u")]) + +(define_insn "lsx_vs_" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (ICC:ILSX + (match_operand:ILSX 1 "register_operand" "f,f") + (match_operand:ILSX 2 "reg_or_vector_same_imm5_operand" "f,Uv5")))] + "ISA_HAS_LSX" + "@ + vs.\t%w0,%w1,%w2 + vs.\t%w0,%w1,%E2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vfclass_" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:FLSX 1 "register_operand" "f")] + UNSPEC_LSX_VFCLASS))] + "ISA_HAS_LSX" + "vfclass.\t%w0,%w1" + [(set_attr "type" "simd_fclass") + (set_attr "mode" "")]) + +(define_insn "lsx_vfcmp_caf_" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")] + UNSPEC_LSX_VFCMP_CAF))] + "ISA_HAS_LSX" + "vfcmp.caf.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fcmp") + (set_attr "mode" "")]) + 
+(define_insn "lsx_vfcmp_cune_" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")] + UNSPEC_LSX_VFCMP_CUNE))] + "ISA_HAS_LSX" + "vfcmp.cune.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fcmp") + (set_attr "mode" "")]) + +(define_code_iterator vfcond [unordered ordered eq ne le lt uneq unle unlt]) + +(define_code_attr fcc + [(unordered "cun") + (ordered "cor") + (eq "ceq") + (ne "cne") + (uneq "cueq") + (unle "cule") + (unlt "cult") + (le "cle") + (lt "clt")]) + +(define_int_iterator FSC_UNS [UNSPEC_LSX_VFCMP_SAF UNSPEC_LSX_VFCMP_SUN UNSPEC_LSX_VFCMP_SOR + UNSPEC_LSX_VFCMP_SEQ UNSPEC_LSX_VFCMP_SNE UNSPEC_LSX_VFCMP_SUEQ + UNSPEC_LSX_VFCMP_SUNE UNSPEC_LSX_VFCMP_SULE UNSPEC_LSX_VFCMP_SULT + UNSPEC_LSX_VFCMP_SLE UNSPEC_LSX_VFCMP_SLT]) + +(define_int_attr fsc + [(UNSPEC_LSX_VFCMP_SAF "saf") + (UNSPEC_LSX_VFCMP_SUN "sun") + (UNSPEC_LSX_VFCMP_SOR "sor") + (UNSPEC_LSX_VFCMP_SEQ "seq") + (UNSPEC_LSX_VFCMP_SNE "sne") + (UNSPEC_LSX_VFCMP_SUEQ "sueq") + (UNSPEC_LSX_VFCMP_SUNE "sune") + (UNSPEC_LSX_VFCMP_SULE "sule") + (UNSPEC_LSX_VFCMP_SULT "sult") + (UNSPEC_LSX_VFCMP_SLE "sle") + (UNSPEC_LSX_VFCMP_SLT "slt")]) + +(define_insn "lsx_vfcmp__" + [(set (match_operand: 0 "register_operand" "=f") + (vfcond: (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vfcmp..\t%w0,%w1,%w2" + [(set_attr "type" "simd_fcmp") + (set_attr "mode" "")]) + +(define_insn "lsx_vfcmp__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")] + FSC_UNS))] + "ISA_HAS_LSX" + "vfcmp..\t%w0,%w1,%w2" + [(set_attr "type" "simd_fcmp") + (set_attr "mode" "")]) + +(define_mode_attr fint + [(V4SF "v4si") + (V2DF "v2di")]) + +(define_mode_attr FINTCNV + [(V4SF "I2S") + (V2DF "I2D")]) + +(define_mode_attr FINTCNV_2 + [(V4SF "S2I") + (V2DF "D2I")]) + 
+(define_insn "float2" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (float:FLSX (match_operand: 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vffint..\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "cnv_mode" "") + (set_attr "mode" "")]) + +(define_insn "floatuns2" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (unsigned_float:FLSX + (match_operand: 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vffint..\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "cnv_mode" "") + (set_attr "mode" "")]) + +(define_mode_attr FFQ + [(V4SF "V8HI") + (V2DF "V4SI")]) + +(define_insn "lsx_vreplgr2vr_" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (vec_duplicate:ILSX + (match_operand: 1 "reg_or_0_operand" "r,J")))] + "ISA_HAS_LSX" +{ + if (which_alternative == 1) + return "ldi.\t%w0,0"; + + if (!TARGET_64BIT && (mode == V2DImode || mode == V2DFmode)) + return "#"; + else + return "vreplgr2vr.\t%w0,%z1"; +} + [(set_attr "type" "simd_fill") + (set_attr "mode" "")]) + +(define_split + [(set (match_operand:LSX_D 0 "register_operand") + (vec_duplicate:LSX_D + (match_operand: 1 "register_operand")))] + "reload_completed && ISA_HAS_LSX && !TARGET_64BIT" + [(const_int 0)] +{ + loongarch_split_lsx_fill_d (operands[0], operands[1]); + DONE; +}) + +(define_insn "logb2" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")] + UNSPEC_LSX_VFLOGB))] + "ISA_HAS_LSX" + "vflogb.\t%w0,%w1" + [(set_attr "type" "simd_flog2") + (set_attr "mode" "")]) + +(define_insn "smax3" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (smax:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vfmax.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fminmax") + (set_attr "mode" "")]) + +(define_insn "lsx_vfmaxa_" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (if_then_else:FLSX + (gt (abs:FLSX (match_operand:FLSX 1 
"register_operand" "f")) + (abs:FLSX (match_operand:FLSX 2 "register_operand" "f"))) + (match_dup 1) + (match_dup 2)))] + "ISA_HAS_LSX" + "vfmaxa.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fminmax") + (set_attr "mode" "")]) + +(define_insn "smin3" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (smin:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vfmin.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fminmax") + (set_attr "mode" "")]) + +(define_insn "lsx_vfmina_" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (if_then_else:FLSX + (lt (abs:FLSX (match_operand:FLSX 1 "register_operand" "f")) + (abs:FLSX (match_operand:FLSX 2 "register_operand" "f"))) + (match_dup 1) + (match_dup 2)))] + "ISA_HAS_LSX" + "vfmina.\t%w0,%w1,%w2" + [(set_attr "type" "simd_fminmax") + (set_attr "mode" "")]) + +(define_insn "lsx_vfrecip_" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")] + UNSPEC_LSX_VFRECIP))] + "ISA_HAS_LSX" + "vfrecip.\t%w0,%w1" + [(set_attr "type" "simd_fdiv") + (set_attr "mode" "")]) + +(define_insn "lsx_vfrint_" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")] + UNSPEC_LSX_VFRINT))] + "ISA_HAS_LSX" + "vfrint.\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "")]) + +(define_insn "lsx_vfrsqrt_" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")] + UNSPEC_LSX_VFRSQRT))] + "ISA_HAS_LSX" + "vfrsqrt.\t%w0,%w1" + [(set_attr "type" "simd_fdiv") + (set_attr "mode" "")]) + +(define_insn "lsx_vftint_s__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:FLSX 1 "register_operand" "f")] + UNSPEC_LSX_VFTINT_S))] + "ISA_HAS_LSX" + "vftint..\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "cnv_mode" "") + (set_attr "mode" "")]) + +(define_insn 
"lsx_vftint_u__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:FLSX 1 "register_operand" "f")] + UNSPEC_LSX_VFTINT_U))] + "ISA_HAS_LSX" + "vftint..\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "cnv_mode" "") + (set_attr "mode" "")]) + +(define_insn "fix_trunc2" + [(set (match_operand: 0 "register_operand" "=f") + (fix: (match_operand:FLSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vftintrz..\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "cnv_mode" "") + (set_attr "mode" "")]) + +(define_insn "fixuns_trunc2" + [(set (match_operand: 0 "register_operand" "=f") + (unsigned_fix: (match_operand:FLSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vftintrz..\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "cnv_mode" "") + (set_attr "mode" "")]) + +(define_insn "lsx_vhw_h_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (addsub:V8HI + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 1 "register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)]))) + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14)])))))] + "ISA_HAS_LSX" + "vhw.h.b\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vhw_w_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (addsub:V4SI + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 1 "register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)]))) + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)])))))] + "ISA_HAS_LSX" + "vhw.w.h\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" 
"V4SI")]) + +(define_insn "lsx_vhw_d_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (addsub:V2DI + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 1 "register_operand" "f") + (parallel [(const_int 1) (const_int 3)]))) + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2)])))))] + "ISA_HAS_LSX" + "vhw.d.w\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vpackev_b" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (vec_select:V16QI + (vec_concat:V32QI + (match_operand:V16QI 1 "register_operand" "f") + (match_operand:V16QI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 16) + (const_int 2) (const_int 18) + (const_int 4) (const_int 20) + (const_int 6) (const_int 22) + (const_int 8) (const_int 24) + (const_int 10) (const_int 26) + (const_int 12) (const_int 28) + (const_int 14) (const_int 30)])))] + "ISA_HAS_LSX" + "vpackev.b\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vpackev_h" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (vec_select:V8HI + (vec_concat:V16HI + (match_operand:V8HI 1 "register_operand" "f") + (match_operand:V8HI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 8) + (const_int 2) (const_int 10) + (const_int 4) (const_int 12) + (const_int 6) (const_int 14)])))] + "ISA_HAS_LSX" + "vpackev.h\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vpackev_w" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (vec_select:V4SI + (vec_concat:V8SI + (match_operand:V4SI 1 "register_operand" "f") + (match_operand:V4SI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 4) + (const_int 2) (const_int 6)])))] + "ISA_HAS_LSX" + "vpackev.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vpackev_w_f" + 
[(set (match_operand:V4SF 0 "register_operand" "=f") + (vec_select:V4SF + (vec_concat:V8SF + (match_operand:V4SF 1 "register_operand" "f") + (match_operand:V4SF 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 4) + (const_int 2) (const_int 6)])))] + "ISA_HAS_LSX" + "vpackev.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vilvh_b" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (vec_select:V16QI + (vec_concat:V32QI + (match_operand:V16QI 1 "register_operand" "f") + (match_operand:V16QI 2 "register_operand" "f")) + (parallel [(const_int 8) (const_int 24) + (const_int 9) (const_int 25) + (const_int 10) (const_int 26) + (const_int 11) (const_int 27) + (const_int 12) (const_int 28) + (const_int 13) (const_int 29) + (const_int 14) (const_int 30) + (const_int 15) (const_int 31)])))] + "ISA_HAS_LSX" + "vilvh.b\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vilvh_h" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (vec_select:V8HI + (vec_concat:V16HI + (match_operand:V8HI 1 "register_operand" "f") + (match_operand:V8HI 2 "register_operand" "f")) + (parallel [(const_int 4) (const_int 12) + (const_int 5) (const_int 13) + (const_int 6) (const_int 14) + (const_int 7) (const_int 15)])))] + "ISA_HAS_LSX" + "vilvh.h\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vilvh_w" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (vec_select:V4SI + (vec_concat:V8SI + (match_operand:V4SI 1 "register_operand" "f") + (match_operand:V4SI 2 "register_operand" "f")) + (parallel [(const_int 2) (const_int 6) + (const_int 3) (const_int 7)])))] + "ISA_HAS_LSX" + "vilvh.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vilvh_w_f" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (vec_select:V4SF + (vec_concat:V8SF + (match_operand:V4SF 1 
"register_operand" "f") + (match_operand:V4SF 2 "register_operand" "f")) + (parallel [(const_int 2) (const_int 6) + (const_int 3) (const_int 7)])))] + "ISA_HAS_LSX" + "vilvh.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vilvh_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (vec_select:V2DI + (vec_concat:V4DI + (match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 3)])))] + "ISA_HAS_LSX" + "vilvh.d\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vilvh_d_f" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (vec_select:V2DF + (vec_concat:V4DF + (match_operand:V2DF 1 "register_operand" "f") + (match_operand:V2DF 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 3)])))] + "ISA_HAS_LSX" + "vilvh.d\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vpackod_b" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (vec_select:V16QI + (vec_concat:V32QI + (match_operand:V16QI 1 "register_operand" "f") + (match_operand:V16QI 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 17) + (const_int 3) (const_int 19) + (const_int 5) (const_int 21) + (const_int 7) (const_int 23) + (const_int 9) (const_int 25) + (const_int 11) (const_int 27) + (const_int 13) (const_int 29) + (const_int 15) (const_int 31)])))] + "ISA_HAS_LSX" + "vpackod.b\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vpackod_h" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (vec_select:V8HI + (vec_concat:V16HI + (match_operand:V8HI 1 "register_operand" "f") + (match_operand:V8HI 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 9) + (const_int 3) (const_int 11) + (const_int 5) (const_int 13) + (const_int 7) (const_int 15)])))] + "ISA_HAS_LSX" + 
"vpackod.h\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vpackod_w" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (vec_select:V4SI + (vec_concat:V8SI + (match_operand:V4SI 1 "register_operand" "f") + (match_operand:V4SI 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 5) + (const_int 3) (const_int 7)])))] + "ISA_HAS_LSX" + "vpackod.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vpackod_w_f" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (vec_select:V4SF + (vec_concat:V8SF + (match_operand:V4SF 1 "register_operand" "f") + (match_operand:V4SF 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 5) + (const_int 3) (const_int 7)])))] + "ISA_HAS_LSX" + "vpackod.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vilvl_b" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (vec_select:V16QI + (vec_concat:V32QI + (match_operand:V16QI 1 "register_operand" "f") + (match_operand:V16QI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 16) + (const_int 1) (const_int 17) + (const_int 2) (const_int 18) + (const_int 3) (const_int 19) + (const_int 4) (const_int 20) + (const_int 5) (const_int 21) + (const_int 6) (const_int 22) + (const_int 7) (const_int 23)])))] + "ISA_HAS_LSX" + "vilvl.b\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vilvl_h" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (vec_select:V8HI + (vec_concat:V16HI + (match_operand:V8HI 1 "register_operand" "f") + (match_operand:V8HI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 8) + (const_int 1) (const_int 9) + (const_int 2) (const_int 10) + (const_int 3) (const_int 11)])))] + "ISA_HAS_LSX" + "vilvl.h\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vilvl_w" + 
[(set (match_operand:V4SI 0 "register_operand" "=f") + (vec_select:V4SI + (vec_concat:V8SI + (match_operand:V4SI 1 "register_operand" "f") + (match_operand:V4SI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 4) + (const_int 1) (const_int 5)])))] + "ISA_HAS_LSX" + "vilvl.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vilvl_w_f" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (vec_select:V4SF + (vec_concat:V8SF + (match_operand:V4SF 1 "register_operand" "f") + (match_operand:V4SF 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 4) + (const_int 1) (const_int 5)])))] + "ISA_HAS_LSX" + "vilvl.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vilvl_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (vec_select:V2DI + (vec_concat:V4DI + (match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 2)])))] + "ISA_HAS_LSX" + "vilvl.d\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vilvl_d_f" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (vec_select:V2DF + (vec_concat:V4DF + (match_operand:V2DF 1 "register_operand" "f") + (match_operand:V2DF 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 2)])))] + "ISA_HAS_LSX" + "vilvl.d\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V2DF")]) + +(define_insn "smax3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (smax:ILSX (match_operand:ILSX 1 "register_operand" "f,f") + (match_operand:ILSX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))] + "ISA_HAS_LSX" + "@ + vmax.\t%w0,%w1,%w2 + vmaxi.\t%w0,%w1,%E2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "umax3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (umax:ILSX (match_operand:ILSX 1 "register_operand" 
"f,f") + (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))] + "ISA_HAS_LSX" + "@ + vmax.\t%w0,%w1,%w2 + vmaxi.\t%w0,%w1,%B2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "smin3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (smin:ILSX (match_operand:ILSX 1 "register_operand" "f,f") + (match_operand:ILSX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))] + "ISA_HAS_LSX" + "@ + vmin.\t%w0,%w1,%w2 + vmini.\t%w0,%w1,%E2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "umin3" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (umin:ILSX (match_operand:ILSX 1 "register_operand" "f,f") + (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))] + "ISA_HAS_LSX" + "@ + vmin.\t%w0,%w1,%w2 + vmini.\t%w0,%w1,%B2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vclo_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")] + UNSPEC_LSX_VCLO))] + "ISA_HAS_LSX" + "vclo.\t%w0,%w1" + [(set_attr "type" "simd_bit") + (set_attr "mode" "")]) + +(define_insn "clz2" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (clz:ILSX (match_operand:ILSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vclz.\t%w0,%w1" + [(set_attr "type" "simd_bit") + (set_attr "mode" "")]) + +(define_insn "lsx_nor_" + [(set (match_operand:ILSX 0 "register_operand" "=f,f") + (and:ILSX (not:ILSX (match_operand:ILSX 1 "register_operand" "f,f")) + (not:ILSX (match_operand:ILSX 2 "reg_or_vector_same_val_operand" "f,Urv8"))))] + "ISA_HAS_LSX" + "@ + vnor.v\t%w0,%w1,%w2 + vnori.b\t%w0,%w1,%B2" + [(set_attr "type" "simd_logic") + (set_attr "mode" "")]) + +(define_insn "lsx_vpickev_b" +[(set (match_operand:V16QI 0 "register_operand" "=f") + (vec_select:V16QI + (vec_concat:V32QI + (match_operand:V16QI 1 "register_operand" "f") + (match_operand:V16QI 2 "register_operand" "f")) + (parallel [(const_int 0) 
(const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14) + (const_int 16) (const_int 18) + (const_int 20) (const_int 22) + (const_int 24) (const_int 26) + (const_int 28) (const_int 30)])))] + "ISA_HAS_LSX" + "vpickev.b\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vpickev_h" +[(set (match_operand:V8HI 0 "register_operand" "=f") + (vec_select:V8HI + (vec_concat:V16HI + (match_operand:V8HI 1 "register_operand" "f") + (match_operand:V8HI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14)])))] + "ISA_HAS_LSX" + "vpickev.h\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vpickev_w" +[(set (match_operand:V4SI 0 "register_operand" "=f") + (vec_select:V4SI + (vec_concat:V8SI + (match_operand:V4SI 1 "register_operand" "f") + (match_operand:V4SI 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)])))] + "ISA_HAS_LSX" + "vpickev.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vpickev_w_f" +[(set (match_operand:V4SF 0 "register_operand" "=f") + (vec_select:V4SF + (vec_concat:V8SF + (match_operand:V4SF 1 "register_operand" "f") + (match_operand:V4SF 2 "register_operand" "f")) + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)])))] + "ISA_HAS_LSX" + "vpickev.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vpickod_b" +[(set (match_operand:V16QI 0 "register_operand" "=f") + (vec_select:V16QI + (vec_concat:V32QI + (match_operand:V16QI 1 "register_operand" "f") + (match_operand:V16QI 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15) + 
(const_int 17) (const_int 19) + (const_int 21) (const_int 23) + (const_int 25) (const_int 27) + (const_int 29) (const_int 31)])))] + "ISA_HAS_LSX" + "vpickod.b\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vpickod_h" +[(set (match_operand:V8HI 0 "register_operand" "=f") + (vec_select:V8HI + (vec_concat:V16HI + (match_operand:V8HI 1 "register_operand" "f") + (match_operand:V8HI 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)])))] + "ISA_HAS_LSX" + "vpickod.h\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vpickod_w" +[(set (match_operand:V4SI 0 "register_operand" "=f") + (vec_select:V4SI + (vec_concat:V8SI + (match_operand:V4SI 1 "register_operand" "f") + (match_operand:V4SI 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)])))] + "ISA_HAS_LSX" + "vpickod.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vpickod_w_f" +[(set (match_operand:V4SF 0 "register_operand" "=f") + (vec_select:V4SF + (vec_concat:V8SF + (match_operand:V4SF 1 "register_operand" "f") + (match_operand:V4SF 2 "register_operand" "f")) + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)])))] + "ISA_HAS_LSX" + "vpickod.w\t%w0,%w2,%w1" + [(set_attr "type" "simd_permute") + (set_attr "mode" "V4SF")]) + +(define_insn "popcount2" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (popcount:ILSX (match_operand:ILSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vpcnt.\t%w0,%w1" + [(set_attr "type" "simd_pcnt") + (set_attr "mode" "")]) + +(define_insn "lsx_vsat_s_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSAT_S))] + "ISA_HAS_LSX" + 
"vsat.\t%w0,%w1,%2" + [(set_attr "type" "simd_sat") + (set_attr "mode" "")]) + +(define_insn "lsx_vsat_u_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSAT_U))] + "ISA_HAS_LSX" + "vsat.\t%w0,%w1,%2" + [(set_attr "type" "simd_sat") + (set_attr "mode" "")]) + +(define_insn "lsx_vshuf4i_" + [(set (match_operand:LSX_WHB_W 0 "register_operand" "=f") + (vec_select:LSX_WHB_W + (match_operand:LSX_WHB_W 1 "register_operand" "f") + (match_operand 2 "par_const_vector_shf_set_operand" "")))] + "ISA_HAS_LSX" +{ + HOST_WIDE_INT val = 0; + unsigned int i; + + /* We convert the selection to an immediate. */ + for (i = 0; i < 4; i++) + val |= INTVAL (XVECEXP (operands[2], 0, i)) << (2 * i); + + operands[2] = GEN_INT (val); + return "vshuf4i.\t%w0,%w1,%X2"; +} + [(set_attr "type" "simd_shf") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrar_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VSRAR))] + "ISA_HAS_LSX" + "vsrar.\t%w0,%w1,%w2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrari_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSRARI))] + "ISA_HAS_LSX" + "vsrari.\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrlr_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VSRLR))] + "ISA_HAS_LSX" + "vsrlr.\t%w0,%w1,%w2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrlri_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 
1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSRLRI))] + "ISA_HAS_LSX" + "vsrlri.\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vssub_s_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VSSUB_S))] + "ISA_HAS_LSX" + "vssub.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vssub_u_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VSSUB_U))] + "ISA_HAS_LSX" + "vssub.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vreplve_" + [(set (match_operand:LSX 0 "register_operand" "=f") + (unspec:LSX [(match_operand:LSX 1 "register_operand" "f") + (match_operand:SI 2 "register_operand" "r")] + UNSPEC_LSX_VREPLVE))] + "ISA_HAS_LSX" + "vreplve.\t%w0,%w1,%z2" + [(set_attr "type" "simd_splat") + (set_attr "mode" "")]) + +(define_insn "lsx_vreplvei_" + [(set (match_operand:LSX 0 "register_operand" "=f") + (vec_duplicate:LSX + (vec_select: + (match_operand:LSX 1 "register_operand" "f") + (parallel [(match_operand 2 "const__operand" "")]))))] + "ISA_HAS_LSX" + "vreplvei.\t%w0,%w1,%2" + [(set_attr "type" "simd_splat") + (set_attr "mode" "")]) + +(define_insn "lsx_vreplvei__scalar" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (vec_duplicate:FLSX + (match_operand: 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vreplvei.\t%w0,%w1,0" + [(set_attr "type" "simd_splat") + (set_attr "mode" "")]) + +(define_insn "lsx_vfcvt_h_s" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "f") + (match_operand:V4SF 2 "register_operand" "f")] + UNSPEC_LSX_VFCVT))] + "ISA_HAS_LSX" + 
"vfcvt.h.s\t%w0,%w1,%w2" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vfcvt_s_d" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (unspec:V4SF [(match_operand:V2DF 1 "register_operand" "f") + (match_operand:V2DF 2 "register_operand" "f")] + UNSPEC_LSX_VFCVT))] + "ISA_HAS_LSX" + "vfcvt.s.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V4SF")]) + +(define_insn "vec_pack_trunc_v2df" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (vec_concat:V4SF + (float_truncate:V2SF (match_operand:V2DF 1 "register_operand" "f")) + (float_truncate:V2SF (match_operand:V2DF 2 "register_operand" "f"))))] + "ISA_HAS_LSX" + "vfcvt.s.d\t%w0,%w2,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vfcvth_s_h" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (unspec:V4SF [(match_operand:V8HI 1 "register_operand" "f")] + UNSPEC_LSX_VFCVTH))] + "ISA_HAS_LSX" + "vfcvth.s.h\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vfcvth_d_s" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (float_extend:V2DF + (vec_select:V2SF + (match_operand:V4SF 1 "register_operand" "f") + (parallel [(const_int 2) (const_int 3)]))))] + "ISA_HAS_LSX" + "vfcvth.d.s\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vfcvtl_s_h" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (unspec:V4SF [(match_operand:V8HI 1 "register_operand" "f")] + UNSPEC_LSX_VFCVTL))] + "ISA_HAS_LSX" + "vfcvtl.s.h\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vfcvtl_d_s" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (float_extend:V2DF + (vec_select:V2SF + (match_operand:V4SF 1 "register_operand" "f") + (parallel [(const_int 0) (const_int 1)]))))] + "ISA_HAS_LSX" + "vfcvtl.d.s\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V2DF")]) + +(define_code_attr 
lsxbr + [(eq "bz") + (ne "bnz")]) + +(define_code_attr lsxeq_v + [(eq "eqz") + (ne "nez")]) + +(define_code_attr lsxne_v + [(eq "nez") + (ne "eqz")]) + +(define_code_attr lsxeq + [(eq "anyeqz") + (ne "allnez")]) + +(define_code_attr lsxne + [(eq "allnez") + (ne "anyeqz")]) + +(define_insn "lsx__" + [(set (pc) (if_then_else + (equality_op + (unspec:SI [(match_operand:LSX 1 "register_operand" "f")] + UNSPEC_LSX_BRANCH) + (match_operand:SI 2 "const_0_operand")) + (label_ref (match_operand 0)) + (pc))) + (clobber (match_scratch:FCC 3 "=z"))] + "ISA_HAS_LSX" +{ + return loongarch_output_conditional_branch (insn, operands, + "vset.\t%Z3%w1\n\tbcnez\t%Z3%0", + "vset.\t%Z3%w1\n\tbcnez\t%Z3%0"); +} + [(set_attr "type" "simd_branch") + (set_attr "mode" "")]) + +(define_insn "lsx__v_" + [(set (pc) (if_then_else + (equality_op + (unspec:SI [(match_operand:LSX 1 "register_operand" "f")] + UNSPEC_LSX_BRANCH_V) + (match_operand:SI 2 "const_0_operand")) + (label_ref (match_operand 0)) + (pc))) + (clobber (match_scratch:FCC 3 "=z"))] + "ISA_HAS_LSX" +{ + return loongarch_output_conditional_branch (insn, operands, + "vset.v\t%Z3%w1\n\tbcnez\t%Z3%0", + "vset.v\t%Z3%w1\n\tbcnez\t%Z3%0"); +} + [(set_attr "type" "simd_branch") + (set_attr "mode" "TI")]) + +;; vec_concat +(define_expand "vec_concatv2di" + [(set (match_operand:V2DI 0 "register_operand") + (vec_concat:V2DI + (match_operand:DI 1 "register_operand") + (match_operand:DI 2 "register_operand")))] + "ISA_HAS_LSX" +{ + emit_insn (gen_lsx_vinsgr2vr_d (operands[0], operands[1], + operands[0], GEN_INT (0))); + emit_insn (gen_lsx_vinsgr2vr_d (operands[0], operands[2], + operands[0], GEN_INT (1))); + DONE; +}) + + +(define_insn "vandn3" + [(set (match_operand:LSX 0 "register_operand" "=f") + (and:LSX (not:LSX (match_operand:LSX 1 "register_operand" "f")) + (match_operand:LSX 2 "register_operand" "f")))] + "ISA_HAS_LSX" + "vandn.v\t%w0,%w1,%w2" + [(set_attr "type" "simd_logic") + (set_attr "mode" "")]) + +(define_insn "vabs2" + [(set 
(match_operand:ILSX 0 "register_operand" "=f") + (abs:ILSX (match_operand:ILSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vsigncov.\t%w0,%w1,%w1" + [(set_attr "type" "simd_logic") + (set_attr "mode" "")]) + +(define_insn "vneg2" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (neg:ILSX (match_operand:ILSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vneg.\t%w0,%w1" + [(set_attr "type" "simd_logic") + (set_attr "mode" "")]) + +(define_insn "lsx_vmuh_s_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VMUH_S))] + "ISA_HAS_LSX" + "vmuh.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vmuh_u_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VMUH_U))] + "ISA_HAS_LSX" + "vmuh.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vextw_s_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "f")] + UNSPEC_LSX_VEXTW_S))] + "ISA_HAS_LSX" + "vextw_s.d\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vextw_u_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "f")] + UNSPEC_LSX_VEXTW_U))] + "ISA_HAS_LSX" + "vextw_u.d\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vsllwil_s__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_WHB 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSLLWIL_S))] + "ISA_HAS_LSX" + "vsllwil..\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsllwil_u__" + [(set 
(match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_WHB 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSLLWIL_U))] + "ISA_HAS_LSX" + "vsllwil..\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsran__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSRAN))] + "ISA_HAS_LSX" + "vsran..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vssran_s__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSSRAN_S))] + "ISA_HAS_LSX" + "vssran..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vssran_u__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSSRAN_U))] + "ISA_HAS_LSX" + "vssran..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrain_" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSRAIN))] + "ISA_HAS_LSX" + "vsrain.\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +;; FIXME: bitimm +(define_insn "lsx_vsrains_s_" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSRAINS_S))] + "ISA_HAS_LSX" + "vsrains_s.\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +;; FIXME: bitimm +(define_insn "lsx_vsrains_u_" + [(set (match_operand: 0 "register_operand" "=f") + 
(unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VSRAINS_U))] + "ISA_HAS_LSX" + "vsrains_u.\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrarn__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSRARN))] + "ISA_HAS_LSX" + "vsrarn..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrarn_s__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSSRARN_S))] + "ISA_HAS_LSX" + "vssrarn..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrarn_u__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSSRARN_U))] + "ISA_HAS_LSX" + "vssrarn..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrln__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSRLN))] + "ISA_HAS_LSX" + "vsrln..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrln_u__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSSRLN_U))] + "ISA_HAS_LSX" + "vssrln..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrlrn__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 
"register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSRLRN))] + "ISA_HAS_LSX" + "vsrlrn..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrlrn_u__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSSRLRN_U))] + "ISA_HAS_LSX" + "vssrlrn..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vfrstpi_" + [(set (match_operand:ILSX_HB 0 "register_operand" "=f") + (unspec:ILSX_HB [(match_operand:ILSX_HB 1 "register_operand" "0") + (match_operand:ILSX_HB 2 "register_operand" "f") + (match_operand 3 "const_uimm5_operand" "")] + UNSPEC_LSX_VFRSTPI))] + "ISA_HAS_LSX" + "vfrstpi.\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vfrstp_" + [(set (match_operand:ILSX_HB 0 "register_operand" "=f") + (unspec:ILSX_HB [(match_operand:ILSX_HB 1 "register_operand" "0") + (match_operand:ILSX_HB 2 "register_operand" "f") + (match_operand:ILSX_HB 3 "register_operand" "f")] + UNSPEC_LSX_VFRSTP))] + "ISA_HAS_LSX" + "vfrstp.\t%w0,%w2,%w3" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vshuf4i_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0") + (match_operand:V2DI 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand")] + UNSPEC_LSX_VSHUF4I))] + "ISA_HAS_LSX" + "vshuf4i.d\t%w0,%w2,%3" + [(set_attr "type" "simd_sld") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vbsrl_" + [(set (match_operand:LSX 0 "register_operand" "=f") + (unspec:LSX [(match_operand:LSX 1 "register_operand" "f") + (match_operand 2 "const_uimm5_operand" "")] + UNSPEC_LSX_VBSRL_V))] + "ISA_HAS_LSX" + "vbsrl.v\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn 
"lsx_vbsll_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand 2 "const_uimm5_operand" "")] + UNSPEC_LSX_VBSLL_V))] + "ISA_HAS_LSX" + "vbsll.v\t%w0,%w1,%2" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vextrins_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VEXTRINS))] + "ISA_HAS_LSX" + "vextrins.\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vmskltz_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")] + UNSPEC_LSX_VMSKLTZ))] + "ISA_HAS_LSX" + "vmskltz.\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsigncov_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VSIGNCOV))] + "ISA_HAS_LSX" + "vsigncov.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_expand "copysign3" + [(set (match_dup 4) + (and:FLSX + (not:FLSX (match_dup 3)) + (match_operand:FLSX 1 "register_operand"))) + (set (match_dup 5) + (and:FLSX (match_dup 3) + (match_operand:FLSX 2 "register_operand"))) + (set (match_operand:FLSX 0 "register_operand") + (ior:FLSX (match_dup 4) (match_dup 5)))] + "ISA_HAS_LSX" +{ + operands[3] = loongarch_build_signbit_mask (mode, 1, 0); + + operands[4] = gen_reg_rtx (mode); + operands[5] = gen_reg_rtx (mode); +}) + +(define_insn "absv2df2" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (abs:V2DF (match_operand:V2DF 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vbitclri.d\t%w0,%w1,63" + [(set_attr "type" "simd_logic") + (set_attr "mode" "V2DF")]) + 
+(define_insn "absv4sf2" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (abs:V4SF (match_operand:V4SF 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vbitclri.w\t%w0,%w1,31" + [(set_attr "type" "simd_logic") + (set_attr "mode" "V4SF")]) + +(define_insn "vfmadd4" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (fma:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f") + (match_operand:FLSX 3 "register_operand" "f")))] + "ISA_HAS_LSX" + "vfmadd.\t%w0,%w1,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "")]) + +(define_insn "vfmsub4" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (fma:FLSX (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f") + (neg:FLSX (match_operand:FLSX 3 "register_operand" "f"))))] + "ISA_HAS_LSX" + "vfmsub.\t%w0,%w1,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "")]) + +(define_insn "vfnmsub4_nmsub4" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (neg:FLSX + (fma:FLSX + (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f") + (neg:FLSX (match_operand:FLSX 3 "register_operand" "f")))))] + "ISA_HAS_LSX" + "vfnmsub.\t%w0,%w1,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "")]) + + +(define_insn "vfnmadd4_nmadd4" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (neg:FLSX + (fma:FLSX + (match_operand:FLSX 1 "register_operand" "f") + (match_operand:FLSX 2 "register_operand" "f") + (match_operand:FLSX 3 "register_operand" "f"))))] + "ISA_HAS_LSX" + "vfnmadd.\t%w0,%w1,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "")]) + +(define_insn "lsx_vftintrne_w_s" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRNE))] + "ISA_HAS_LSX" + "vftintrne.w.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn 
"lsx_vftintrne_l_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRNE))] + "ISA_HAS_LSX" + "vftintrne.l.d\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vftintrp_w_s" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRP))] + "ISA_HAS_LSX" + "vftintrp.w.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrp_l_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRP))] + "ISA_HAS_LSX" + "vftintrp.l.d\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vftintrm_w_s" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRM))] + "ISA_HAS_LSX" + "vftintrm.w.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrm_l_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRM))] + "ISA_HAS_LSX" + "vftintrm.l.d\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vftint_w_d" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f") + (match_operand:V2DF 2 "register_operand" "f")] + UNSPEC_LSX_VFTINT_W_D))] + "ISA_HAS_LSX" + "vftint.w.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vffint_s_l" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (unspec:V4SF [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VFFINT_S_L))] + "ISA_HAS_LSX" + 
"vffint.s.l\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vftintrz_w_d" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f") + (match_operand:V2DF 2 "register_operand" "f")] + UNSPEC_LSX_VFTINTRZ_W_D))] + "ISA_HAS_LSX" + "vftintrz.w.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vftintrp_w_d" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f") + (match_operand:V2DF 2 "register_operand" "f")] + UNSPEC_LSX_VFTINTRP_W_D))] + "ISA_HAS_LSX" + "vftintrp.w.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vftintrm_w_d" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f") + (match_operand:V2DF 2 "register_operand" "f")] + UNSPEC_LSX_VFTINTRM_W_D))] + "ISA_HAS_LSX" + "vftintrm.w.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vftintrne_w_d" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f") + (match_operand:V2DF 2 "register_operand" "f")] + UNSPEC_LSX_VFTINTRNE_W_D))] + "ISA_HAS_LSX" + "vftintrne.w.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vftinth_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTH_L_H))] + "ISA_HAS_LSX" + "vftinth.l.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintl_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTL_L_S))] + "ISA_HAS_LSX" + "vftintl.l.s\t%w0,%w1" + [(set_attr "type" 
"simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vffinth_d_w" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (unspec:V2DF [(match_operand:V4SI 1 "register_operand" "f")] + UNSPEC_LSX_VFFINTH_D_W))] + "ISA_HAS_LSX" + "vffinth.d.w\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vffintl_d_w" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (unspec:V2DF [(match_operand:V4SI 1 "register_operand" "f")] + UNSPEC_LSX_VFFINTL_D_W))] + "ISA_HAS_LSX" + "vffintl.d.w\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vftintrzh_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRZH_L_S))] + "ISA_HAS_LSX" + "vftintrzh.l.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrzl_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRZL_L_S))] + "ISA_HAS_LSX" + "vftintrzl.l.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrph_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRPH_L_S))] + "ISA_HAS_LSX" + "vftintrph.l.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrpl_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRPL_L_S))] + "ISA_HAS_LSX" + "vftintrpl.l.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrmh_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRMH_L_S))] + "ISA_HAS_LSX" + "vftintrmh.l.s\t%w0,%w1" + 
[(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrml_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRML_L_S))] + "ISA_HAS_LSX" + "vftintrml.l.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrneh_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRNEH_L_S))] + "ISA_HAS_LSX" + "vftintrneh.l.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vftintrnel_l_s" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFTINTRNEL_L_S))] + "ISA_HAS_LSX" + "vftintrnel.l.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vfrintrne_s" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFRINTRNE_S))] + "ISA_HAS_LSX" + "vfrintrne.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vfrintrne_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")] + UNSPEC_LSX_VFRINTRNE_D))] + "ISA_HAS_LSX" + "vfrintrne.d\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vfrintrz_s" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFRINTRZ_S))] + "ISA_HAS_LSX" + "vfrintrz.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vfrintrz_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")] + UNSPEC_LSX_VFRINTRZ_D))] + "ISA_HAS_LSX" + 
"vfrintrz.d\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vfrintrp_s" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFRINTRP_S))] + "ISA_HAS_LSX" + "vfrintrp.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vfrintrp_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")] + UNSPEC_LSX_VFRINTRP_D))] + "ISA_HAS_LSX" + "vfrintrp.d\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V2DF")]) + +(define_insn "lsx_vfrintrm_s" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")] + UNSPEC_LSX_VFRINTRM_S))] + "ISA_HAS_LSX" + "vfrintrm.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "lsx_vfrintrm_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")] + UNSPEC_LSX_VFRINTRM_D))] + "ISA_HAS_LSX" + "vfrintrm.d\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V2DF")]) + +;; Vector versions of the floating-point frint patterns. +;; Expands to btrunc, ceil, floor, rint. +(define_insn "v4sf2" + [(set (match_operand:V4SF 0 "register_operand" "=f") + (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")] + FRINT_S))] + "ISA_HAS_LSX" + "vfrint.s\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V4SF")]) + +(define_insn "v2df2" + [(set (match_operand:V2DF 0 "register_operand" "=f") + (unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")] + FRINT_D))] + "ISA_HAS_LSX" + "vfrint.d\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "V2DF")]) + +;; Expands to round. 
+(define_insn "round2" + [(set (match_operand:FLSX 0 "register_operand" "=f") + (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")] + UNSPEC_LSX_VFRINT))] + "ISA_HAS_LSX" + "vfrint.\t%w0,%w1" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +;; Offset load and broadcast +(define_expand "lsx_vldrepl_" + [(match_operand:LSX 0 "register_operand") + (match_operand 1 "pmode_register_operand") + (match_operand 2 "aq12_operand")] + "ISA_HAS_LSX" +{ + emit_insn (gen_lsx_vldrepl__insn + (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_insn "lsx_vldrepl__insn" + [(set (match_operand:LSX 0 "register_operand" "=f") + (vec_duplicate:LSX + (mem: (plus:DI (match_operand:DI 1 "register_operand" "r") + (match_operand 2 "aq12_operand")))))] + "ISA_HAS_LSX" +{ + return "vldrepl.\t%w0,%1,%2"; +} + [(set_attr "type" "simd_load") + (set_attr "mode" "") + (set_attr "length" "4")]) + +(define_insn "lsx_vldrepl__insn_0" + [(set (match_operand:LSX 0 "register_operand" "=f") + (vec_duplicate:LSX + (mem: (match_operand:DI 1 "register_operand" "r"))))] + "ISA_HAS_LSX" +{ + return "vldrepl.\t%w0,%1,0"; +} + [(set_attr "type" "simd_load") + (set_attr "mode" "") + (set_attr "length" "4")]) + +;; Offset store by sel +(define_expand "lsx_vstelm_" + [(match_operand:LSX 0 "register_operand") + (match_operand 3 "const__operand") + (match_operand 2 "aq8_operand") + (match_operand 1 "pmode_register_operand")] + "ISA_HAS_LSX" +{ + emit_insn (gen_lsx_vstelm__insn + (operands[1], operands[2], operands[0], operands[3])); + DONE; +}) + +(define_insn "lsx_vstelm__insn" + [(set (mem: (plus:DI (match_operand:DI 0 "register_operand" "r") + (match_operand 1 "aq8_operand"))) + (vec_select: + (match_operand:LSX 2 "register_operand" "f") + (parallel [(match_operand 3 "const__operand" "")])))] + + "ISA_HAS_LSX" +{ + return "vstelm.\t%w2,%0,%1,%3"; +} + [(set_attr "type" "simd_store") + (set_attr "mode" "") + (set_attr "length" "4")]) + +;; Offset is "0" +(define_insn 
"lsx_vstelm__insn_0" + [(set (mem: (match_operand:DI 0 "register_operand" "r")) + (vec_select: + (match_operand:LSX 1 "register_operand" "f") + (parallel [(match_operand:SI 2 "const__operand")])))] + "ISA_HAS_LSX" +{ + return "vstelm.\t%w1,%0,0,%2"; +} + [(set_attr "type" "simd_store") + (set_attr "mode" "") + (set_attr "length" "4")]) + +(define_expand "lsx_vld" + [(match_operand:V16QI 0 "register_operand") + (match_operand 1 "pmode_register_operand") + (match_operand 2 "aq12b_operand")] + "ISA_HAS_LSX" +{ + rtx addr = plus_constant (GET_MODE (operands[1]), operands[1], + INTVAL (operands[2])); + loongarch_emit_move (operands[0], gen_rtx_MEM (V16QImode, addr)); + DONE; +}) + +(define_expand "lsx_vst" + [(match_operand:V16QI 0 "register_operand") + (match_operand 1 "pmode_register_operand") + (match_operand 2 "aq12b_operand")] + "ISA_HAS_LSX" +{ + rtx addr = plus_constant (GET_MODE (operands[1]), operands[1], + INTVAL (operands[2])); + loongarch_emit_move (gen_rtx_MEM (V16QImode, addr), operands[0]); + DONE; +}) + +(define_insn "lsx_vssrln__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSSRLN))] + "ISA_HAS_LSX" + "vssrln..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + + +(define_insn "lsx_vssrlrn__" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: [(match_operand:ILSX_DWH 1 "register_operand" "f") + (match_operand:ILSX_DWH 2 "register_operand" "f")] + UNSPEC_LSX_VSSRLRN))] + "ISA_HAS_LSX" + "vssrlrn..\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "vorn3" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (ior:ILSX (not:ILSX (match_operand:ILSX 2 "register_operand" "f")) + (match_operand:ILSX 1 "register_operand" "f")))] + "ISA_HAS_LSX" + "vorn.v\t%w0,%w1,%w2" + [(set_attr "type" "simd_logic") + (set_attr "mode" "")]) + +(define_insn 
"lsx_vldi" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI[(match_operand 1 "const_imm13_operand")] + UNSPEC_LSX_VLDI))] + "ISA_HAS_LSX" +{ + HOST_WIDE_INT val = INTVAL (operands[1]); + if (val < 0) + { + HOST_WIDE_INT modeVal = (val & 0xf00) >> 8; + if (modeVal < 13) + return "vldi\t%w0,%1"; + else + sorry ("imm13 only support 0000 ~ 1100 in bits 9 ~ 12 when bit '13' is 1"); + return "#"; + } + else + return "vldi\t%w0,%1"; +} + [(set_attr "type" "simd_load") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vshuf_b" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f") + (match_operand:V16QI 2 "register_operand" "f") + (match_operand:V16QI 3 "register_operand" "f")] + UNSPEC_LSX_VSHUF_B))] + "ISA_HAS_LSX" + "vshuf.b\t%w0,%w1,%w2,%w3" + [(set_attr "type" "simd_shf") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vldx" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (unspec:V16QI [(match_operand:DI 1 "register_operand" "r") + (match_operand:DI 2 "reg_or_0_operand" "rJ")] + UNSPEC_LSX_VLDX))] + "ISA_HAS_LSX" +{ + return "vldx\t%w0,%1,%z2"; +} + [(set_attr "type" "simd_load") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vstx" + [(set (mem:V16QI (plus:DI (match_operand:DI 1 "register_operand" "r") + (match_operand:DI 2 "reg_or_0_operand" "rJ"))) + (unspec: V16QI[(match_operand:V16QI 0 "register_operand" "f")] + UNSPEC_LSX_VSTX))] + + "ISA_HAS_LSX" +{ + return "vstx\t%w0,%1,%z2"; +} + [(set_attr "type" "simd_store") + (set_attr "mode" "DI")]) + +(define_insn "lsx_vextl_qu_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")] + UNSPEC_LSX_VEXTL_QU_DU))] + "ISA_HAS_LSX" + "vextl.qu.du\t%w0,%w1" + [(set_attr "type" "simd_bit") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vseteqz_v" + [(set (match_operand:FCC 0 "register_operand" "=z") + (eq:FCC + (unspec:SI [(match_operand:V16QI 1 
"register_operand" "f")] + UNSPEC_LSX_VSETEQZ_V) + (match_operand:SI 2 "const_0_operand")))] + "ISA_HAS_LSX" +{ + return "vseteqz.v\t%0,%1"; +} + [(set_attr "type" "simd_fcmp") + (set_attr "mode" "FCC")]) + +;; Vector reduction operation +(define_expand "reduc_plus_scal_v2di" + [(match_operand:DI 0 "register_operand") + (match_operand:V2DI 1 "register_operand")] + "ISA_HAS_LSX" +{ + rtx tmp = gen_reg_rtx (V2DImode); + emit_insn (gen_lsx_vhaddw_q_d (tmp, operands[1], operands[1])); + emit_insn (gen_vec_extractv2didi (operands[0], tmp, const0_rtx)); + DONE; +}) + +(define_expand "reduc_plus_scal_v4si" + [(match_operand:SI 0 "register_operand") + (match_operand:V4SI 1 "register_operand")] + "ISA_HAS_LSX" +{ + rtx tmp = gen_reg_rtx (V2DImode); + rtx tmp1 = gen_reg_rtx (V2DImode); + emit_insn (gen_lsx_vhaddw_d_w (tmp, operands[1], operands[1])); + emit_insn (gen_lsx_vhaddw_q_d (tmp1, tmp, tmp)); + emit_insn (gen_vec_extractv4sisi (operands[0], gen_lowpart (V4SImode,tmp1), + const0_rtx)); + DONE; +}) + +(define_expand "reduc_plus_scal_" + [(match_operand: 0 "register_operand") + (match_operand:FLSX 1 "register_operand")] + "ISA_HAS_LSX" +{ + rtx tmp = gen_reg_rtx (mode); + loongarch_expand_vector_reduc (gen_add3, tmp, operands[1]); + emit_insn (gen_vec_extract (operands[0], tmp, + const0_rtx)); + DONE; +}) + +(define_expand "reduc__scal_" + [(any_bitwise: + (match_operand: 0 "register_operand") + (match_operand:ILSX 1 "register_operand"))] + "ISA_HAS_LSX" +{ + rtx tmp = gen_reg_rtx (mode); + loongarch_expand_vector_reduc (gen_3, tmp, operands[1]); + emit_insn (gen_vec_extract (operands[0], tmp, + const0_rtx)); + DONE; +}) + +(define_expand "reduc_smax_scal_" + [(match_operand: 0 "register_operand") + (match_operand:LSX 1 "register_operand")] + "ISA_HAS_LSX" +{ + rtx tmp = gen_reg_rtx (mode); + loongarch_expand_vector_reduc (gen_smax3, tmp, operands[1]); + emit_insn (gen_vec_extract (operands[0], tmp, + const0_rtx)); + DONE; +}) + +(define_expand "reduc_smin_scal_" + 
[(match_operand: 0 "register_operand") + (match_operand:LSX 1 "register_operand")] + "ISA_HAS_LSX" +{ + rtx tmp = gen_reg_rtx (mode); + loongarch_expand_vector_reduc (gen_smin3, tmp, operands[1]); + emit_insn (gen_vec_extract (operands[0], tmp, + const0_rtx)); + DONE; +}) + +(define_expand "reduc_umax_scal_" + [(match_operand: 0 "register_operand") + (match_operand:ILSX 1 "register_operand")] + "ISA_HAS_LSX" +{ + rtx tmp = gen_reg_rtx (mode); + loongarch_expand_vector_reduc (gen_umax3, tmp, operands[1]); + emit_insn (gen_vec_extract (operands[0], tmp, + const0_rtx)); + DONE; +}) + +(define_expand "reduc_umin_scal_" + [(match_operand: 0 "register_operand") + (match_operand:ILSX 1 "register_operand")] + "ISA_HAS_LSX" +{ + rtx tmp = gen_reg_rtx (mode); + loongarch_expand_vector_reduc (gen_umin3, tmp, operands[1]); + emit_insn (gen_vec_extract (operands[0], tmp, + const0_rtx)); + DONE; +}) + +(define_insn "lsx_vwev_d_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (addsubmul:V2DI + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 1 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2)]))) + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2)])))))] + "ISA_HAS_LSX" + "vwev.d.w\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vwev_w_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (addsubmul:V4SI + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 1 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)]))) + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)])))))] + "ISA_HAS_LSX" + "vwev.w.h\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vwev_h_b" + [(set (match_operand:V8HI 0 "register_operand" 
"=f") + (addsubmul:V8HI + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 1 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14)]))) + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14)])))))] + "ISA_HAS_LSX" + "vwev.h.b\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vwod_d_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (addsubmul:V2DI + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 1 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3)]))) + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "f") + (parallel [(const_int 1) (const_int 3)])))))] + "ISA_HAS_LSX" + "vwod.d.w\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vwod_w_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (addsubmul:V4SI + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 1 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)]))) + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)])))))] + "ISA_HAS_LSX" + "vwod.w.h\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vwod_h_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (addsubmul:V8HI + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 1 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)]))) + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 
"register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)])))))] + "ISA_HAS_LSX" + "vwod.h.b\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vwev_d_wu_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (addmul:V2DI + (zero_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 1 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2)]))) + (sign_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2)])))))] + "ISA_HAS_LSX" + "vwev.d.wu.w\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vwev_w_hu_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (addmul:V4SI + (zero_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 1 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)]))) + (sign_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)])))))] + "ISA_HAS_LSX" + "vwev.w.hu.h\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vwev_h_bu_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (addmul:V8HI + (zero_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 1 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14)]))) + (sign_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14)])))))] + "ISA_HAS_LSX" + "vwev.h.bu.b\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V8HI")]) + +(define_insn 
"lsx_vwod_d_wu_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (addmul:V2DI + (zero_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 1 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3)]))) + (sign_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "f") + (parallel [(const_int 1) (const_int 3)])))))] + "ISA_HAS_LSX" + "vwod.d.wu.w\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vwod_w_hu_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (addmul:V4SI + (zero_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 1 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)]))) + (sign_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)])))))] + "ISA_HAS_LSX" + "vwod.w.hu.h\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vwod_h_bu_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (addmul:V8HI + (zero_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 1 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)]))) + (sign_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 "register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)])))))] + "ISA_HAS_LSX" + "vwod.h.bu.b\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vaddwev_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VADDWEV))] + "ISA_HAS_LSX" + "vaddwev.q.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + 
(set_attr "mode" "V2DI")]) + +(define_insn "lsx_vaddwev_q_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VADDWEV2))] + "ISA_HAS_LSX" + "vaddwev.q.du\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vaddwod_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VADDWOD))] + "ISA_HAS_LSX" + "vaddwod.q.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vaddwod_q_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VADDWOD2))] + "ISA_HAS_LSX" + "vaddwod.q.du\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vsubwev_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VSUBWEV))] + "ISA_HAS_LSX" + "vsubwev.q.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vsubwev_q_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VSUBWEV2))] + "ISA_HAS_LSX" + "vsubwev.q.du\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vsubwod_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VSUBWOD))] + "ISA_HAS_LSX" + "vsubwod.q.d\t%w0,%w1,%w2" + [(set_attr 
"type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vsubwod_q_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VSUBWOD2))] + "ISA_HAS_LSX" + "vsubwod.q.du\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vaddwev_q_du_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VADDWEV3))] + "ISA_HAS_LSX" + "vaddwev.q.du.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vaddwod_q_du_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VADDWOD3))] + "ISA_HAS_LSX" + "vaddwod.q.du.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmulwev_q_du_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VMULWEV3))] + "ISA_HAS_LSX" + "vmulwev.q.du.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmulwod_q_du_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VMULWOD3))] + "ISA_HAS_LSX" + "vmulwod.q.du.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmulwev_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VMULWEV))] + 
"ISA_HAS_LSX" + "vmulwev.q.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmulwev_q_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VMULWEV2))] + "ISA_HAS_LSX" + "vmulwev.q.du\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmulwod_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VMULWOD))] + "ISA_HAS_LSX" + "vmulwod.q.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmulwod_q_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VMULWOD2))] + "ISA_HAS_LSX" + "vmulwod.q.du\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vhaddw_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VHADDW_Q_D))] + "ISA_HAS_LSX" + "vhaddw.q.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vhaddw_qu_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VHADDW_QU_DU))] + "ISA_HAS_LSX" + "vhaddw.qu.du\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vhsubw_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" 
"f")] + UNSPEC_LSX_VHSUBW_Q_D))] + "ISA_HAS_LSX" + "vhsubw.q.d\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vhsubw_qu_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VHSUBW_QU_DU))] + "ISA_HAS_LSX" + "vhsubw.qu.du\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwev_d_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (plus:V2DI + (match_operand:V2DI 1 "register_operand" "0") + (mult:V2DI + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2)]))) + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 3 "register_operand" "f") + (parallel [(const_int 0) (const_int 2)]))))))] + "ISA_HAS_LSX" + "vmaddwev.d.w\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwev_w_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (plus:V4SI + (match_operand:V4SI 1 "register_operand" "0") + (mult:V4SI + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)]))) + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 3 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)]))))))] + "ISA_HAS_LSX" + "vmaddwev.w.h\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vmaddwev_h_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (plus:V8HI + (match_operand:V8HI 1 "register_operand" "0") + (mult:V8HI + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) 
(const_int 14)]))) + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 3 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14)]))))))] + "ISA_HAS_LSX" + "vmaddwev.h.b\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vmaddwod_d_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (plus:V2DI + (match_operand:V2DI 1 "register_operand" "0") + (mult:V2DI + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3)]))) + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 3 "register_operand" "f") + (parallel [(const_int 1) (const_int 3)]))))))] + "ISA_HAS_LSX" + "vmaddwod.d.w\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwod_w_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (plus:V4SI + (match_operand:V4SI 1 "register_operand" "0") + (mult:V4SI + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)]))) + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 3 "register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)]))))))] + "ISA_HAS_LSX" + "vmaddwod.w.h\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vmaddwod_h_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (plus:V8HI + (match_operand:V8HI 1 "register_operand" "0") + (mult:V8HI + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)]))) + (any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 3 "register_operand" "f") + (parallel [(const_int 
1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)]))))))] + "ISA_HAS_LSX" + "vmaddwod.h.b\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vmaddwev_d_wu_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (plus:V2DI + (match_operand:V2DI 1 "register_operand" "0") + (mult:V2DI + (zero_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2)]))) + (sign_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 3 "register_operand" "f") + (parallel [(const_int 0) (const_int 2)]))))))] + "ISA_HAS_LSX" + "vmaddwev.d.wu.w\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwev_w_hu_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (plus:V4SI + (match_operand:V4SI 1 "register_operand" "0") + (mult:V4SI + (zero_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)]))) + (sign_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 3 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6)]))))))] + "ISA_HAS_LSX" + "vmaddwev.w.hu.h\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vmaddwev_h_bu_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (plus:V8HI + (match_operand:V8HI 1 "register_operand" "0") + (mult:V8HI + (zero_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 "register_operand" "%f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 14)]))) + (sign_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 3 "register_operand" "f") + (parallel [(const_int 0) (const_int 2) + (const_int 4) (const_int 6) + (const_int 8) (const_int 10) + (const_int 12) (const_int 
14)]))))))] + "ISA_HAS_LSX" + "vmaddwev.h.bu.b\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vmaddwod_d_wu_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (plus:V2DI + (match_operand:V2DI 1 "register_operand" "0") + (mult:V2DI + (zero_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 2 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3)]))) + (sign_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 3 "register_operand" "f") + (parallel [(const_int 1) (const_int 3)]))))))] + "ISA_HAS_LSX" + "vmaddwod.d.wu.w\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwod_w_hu_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (plus:V4SI + (match_operand:V4SI 1 "register_operand" "0") + (mult:V4SI + (zero_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 2 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)]))) + (sign_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 3 "register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7)]))))))] + "ISA_HAS_LSX" + "vmaddwod.w.hu.h\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vmaddwod_h_bu_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + (plus:V8HI + (match_operand:V8HI 1 "register_operand" "0") + (mult:V8HI + (zero_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 2 "register_operand" "%f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)]))) + (sign_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 3 "register_operand" "f") + (parallel [(const_int 1) (const_int 3) + (const_int 5) (const_int 7) + (const_int 9) (const_int 11) + (const_int 13) (const_int 15)]))))))] + "ISA_HAS_LSX" + "vmaddwod.h.bu.b\t%w0,%w2,%w3" + [(set_attr "type" "simd_fmadd") + (set_attr 
"mode" "V8HI")]) + +(define_insn "lsx_vmaddwev_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0") + (match_operand:V2DI 2 "register_operand" "f") + (match_operand:V2DI 3 "register_operand" "f")] + UNSPEC_LSX_VMADDWEV))] + "ISA_HAS_LSX" + "vmaddwev.q.d\t%w0,%w2,%w3" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwod_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0") + (match_operand:V2DI 2 "register_operand" "f") + (match_operand:V2DI 3 "register_operand" "f")] + UNSPEC_LSX_VMADDWOD))] + "ISA_HAS_LSX" + "vmaddwod.q.d\t%w0,%w2,%w3" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwev_q_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0") + (match_operand:V2DI 2 "register_operand" "f") + (match_operand:V2DI 3 "register_operand" "f")] + UNSPEC_LSX_VMADDWEV2))] + "ISA_HAS_LSX" + "vmaddwev.q.du\t%w0,%w2,%w3" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwod_q_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0") + (match_operand:V2DI 2 "register_operand" "f") + (match_operand:V2DI 3 "register_operand" "f")] + UNSPEC_LSX_VMADDWOD2))] + "ISA_HAS_LSX" + "vmaddwod.q.du\t%w0,%w2,%w3" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmaddwev_q_du_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0") + (match_operand:V2DI 2 "register_operand" "f") + (match_operand:V2DI 3 "register_operand" "f")] + UNSPEC_LSX_VMADDWEV3))] + "ISA_HAS_LSX" + "vmaddwev.q.du.d\t%w0,%w2,%w3" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn 
"lsx_vmaddwod_q_du_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0") + (match_operand:V2DI 2 "register_operand" "f") + (match_operand:V2DI 3 "register_operand" "f")] + UNSPEC_LSX_VMADDWOD3))] + "ISA_HAS_LSX" + "vmaddwod.q.du.d\t%w0,%w2,%w3" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vrotr_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand:ILSX 2 "register_operand" "f")] + UNSPEC_LSX_VROTR))] + "ISA_HAS_LSX" + "vrotr.\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "")]) + +(define_insn "lsx_vadd_q" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VADD_Q))] + "ISA_HAS_LSX" + "vadd.q\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vsub_q" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f") + (match_operand:V2DI 2 "register_operand" "f")] + UNSPEC_LSX_VSUB_Q))] + "ISA_HAS_LSX" + "vsub.q\t%w0,%w1,%w2" + [(set_attr "type" "simd_int_arith") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vmskgez_b" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f")] + UNSPEC_LSX_VMSKGEZ))] + "ISA_HAS_LSX" + "vmskgez.b\t%w0,%w1" + [(set_attr "type" "simd_bit") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vmsknz_b" + [(set (match_operand:V16QI 0 "register_operand" "=f") + (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f")] + UNSPEC_LSX_VMSKNZ))] + "ISA_HAS_LSX" + "vmsknz.b\t%w0,%w1" + [(set_attr "type" "simd_bit") + (set_attr "mode" "V16QI")]) + +(define_insn "lsx_vexth_h_b" + [(set (match_operand:V8HI 0 "register_operand" "=f") + 
(any_extend:V8HI + (vec_select:V8QI + (match_operand:V16QI 1 "register_operand" "f") + (parallel [(const_int 8) (const_int 9) + (const_int 10) (const_int 11) + (const_int 12) (const_int 13) + (const_int 14) (const_int 15)]))))] + "ISA_HAS_LSX" + "vexth.h.b\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V8HI")]) + +(define_insn "lsx_vexth_w_h" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (any_extend:V4SI + (vec_select:V4HI + (match_operand:V8HI 1 "register_operand" "f") + (parallel [(const_int 4) (const_int 5) + (const_int 6) (const_int 7)]))))] + "ISA_HAS_LSX" + "vexth.w.h\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V4SI")]) + +(define_insn "lsx_vexth_d_w" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (any_extend:V2DI + (vec_select:V2SI + (match_operand:V4SI 1 "register_operand" "f") + (parallel [(const_int 2) (const_int 3)]))))] + "ISA_HAS_LSX" + "vexth.d.w\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vexth_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")] + UNSPEC_LSX_VEXTH_Q_D))] + "ISA_HAS_LSX" + "vexth.q.d\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vexth_qu_du" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")] + UNSPEC_LSX_VEXTH_QU_DU))] + "ISA_HAS_LSX" + "vexth.qu.du\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vrotri_" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f") + (match_operand 2 "const__operand" "")] + UNSPEC_LSX_VROTRI))] + "ISA_HAS_LSX" + "vrotri.\t%w0,%w1,%2" + [(set_attr "type" "simd_shf") + (set_attr "mode" "")]) + +(define_insn "lsx_vextl_q_d" + [(set (match_operand:V2DI 0 "register_operand" "=f") + (unspec:V2DI [(match_operand:V2DI 1 
"register_operand" "f")] + UNSPEC_LSX_VEXTL_Q_D))] + "ISA_HAS_LSX" + "vextl.q.d\t%w0,%w1" + [(set_attr "type" "simd_fcvt") + (set_attr "mode" "V2DI")]) + +(define_insn "lsx_vsrlni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSRLNI))] + "ISA_HAS_LSX" + "vsrlni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrlrni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSRLRNI))] + "ISA_HAS_LSX" + "vsrlrni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrlni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSSRLNI))] + "ISA_HAS_LSX" + "vssrlni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrlni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSSRLNI2))] + "ISA_HAS_LSX" + "vssrlni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrlrni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSSRLRNI))] + "ISA_HAS_LSX" + "vssrlrni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + 
+(define_insn "lsx_vssrlrni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSSRLRNI2))] + "ISA_HAS_LSX" + "vssrlrni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrani__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSRANI))] + "ISA_HAS_LSX" + "vsrani..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vsrarni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSRARNI))] + "ISA_HAS_LSX" + "vsrarni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrani__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSSRANI))] + "ISA_HAS_LSX" + "vssrani..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrani__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSSRANI2))] + "ISA_HAS_LSX" + "vssrani..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrarni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + 
(match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSSRARNI))] + "ISA_HAS_LSX" + "vssrarni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vssrarni__" + [(set (match_operand:ILSX 0 "register_operand" "=f") + (unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0") + (match_operand:ILSX 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VSSRARNI2))] + "ISA_HAS_LSX" + "vssrarni..\t%w0,%w2,%3" + [(set_attr "type" "simd_shift") + (set_attr "mode" "")]) + +(define_insn "lsx_vpermi_w" + [(set (match_operand:V4SI 0 "register_operand" "=f") + (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "0") + (match_operand:V4SI 2 "register_operand" "f") + (match_operand 3 "const_uimm8_operand" "")] + UNSPEC_LSX_VPERMI))] + "ISA_HAS_LSX" + "vpermi.w\t%w0,%w2,%3" + [(set_attr "type" "simd_bit") + (set_attr "mode" "V4SI")]) diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md index 510973aa339..f430629825e 100644 --- a/gcc/config/loongarch/predicates.md +++ b/gcc/config/loongarch/predicates.md @@ -87,10 +87,42 @@ (define_predicate "const_immalsl_operand" (and (match_code "const_int") (match_test "IN_RANGE (INTVAL (op), 1, 4)"))) +(define_predicate "const_lsx_branch_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), -1024, 1023)"))) + +(define_predicate "const_uimm3_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 0, 7)"))) + +(define_predicate "const_8_to_11_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 8, 11)"))) + +(define_predicate "const_12_to_15_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 12, 15)"))) + +(define_predicate "const_uimm4_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 0, 15)"))) + (define_predicate "const_uimm5_operand" (and (match_code 
"const_int") (match_test "IN_RANGE (INTVAL (op), 0, 31)"))) +(define_predicate "const_uimm6_operand" + (and (match_code "const_int") + (match_test "UIMM6_OPERAND (INTVAL (op))"))) + +(define_predicate "const_uimm7_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 0, 127)"))) + +(define_predicate "const_uimm8_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 0, 255)"))) + (define_predicate "const_uimm14_operand" (and (match_code "const_int") (match_test "IN_RANGE (INTVAL (op), 0, 16383)"))) @@ -99,10 +131,74 @@ (define_predicate "const_uimm15_operand" (and (match_code "const_int") (match_test "IN_RANGE (INTVAL (op), 0, 32767)"))) +(define_predicate "const_imm5_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), -16, 15)"))) + +(define_predicate "const_imm10_operand" + (and (match_code "const_int") + (match_test "IMM10_OPERAND (INTVAL (op))"))) + (define_predicate "const_imm12_operand" (and (match_code "const_int") (match_test "IMM12_OPERAND (INTVAL (op))"))) +(define_predicate "const_imm13_operand" + (and (match_code "const_int") + (match_test "IMM13_OPERAND (INTVAL (op))"))) + +(define_predicate "reg_imm10_operand" + (ior (match_operand 0 "const_imm10_operand") + (match_operand 0 "register_operand"))) + +(define_predicate "aq8b_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)"))) + +(define_predicate "aq8h_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 1)"))) + +(define_predicate "aq8w_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 2)"))) + +(define_predicate "aq8d_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)"))) + +(define_predicate "aq10b_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 0)"))) + 
+(define_predicate "aq10h_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 1)"))) + +(define_predicate "aq10w_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)"))) + +(define_predicate "aq10d_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 3)"))) + +(define_predicate "aq12b_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 12, 0)"))) + +(define_predicate "aq12h_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 11, 1)"))) + +(define_predicate "aq12w_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)"))) + +(define_predicate "aq12d_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 9, 3)"))) + (define_predicate "sle_operand" (and (match_code "const_int") (match_test "IMM12_OPERAND (INTVAL (op) + 1)"))) @@ -112,29 +208,206 @@ (define_predicate "sleu_operand" (match_test "INTVAL (op) + 1 != 0"))) (define_predicate "const_0_operand" - (and (match_code "const_int,const_double,const_vector") + (and (match_code "const_int,const_wide_int,const_double,const_vector") (match_test "op == CONST0_RTX (GET_MODE (op))"))) +(define_predicate "const_m1_operand" + (and (match_code "const_int,const_wide_int,const_double,const_vector") + (match_test "op == CONSTM1_RTX (GET_MODE (op))"))) + +(define_predicate "reg_or_m1_operand" + (ior (match_operand 0 "const_m1_operand") + (match_operand 0 "register_operand"))) + (define_predicate "reg_or_0_operand" (ior (match_operand 0 "const_0_operand") (match_operand 0 "register_operand"))) (define_predicate "const_1_operand" - (and (match_code "const_int,const_double,const_vector") + (and (match_code "const_int,const_wide_int,const_double,const_vector") (match_test "op == CONST1_RTX 
(GET_MODE (op))"))) (define_predicate "reg_or_1_operand" (ior (match_operand 0 "const_1_operand") (match_operand 0 "register_operand"))) +;; These are used in vec_merge, hence accept bitmask as const_int. +(define_predicate "const_exp_2_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 1)"))) + +(define_predicate "const_exp_4_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 3)"))) + +(define_predicate "const_exp_8_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 7)"))) + +(define_predicate "const_exp_16_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 15)"))) + +(define_predicate "const_exp_32_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 31)"))) + +;; This is used for indexing into vectors, and hence only accepts const_int. +(define_predicate "const_0_or_1_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 0, 1)"))) + +(define_predicate "const_0_to_3_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 0, 3)"))) + +(define_predicate "const_0_to_7_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 0, 7)"))) + +(define_predicate "const_2_or_3_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 2, 3)"))) + +(define_predicate "const_4_to_7_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 4, 7)"))) + +(define_predicate "const_8_to_15_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 8, 15)"))) + +(define_predicate "const_16_to_31_operand" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), 16, 31)"))) + +(define_predicate "qi_mask_operand" + (and (match_code "const_int") + (match_test "UINTVAL (op) == 0xff"))) + +(define_predicate "hi_mask_operand" + (and
(match_code "const_int") + (match_test "UINTVAL (op) == 0xffff"))) + (define_predicate "lu52i_mask_operand" (and (match_code "const_int") (match_test "UINTVAL (op) == 0xfffffffffffff"))) +(define_predicate "si_mask_operand" + (and (match_code "const_int") + (match_test "UINTVAL (op) == 0xffffffff"))) + (define_predicate "low_bitmask_operand" (and (match_code "const_int") (match_test "low_bitmask_len (mode, INTVAL (op)) > 12"))) +(define_predicate "d_operand" + (and (match_code "reg") + (match_test "GP_REG_P (REGNO (op))"))) + +(define_predicate "db4_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 4, 0)"))) + +(define_predicate "db7_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 7, 0)"))) + +(define_predicate "db8_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 8, 0)"))) + +(define_predicate "ib3_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op) - 1, 3, 0)"))) + +(define_predicate "sb4_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 4, 0)"))) + +(define_predicate "sb5_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 5, 0)"))) + +(define_predicate "sb8_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)"))) + +(define_predicate "sd8_operand" + (and (match_code "const_int") + (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)"))) + +(define_predicate "ub4_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 0)"))) + +(define_predicate "ub8_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 0)"))) + +(define_predicate "uh4_operand" + (and (match_code "const_int") + (match_test 
"loongarch_unsigned_immediate_p (INTVAL (op), 4, 1)"))) + +(define_predicate "uw4_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 2)"))) + +(define_predicate "uw5_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 5, 2)"))) + +(define_predicate "uw6_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 6, 2)"))) + +(define_predicate "uw8_operand" + (and (match_code "const_int") + (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 2)"))) + +(define_predicate "addiur2_operand" + (and (match_code "const_int") + (ior (match_test "INTVAL (op) == -1") + (match_test "INTVAL (op) == 1") + (match_test "INTVAL (op) == 4") + (match_test "INTVAL (op) == 8") + (match_test "INTVAL (op) == 12") + (match_test "INTVAL (op) == 16") + (match_test "INTVAL (op) == 20") + (match_test "INTVAL (op) == 24")))) + +(define_predicate "addiusp_operand" + (and (match_code "const_int") + (ior (match_test "(IN_RANGE (INTVAL (op), 2, 257))") + (match_test "(IN_RANGE (INTVAL (op), -258, -3))")))) + +(define_predicate "andi16_operand" + (and (match_code "const_int") + (ior (match_test "IN_RANGE (INTVAL (op), 1, 4)") + (match_test "IN_RANGE (INTVAL (op), 7, 8)") + (match_test "IN_RANGE (INTVAL (op), 15, 16)") + (match_test "IN_RANGE (INTVAL (op), 31, 32)") + (match_test "IN_RANGE (INTVAL (op), 63, 64)") + (match_test "INTVAL (op) == 255") + (match_test "INTVAL (op) == 32768") + (match_test "INTVAL (op) == 65535")))) + +(define_predicate "movep_src_register" + (and (match_code "reg") + (ior (match_test ("IN_RANGE (REGNO (op), 2, 3)")) + (match_test ("IN_RANGE (REGNO (op), 16, 20)"))))) + +(define_predicate "movep_src_operand" + (ior (match_operand 0 "const_0_operand") + (match_operand 0 "movep_src_register"))) + +(define_predicate "fcc_reload_operand" + (and (match_code "reg,subreg") + (match_test "FCC_REG_P (true_regnum (op))"))) + 
+(define_predicate "muldiv_target_operand" + (match_operand 0 "register_operand")) + (define_predicate "const_call_insn_operand" (match_code "const,symbol_ref,label_ref") { @@ -303,3 +576,59 @@ (define_predicate "small_data_pattern" (define_predicate "non_volatile_mem_operand" (and (match_operand 0 "memory_operand") (not (match_test "MEM_VOLATILE_P (op)")))) + +(define_predicate "const_vector_same_val_operand" + (match_code "const_vector") +{ + return loongarch_const_vector_same_val_p (op, mode); +}) + +(define_predicate "const_vector_same_simm5_operand" + (match_code "const_vector") +{ + return loongarch_const_vector_same_int_p (op, mode, -16, 15); +}) + +(define_predicate "const_vector_same_uimm5_operand" + (match_code "const_vector") +{ + return loongarch_const_vector_same_int_p (op, mode, 0, 31); +}) + +(define_predicate "const_vector_same_ximm5_operand" + (match_code "const_vector") +{ + return loongarch_const_vector_same_int_p (op, mode, -31, 31); +}) + +(define_predicate "const_vector_same_uimm6_operand" + (match_code "const_vector") +{ + return loongarch_const_vector_same_int_p (op, mode, 0, 63); +}) + +(define_predicate "par_const_vector_shf_set_operand" + (match_code "parallel") +{ + return loongarch_const_vector_shuffle_set_p (op, mode); +}) + +(define_predicate "reg_or_vector_same_val_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "const_vector_same_val_operand"))) + +(define_predicate "reg_or_vector_same_simm5_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "const_vector_same_simm5_operand"))) + +(define_predicate "reg_or_vector_same_uimm5_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "const_vector_same_uimm5_operand"))) + +(define_predicate "reg_or_vector_same_ximm5_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "const_vector_same_ximm5_operand"))) + +(define_predicate "reg_or_vector_same_uimm6_operand" + (ior (match_operand 0 "register_operand") + 
(match_operand 0 "const_vector_same_uimm6_operand"))) -- 2.36.0