From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, james.greenhalgh@arm.com, marcus.shawcroft@arm.com
Subject: [05/nn] [AArch64] Rewrite aarch64_simd_valid_immediate
Date: Fri, 27 Oct 2017 13:28:00 -0000
Message-ID: <87efpobu1f.fsf@linaro.org>
In-Reply-To: <873764d8y3.fsf@linaro.org>
References: <873764d8y3.fsf@linaro.org>

This patch reworks aarch64_simd_valid_immediate so that it's easier to
add SVE support. The main changes are:

- make simd_immediate_info easier to construct
- replace the while (1) { ... break; } blocks with checks that use
  the full 64-bit value of the constant
- treat floating-point modes as integers if they aren't valid as
  floating-point values


2017-10-26  Richard Sandiford
	    Alan Hayward
	    David Sherwood

gcc/
	* config/aarch64/aarch64-protos.h (aarch64_output_simd_mov_immediate):
	Remove the mode argument.
	(aarch64_simd_valid_immediate): Remove the mode and inverse
	arguments.
	* config/aarch64/iterators.md (bitsize): New iterator.
	* config/aarch64/aarch64-simd.md (*aarch64_simd_mov, and3)
	(ior3): Update calls to aarch64_output_simd_mov_immediate.
	* config/aarch64/constraints.md (Do, Db, Dn): Update calls to
	aarch64_simd_valid_immediate.
	* config/aarch64/predicates.md (aarch64_reg_or_orr_imm): Likewise.
	(aarch64_reg_or_bic_imm): Likewise.
	* config/aarch64/aarch64.c (simd_immediate_info): Replace mvn
	with an insn_type enum and msl with a modifier_type enum.
	Replace element_width with a scalar_mode. Change the shift
	to unsigned int. Add constructors for scalar_float_mode and
	scalar_int_mode elements.
	(aarch64_vect_float_const_representable_p): Delete.
	(aarch64_can_const_movi_rtx_p, aarch64_legitimate_constant_p)
	(aarch64_simd_scalar_immediate_valid_for_move)
	(aarch64_simd_make_constant): Update call to
	aarch64_simd_valid_immediate.
	(aarch64_advsimd_valid_immediate_hs): New function.
	(aarch64_advsimd_valid_immediate): Likewise.
	(aarch64_simd_valid_immediate): Remove mode and inverse
	arguments. Rewrite to use the above. Use const_vec_duplicate_p
	to detect duplicated constants and use aarch64_float_const_zero_rtx_p
	and aarch64_float_const_representable_p on the result.
	(aarch64_output_simd_mov_immediate): Remove mode argument.
	Update call to aarch64_simd_valid_immediate and use of
	simd_immediate_info.
	(aarch64_output_scalar_simd_mov_immediate): Update call
	accordingly.

gcc/testsuite/
	* gcc.target/aarch64/vect-movi.c (movi_float_lsl24): New function.
	(main): Call it.

Index: gcc/config/aarch64/aarch64-protos.h =================================================================== *** gcc/config/aarch64/aarch64-protos.h 2017-10-27 14:06:16.157803281 +0100 --- gcc/config/aarch64/aarch64-protos.h 2017-10-27 14:26:40.949165813 +0100 *************** bool aarch64_mov_operand_p (rtx, machine *** 368,374 **** rtx aarch64_reverse_mask (machine_mode); bool aarch64_offset_7bit_signed_scaled_p (machine_mode, HOST_WIDE_INT); char *aarch64_output_scalar_simd_mov_immediate (rtx, scalar_int_mode); ! char *aarch64_output_simd_mov_immediate (rtx, machine_mode, unsigned, enum simd_immediate_check w = AARCH64_CHECK_MOV); bool aarch64_pad_reg_upward (machine_mode, const_tree, bool); bool aarch64_regno_ok_for_base_p (int, bool); --- 368,374 ---- rtx aarch64_reverse_mask (machine_mode); bool aarch64_offset_7bit_signed_scaled_p (machine_mode, HOST_WIDE_INT); char *aarch64_output_scalar_simd_mov_immediate (rtx, scalar_int_mode); !
char *aarch64_output_simd_mov_immediate (rtx, unsigned, enum simd_immediate_check w = AARCH64_CHECK_MOV); bool aarch64_pad_reg_upward (machine_mode, const_tree, bool); bool aarch64_regno_ok_for_base_p (int, bool); *************** bool aarch64_simd_check_vect_par_cnst_ha *** 379,386 **** bool aarch64_simd_imm_zero_p (rtx, machine_mode); bool aarch64_simd_scalar_immediate_valid_for_move (rtx, scalar_int_mode); bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool); ! bool aarch64_simd_valid_immediate (rtx, machine_mode, bool, ! struct simd_immediate_info *, enum simd_immediate_check w = AARCH64_CHECK_MOV); bool aarch64_split_dimode_const_store (rtx, rtx); bool aarch64_symbolic_address_p (rtx); --- 379,385 ---- bool aarch64_simd_imm_zero_p (rtx, machine_mode); bool aarch64_simd_scalar_immediate_valid_for_move (rtx, scalar_int_mode); bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool); ! bool aarch64_simd_valid_immediate (rtx, struct simd_immediate_info *, enum simd_immediate_check w = AARCH64_CHECK_MOV); bool aarch64_split_dimode_const_store (rtx, rtx); bool aarch64_symbolic_address_p (rtx); Index: gcc/config/aarch64/iterators.md =================================================================== *** gcc/config/aarch64/iterators.md 2017-10-27 14:05:38.185854661 +0100 --- gcc/config/aarch64/iterators.md 2017-10-27 14:26:40.949165813 +0100 *************** (define_mode_attr vw2 [(DI "") (QI "h") *** 438,443 **** --- 438,450 ---- (define_mode_attr rtn [(DI "d") (SI "")]) (define_mode_attr vas [(DI "") (SI ".2s")]) + ;; Map a mode to the number of bits in it, if the size of the mode + ;; is constant. 
+ (define_mode_attr bitsize [(V8QI "64") (V16QI "128") + (V4HI "64") (V8HI "128") + (V2SI "64") (V4SI "128") + (V2DI "128")]) + ;; Map a floating point mode to the appropriate register name prefix (define_mode_attr s [(HF "h") (SF "s") (DF "d")]) Index: gcc/config/aarch64/aarch64-simd.md =================================================================== *** gcc/config/aarch64/aarch64-simd.md 2017-10-27 14:05:38.185854661 +0100 --- gcc/config/aarch64/aarch64-simd.md 2017-10-27 14:26:40.949165813 +0100 *************** (define_insn "*aarch64_simd_mov" *** 121,128 **** case 5: return "fmov\t%d0, %1"; case 6: return "mov\t%0, %1"; case 7: ! return aarch64_output_simd_mov_immediate (operands[1], ! mode, 64); default: gcc_unreachable (); } } --- 121,127 ---- case 5: return "fmov\t%d0, %1"; case 6: return "mov\t%0, %1"; case 7: ! return aarch64_output_simd_mov_immediate (operands[1], 64); default: gcc_unreachable (); } } *************** (define_insn "*aarch64_simd_mov" *** 155,161 **** case 6: return "#"; case 7: ! return aarch64_output_simd_mov_immediate (operands[1], mode, 128); default: gcc_unreachable (); } --- 154,160 ---- case 6: return "#"; case 7: ! return aarch64_output_simd_mov_immediate (operands[1], 128); default: gcc_unreachable (); } *************** (define_insn "and3" *** 651,658 **** case 0: return "and\t%0., %1., %2."; case 1: ! return aarch64_output_simd_mov_immediate (operands[2], ! mode, GET_MODE_BITSIZE (mode), AARCH64_CHECK_BIC); default: gcc_unreachable (); } --- 650,657 ---- case 0: return "and\t%0., %1., %2."; case 1: ! return aarch64_output_simd_mov_immediate (operands[2], , ! AARCH64_CHECK_BIC); default: gcc_unreachable (); } *************** (define_insn "ior3" *** 672,679 **** case 0: return "orr\t%0., %1., %2."; case 1: ! return aarch64_output_simd_mov_immediate (operands[2], ! mode, GET_MODE_BITSIZE (mode), AARCH64_CHECK_ORR); default: gcc_unreachable (); } --- 671,678 ---- case 0: return "orr\t%0., %1., %2."; case 1: ! 
return aarch64_output_simd_mov_immediate (operands[2], , ! AARCH64_CHECK_ORR); default: gcc_unreachable (); } Index: gcc/config/aarch64/constraints.md =================================================================== *** gcc/config/aarch64/constraints.md 2017-10-27 14:11:54.071011147 +0100 --- gcc/config/aarch64/constraints.md 2017-10-27 14:11:56.995515870 +0100 *************** (define_constraint "Do" *** 194,215 **** "@internal A constraint that matches vector of immediates for orr." (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, mode, false, ! NULL, AARCH64_CHECK_ORR)"))) (define_constraint "Db" "@internal A constraint that matches vector of immediates for bic." (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, mode, false, ! NULL, AARCH64_CHECK_BIC)"))) (define_constraint "Dn" "@internal A constraint that matches vector of immediates." (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, GET_MODE (op), ! false, NULL)"))) (define_constraint "Dh" "@internal --- 194,214 ---- "@internal A constraint that matches vector of immediates for orr." (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, NULL, ! AARCH64_CHECK_ORR)"))) (define_constraint "Db" "@internal A constraint that matches vector of immediates for bic." (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, NULL, ! AARCH64_CHECK_BIC)"))) (define_constraint "Dn" "@internal A constraint that matches vector of immediates." (and (match_code "const_vector") ! 
(match_test "aarch64_simd_valid_immediate (op, NULL)"))) (define_constraint "Dh" "@internal Index: gcc/config/aarch64/predicates.md =================================================================== *** gcc/config/aarch64/predicates.md 2017-10-27 14:06:16.159815485 +0100 --- gcc/config/aarch64/predicates.md 2017-10-27 14:11:56.995515870 +0100 *************** (define_predicate "aarch64_reg_zero_or_m *** 72,85 **** (define_predicate "aarch64_reg_or_orr_imm" (ior (match_operand 0 "register_operand") (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, mode, false, ! NULL, AARCH64_CHECK_ORR)")))) (define_predicate "aarch64_reg_or_bic_imm" (ior (match_operand 0 "register_operand") (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, mode, false, ! NULL, AARCH64_CHECK_BIC)")))) (define_predicate "aarch64_fp_compare_operand" (ior (match_operand 0 "register_operand") --- 72,85 ---- (define_predicate "aarch64_reg_or_orr_imm" (ior (match_operand 0 "register_operand") (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, NULL, ! AARCH64_CHECK_ORR)")))) (define_predicate "aarch64_reg_or_bic_imm" (ior (match_operand 0 "register_operand") (and (match_code "const_vector") ! (match_test "aarch64_simd_valid_immediate (op, NULL, ! AARCH64_CHECK_BIC)")))) (define_predicate "aarch64_fp_compare_operand" (ior (match_operand 0 "register_operand") Index: gcc/config/aarch64/aarch64.c =================================================================== *** gcc/config/aarch64/aarch64.c 2017-10-27 14:11:14.425034427 +0100 --- gcc/config/aarch64/aarch64.c 2017-10-27 14:26:40.949165813 +0100 *************** struct aarch64_address_info { *** 117,130 **** enum aarch64_symbol_type symbol_type; }; struct simd_immediate_info { rtx value; ! int shift; ! int element_width; ! bool mvn; ! bool msl; ! }; /* The current code model. 
*/ enum aarch64_code_model aarch64_cmodel; --- 117,168 ---- enum aarch64_symbol_type symbol_type; }; + /* Information about a legitimate vector immediate operand. */ struct simd_immediate_info { + enum insn_type { MOV, MVN }; + enum modifier_type { LSL, MSL }; + + simd_immediate_info () {} + simd_immediate_info (scalar_float_mode, rtx); + simd_immediate_info (scalar_int_mode, unsigned HOST_WIDE_INT, + insn_type = MOV, modifier_type = LSL, + unsigned int = 0); + + /* The mode of the elements. */ + scalar_mode elt_mode; + + /* The value of each element. */ rtx value; ! ! /* The instruction to use to move the immediate into a vector. */ ! insn_type insn; ! ! /* The kind of shift modifier to use, and the number of bits to shift. ! This is (LSL, 0) if no shift is needed. */ ! modifier_type modifier; ! unsigned int shift; ! }; ! ! /* Construct a floating-point immediate in which each element has mode ! ELT_MODE_IN and value VALUE_IN. */ ! inline simd_immediate_info ! ::simd_immediate_info (scalar_float_mode elt_mode_in, rtx value_in) ! : elt_mode (elt_mode_in), value (value_in), insn (MOV), ! modifier (LSL), shift (0) ! {} ! ! /* Construct an integer immediate in which each element has mode ELT_MODE_IN ! and value VALUE_IN. The other parameters are as for the structure ! fields. */ ! inline simd_immediate_info ! ::simd_immediate_info (scalar_int_mode elt_mode_in, ! unsigned HOST_WIDE_INT value_in, ! insn_type insn_in, modifier_type modifier_in, ! unsigned int shift_in) ! : elt_mode (elt_mode_in), value (gen_int_mode (value_in, elt_mode_in)), ! insn (insn_in), modifier (modifier_in), shift (shift_in) ! {} /* The current code model. */ enum aarch64_code_model aarch64_cmodel; *************** aarch64_can_const_movi_rtx_p (rtx x, mac *** 5083,5089 **** vmode = aarch64_simd_container_mode (imode, width); rtx v_op = aarch64_simd_gen_const_vector_dup (vmode, ival); ! 
return aarch64_simd_valid_immediate (v_op, vmode, false, NULL); } --- 5121,5127 ---- vmode = aarch64_simd_container_mode (imode, width); rtx v_op = aarch64_simd_gen_const_vector_dup (vmode, ival); ! return aarch64_simd_valid_immediate (v_op, NULL); } *************** aarch64_legitimate_constant_p (machine_m *** 10623,10629 **** As such we have to prevent the compiler from forcing these to memory. */ if ((GET_CODE (x) == CONST_VECTOR ! && aarch64_simd_valid_immediate (x, mode, false, NULL)) || CONST_INT_P (x) || aarch64_valid_floating_const (x) || aarch64_can_const_movi_rtx_p (x, mode) --- 10661,10667 ---- As such we have to prevent the compiler from forcing these to memory. */ if ((GET_CODE (x) == CONST_VECTOR ! && aarch64_simd_valid_immediate (x, NULL)) || CONST_INT_P (x) || aarch64_valid_floating_const (x) || aarch64_can_const_movi_rtx_p (x, mode) *************** sizetochar (int size) *** 11698,11897 **** } } ! /* Return true iff x is a uniform vector of floating-point ! constants, and the constant can be represented in ! quarter-precision form. Note, as aarch64_float_const_representable ! rejects both +0.0 and -0.0, we will also reject +0.0 and -0.0. */ ! static bool ! aarch64_vect_float_const_representable_p (rtx x) ! { ! rtx elt; ! return (GET_MODE_CLASS (GET_MODE (x)) == MODE_VECTOR_FLOAT ! && const_vec_duplicate_p (x, &elt) ! && aarch64_float_const_representable_p (elt)); ! } ! ! /* Return true for valid and false for invalid. */ ! bool ! aarch64_simd_valid_immediate (rtx op, machine_mode mode, bool inverse, ! struct simd_immediate_info *info, ! enum simd_immediate_check which) ! { ! #define CHECK(STRIDE, ELSIZE, CLASS, TEST, SHIFT, NEG) \ ! matches = 1; \ ! for (i = 0; i < idx; i += (STRIDE)) \ ! if (!(TEST)) \ ! matches = 0; \ ! if (matches) \ ! { \ ! immtype = (CLASS); \ ! elsize = (ELSIZE); \ ! eshift = (SHIFT); \ ! emvn = (NEG); \ ! break; \ ! } ! ! unsigned int i, elsize = 0, idx = 0, n_elts = CONST_VECTOR_NUNITS (op); ! 
unsigned int innersize = GET_MODE_UNIT_SIZE (mode); ! unsigned char bytes[16]; ! int immtype = -1, matches; ! unsigned int invmask = inverse ? 0xff : 0; ! int eshift, emvn; ! ! if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT) ! { ! if (! (aarch64_simd_imm_zero_p (op, mode) ! || aarch64_vect_float_const_representable_p (op))) ! return false; ! if (info) ! { ! rtx elt = CONST_VECTOR_ELT (op, 0); ! scalar_float_mode elt_mode ! = as_a (GET_MODE (elt)); ! ! info->value = elt; ! info->element_width = GET_MODE_BITSIZE (elt_mode); ! info->mvn = false; ! info->shift = 0; } ! return true; ! } ! /* Splat vector constant out into a byte vector. */ ! for (i = 0; i < n_elts; i++) ! { ! /* The vector is provided in gcc endian-neutral fashion. For aarch64_be, ! it must be laid out in the vector register in reverse order. */ ! rtx el = CONST_VECTOR_ELT (op, BYTES_BIG_ENDIAN ? (n_elts - 1 - i) : i); ! unsigned HOST_WIDE_INT elpart; ! gcc_assert (CONST_INT_P (el)); ! elpart = INTVAL (el); ! for (unsigned int byte = 0; byte < innersize; byte++) { ! bytes[idx++] = (elpart & 0xff) ^ invmask; ! elpart >>= BITS_PER_UNIT; } - } ! /* Sanity check. */ ! gcc_assert (idx == GET_MODE_SIZE (mode)); ! ! do { ! if (which & AARCH64_CHECK_ORR) { ! CHECK (4, 32, 0, bytes[i] == bytes[0] && bytes[i + 1] == 0 ! && bytes[i + 2] == 0 && bytes[i + 3] == 0, 0, 0); ! ! CHECK (4, 32, 1, bytes[i] == 0 && bytes[i + 1] == bytes[1] ! && bytes[i + 2] == 0 && bytes[i + 3] == 0, 8, 0); ! ! CHECK (4, 32, 2, bytes[i] == 0 && bytes[i + 1] == 0 ! && bytes[i + 2] == bytes[2] && bytes[i + 3] == 0, 16, 0); ! ! CHECK (4, 32, 3, bytes[i] == 0 && bytes[i + 1] == 0 ! && bytes[i + 2] == 0 && bytes[i + 3] == bytes[3], 24, 0); ! ! CHECK (2, 16, 4, bytes[i] == bytes[0] && bytes[i + 1] == 0, 0, 0); ! ! CHECK (2, 16, 5, bytes[i] == 0 && bytes[i + 1] == bytes[1], 8, 0); ! } ! ! if (which & AARCH64_CHECK_BIC) ! { ! CHECK (4, 32, 6, bytes[i] == bytes[0] && bytes[i + 1] == 0xff ! && bytes[i + 2] == 0xff && bytes[i + 3] == 0xff, 0, 1); ! 
! CHECK (4, 32, 7, bytes[i] == 0xff && bytes[i + 1] == bytes[1] ! && bytes[i + 2] == 0xff && bytes[i + 3] == 0xff, 8, 1); ! ! CHECK (4, 32, 8, bytes[i] == 0xff && bytes[i + 1] == 0xff ! && bytes[i + 2] == bytes[2] && bytes[i + 3] == 0xff, 16, 1); ! ! CHECK (4, 32, 9, bytes[i] == 0xff && bytes[i + 1] == 0xff ! && bytes[i + 2] == 0xff && bytes[i + 3] == bytes[3], 24, 1); ! ! CHECK (2, 16, 10, bytes[i] == bytes[0] && bytes[i + 1] == 0xff, 0, 1); ! ! CHECK (2, 16, 11, bytes[i] == 0xff && bytes[i + 1] == bytes[1], 8, 1); } ! ! /* Shifting ones / 8-bit / 64-bit variants only checked ! for 'ALL' (MOVI/MVNI). */ ! if (which == AARCH64_CHECK_MOV) { ! CHECK (4, 32, 12, bytes[i] == 0xff && bytes[i + 1] == bytes[1] ! && bytes[i + 2] == 0 && bytes[i + 3] == 0, 8, 0); ! ! CHECK (4, 32, 13, bytes[i] == 0 && bytes[i + 1] == bytes[1] ! && bytes[i + 2] == 0xff && bytes[i + 3] == 0xff, 8, 1); ! ! CHECK (4, 32, 14, bytes[i] == 0xff && bytes[i + 1] == 0xff ! && bytes[i + 2] == bytes[2] && bytes[i + 3] == 0, 16, 0); ! ! CHECK (4, 32, 15, bytes[i] == 0 && bytes[i + 1] == 0 ! && bytes[i + 2] == bytes[2] && bytes[i + 3] == 0xff, 16, 1); ! ! CHECK (1, 8, 16, bytes[i] == bytes[0], 0, 0); ! ! CHECK (1, 64, 17, (bytes[i] == 0 || bytes[i] == 0xff) ! && bytes[i] == bytes[(i + 8) % idx], 0, 0); } } ! while (0); ! if (immtype == -1) return false; ! if (info) { ! info->element_width = elsize; ! info->mvn = emvn != 0; ! info->shift = eshift; ! ! unsigned HOST_WIDE_INT imm = 0; ! if (immtype >= 12 && immtype <= 15) ! info->msl = true; ! /* Un-invert bytes of recognized vector, if necessary. */ ! if (invmask != 0) ! for (i = 0; i < idx; i++) ! bytes[i] ^= invmask; ! if (immtype == 17) ! { ! /* FIXME: Broken on 32-bit H_W_I hosts. */ ! gcc_assert (sizeof (HOST_WIDE_INT) == 8); ! for (i = 0; i < 8; i++) ! imm |= (unsigned HOST_WIDE_INT) (bytes[i] ? 0xff : 0) ! << (i * BITS_PER_UNIT); ! info->value = GEN_INT (imm); ! } ! else { ! for (i = 0; i < elsize / BITS_PER_UNIT; i++) ! 
imm |= (unsigned HOST_WIDE_INT) bytes[i] << (i * BITS_PER_UNIT); ! ! /* Construct 'abcdefgh' because the assembler cannot handle ! generic constants. */ ! if (info->mvn) ! imm = ~imm; ! imm = (imm >> info->shift) & 0xff; ! info->value = GEN_INT (imm); } } ! return true; ! #undef CHECK } /* Check of immediate shift constants are within range. */ --- 11736,11920 ---- } } ! /* Return true if replicating VAL32 is a valid 2-byte or 4-byte immediate ! for the Advanced SIMD operation described by WHICH and INSN. If INFO ! is nonnull, use it to describe valid immediates. */ ! static bool ! aarch64_advsimd_valid_immediate_hs (unsigned int val32, ! simd_immediate_info *info, ! enum simd_immediate_check which, ! simd_immediate_info::insn_type insn) ! { ! /* Try a 4-byte immediate with LSL. */ ! for (unsigned int shift = 0; shift < 32; shift += 8) ! if ((val32 & (0xff << shift)) == val32) ! { ! if (info) ! *info = simd_immediate_info (SImode, val32 >> shift, insn, ! simd_immediate_info::LSL, shift); ! return true; ! } ! /* Try a 2-byte immediate with LSL. */ ! unsigned int imm16 = val32 & 0xffff; ! if (imm16 == (val32 >> 16)) ! for (unsigned int shift = 0; shift < 16; shift += 8) ! if ((imm16 & (0xff << shift)) == imm16) ! { ! if (info) ! *info = simd_immediate_info (HImode, imm16 >> shift, insn, ! simd_immediate_info::LSL, shift); ! return true; } ! /* Try a 4-byte immediate with MSL, except for cases that MVN ! can handle. */ ! if (which == AARCH64_CHECK_MOV) ! for (unsigned int shift = 8; shift < 24; shift += 8) ! { ! unsigned int low = (1 << shift) - 1; ! if (((val32 & (0xff << shift)) | low) == val32) ! { ! if (info) ! *info = simd_immediate_info (SImode, val32 >> shift, insn, ! simd_immediate_info::MSL, shift); ! return true; ! } ! } ! return false; ! } ! /* Return true if replicating VAL64 is a valid immediate for the ! Advanced SIMD operation described by WHICH. If INFO is nonnull, ! use it to describe valid immediates. */ ! static bool ! 
aarch64_advsimd_valid_immediate (unsigned HOST_WIDE_INT val64, ! simd_immediate_info *info, ! enum simd_immediate_check which) ! { ! unsigned int val32 = val64 & 0xffffffff; ! unsigned int val16 = val64 & 0xffff; ! unsigned int val8 = val64 & 0xff; ! ! if (val32 == (val64 >> 32)) ! { ! if ((which & AARCH64_CHECK_ORR) != 0 ! && aarch64_advsimd_valid_immediate_hs (val32, info, which, ! simd_immediate_info::MOV)) ! return true; ! ! if ((which & AARCH64_CHECK_BIC) != 0 ! && aarch64_advsimd_valid_immediate_hs (~val32, info, which, ! simd_immediate_info::MVN)) ! return true; ! /* Try using a replicated byte. */ ! if (which == AARCH64_CHECK_MOV ! && val16 == (val32 >> 16) ! && val8 == (val16 >> 8)) { ! if (info) ! *info = simd_immediate_info (QImode, val8); ! return true; } } ! /* Try using a bit-to-bytemask. */ ! if (which == AARCH64_CHECK_MOV) { ! unsigned int i; ! for (i = 0; i < 64; i += 8) { ! unsigned char byte = (val64 >> i) & 0xff; ! if (byte != 0 && byte != 0xff) ! break; } ! if (i == 64) { ! if (info) ! *info = simd_immediate_info (DImode, val64); ! return true; } } ! return false; ! } ! /* Return true if OP is a valid SIMD immediate for the operation ! described by WHICH. If INFO is nonnull, use it to describe valid ! immediates. */ ! bool ! aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info, ! enum simd_immediate_check which) ! { ! rtx elt = NULL; ! unsigned int n_elts; ! if (const_vec_duplicate_p (op, &elt)) ! n_elts = 1; ! else if (GET_CODE (op) == CONST_VECTOR) ! n_elts = CONST_VECTOR_NUNITS (op); ! else return false; ! machine_mode mode = GET_MODE (op); ! scalar_mode elt_mode = GET_MODE_INNER (mode); ! scalar_float_mode elt_float_mode; ! if (elt ! && is_a (elt_mode, &elt_float_mode) ! && (aarch64_float_const_zero_rtx_p (elt) ! || aarch64_float_const_representable_p (elt))) { ! if (info) ! *info = simd_immediate_info (elt_float_mode, elt); ! return true; ! } ! unsigned int elt_size = GET_MODE_SIZE (elt_mode); ! if (elt_size > 8) ! 
return false; ! scalar_int_mode elt_int_mode = int_mode_for_mode (elt_mode).require (); ! /* Expand the vector constant out into a byte vector, with the least ! significant byte of the register first. */ ! auto_vec bytes; ! bytes.reserve (n_elts * elt_size); ! for (unsigned int i = 0; i < n_elts; i++) ! { ! if (!elt || n_elts != 1) ! /* The vector is provided in gcc endian-neutral fashion. ! For aarch64_be, it must be laid out in the vector register ! in reverse order. */ ! elt = CONST_VECTOR_ELT (op, BYTES_BIG_ENDIAN ? (n_elts - 1 - i) : i); ! if (elt_mode != elt_int_mode) ! elt = gen_lowpart (elt_int_mode, elt); + if (!CONST_INT_P (elt)) + return false; ! unsigned HOST_WIDE_INT elt_val = INTVAL (elt); ! for (unsigned int byte = 0; byte < elt_size; byte++) { ! bytes.quick_push (elt_val & 0xff); ! elt_val >>= BITS_PER_UNIT; } } ! /* The immediate must repeat every eight bytes. */ ! unsigned int nbytes = bytes.length (); ! for (unsigned i = 8; i < nbytes; ++i) ! if (bytes[i] != bytes[i - 8]) ! return false; ! ! /* Get the repeating 8-byte value as an integer. No endian correction ! is needed here because bytes is already in lsb-first order. */ ! unsigned HOST_WIDE_INT val64 = 0; ! for (unsigned int i = 0; i < 8; i++) ! val64 |= ((unsigned HOST_WIDE_INT) bytes[i % nbytes] ! << (i * BITS_PER_UNIT)); ! ! return aarch64_advsimd_valid_immediate (val64, info, which); } /* Check of immediate shift constants are within range. */ *************** aarch64_simd_scalar_immediate_valid_for_ *** 11963,11969 **** vmode = aarch64_preferred_simd_mode (mode); rtx op_v = aarch64_simd_gen_const_vector_dup (vmode, INTVAL (op)); ! return aarch64_simd_valid_immediate (op_v, vmode, false, NULL); } /* Construct and return a PARALLEL RTX vector with elements numbering the --- 11986,11992 ---- vmode = aarch64_preferred_simd_mode (mode); rtx op_v = aarch64_simd_gen_const_vector_dup (vmode, INTVAL (op)); ! 
return aarch64_simd_valid_immediate (op_v, NULL); } /* Construct and return a PARALLEL RTX vector with elements numbering the *************** aarch64_simd_make_constant (rtx vals) *** 12201,12207 **** gcc_unreachable (); if (const_vec != NULL_RTX ! && aarch64_simd_valid_immediate (const_vec, mode, false, NULL)) /* Load using MOVI/MVNI. */ return const_vec; else if ((const_dup = aarch64_simd_dup_constant (vals)) != NULL_RTX) --- 12224,12230 ---- gcc_unreachable (); if (const_vec != NULL_RTX ! && aarch64_simd_valid_immediate (const_vec, NULL)) /* Load using MOVI/MVNI. */ return const_vec; else if ((const_dup = aarch64_simd_dup_constant (vals)) != NULL_RTX) *************** aarch64_float_const_representable_p (rtx *** 13239,13247 **** immediate with a CONST_VECTOR of MODE and WIDTH. WHICH selects whether to output MOVI/MVNI, ORR or BIC immediate. */ char* ! aarch64_output_simd_mov_immediate (rtx const_vector, ! machine_mode mode, ! unsigned width, enum simd_immediate_check which) { bool is_valid; --- 13262,13268 ---- immediate with a CONST_VECTOR of MODE and WIDTH. WHICH selects whether to output MOVI/MVNI, ORR or BIC immediate. */ char* ! aarch64_output_simd_mov_immediate (rtx const_vector, unsigned width, enum simd_immediate_check which) { bool is_valid; *************** aarch64_output_simd_mov_immediate (rtx c *** 13251,13273 **** unsigned int lane_count = 0; char element_char; ! struct simd_immediate_info info = { NULL_RTX, 0, 0, false, false }; /* This will return true to show const_vector is legal for use as either a AdvSIMD MOVI instruction (or, implicitly, MVNI), ORR or BIC immediate. It will also update INFO to show how the immediate should be generated. WHICH selects whether to check for MOVI/MVNI, ORR or BIC. */ ! is_valid = aarch64_simd_valid_immediate (const_vector, mode, false, ! &info, which); gcc_assert (is_valid); ! element_char = sizetochar (info.element_width); ! lane_count = width / info.element_width; ! mode = GET_MODE_INNER (mode); ! 
if (GET_MODE_CLASS (mode) == MODE_FLOAT) { ! gcc_assert (info.shift == 0 && ! info.mvn); /* For FP zero change it to a CONST_INT 0 and use the integer SIMD move immediate path. */ if (aarch64_float_const_zero_rtx_p (info.value)) --- 13272,13292 ---- unsigned int lane_count = 0; char element_char; ! struct simd_immediate_info info; /* This will return true to show const_vector is legal for use as either a AdvSIMD MOVI instruction (or, implicitly, MVNI), ORR or BIC immediate. It will also update INFO to show how the immediate should be generated. WHICH selects whether to check for MOVI/MVNI, ORR or BIC. */ ! is_valid = aarch64_simd_valid_immediate (const_vector, &info, which); gcc_assert (is_valid); ! element_char = sizetochar (GET_MODE_BITSIZE (info.elt_mode)); ! lane_count = width / GET_MODE_BITSIZE (info.elt_mode); ! if (GET_MODE_CLASS (info.elt_mode) == MODE_FLOAT) { ! gcc_assert (info.shift == 0 && info.insn == simd_immediate_info::MOV); /* For FP zero change it to a CONST_INT 0 and use the integer SIMD move immediate path. */ if (aarch64_float_const_zero_rtx_p (info.value)) *************** aarch64_output_simd_mov_immediate (rtx c *** 13278,13284 **** char float_buf[buf_size] = {'\0'}; real_to_decimal_for_mode (float_buf, CONST_DOUBLE_REAL_VALUE (info.value), ! buf_size, buf_size, 1, mode); if (lane_count == 1) snprintf (templ, sizeof (templ), "fmov\t%%d0, %s", float_buf); --- 13297,13303 ---- char float_buf[buf_size] = {'\0'}; real_to_decimal_for_mode (float_buf, CONST_DOUBLE_REAL_VALUE (info.value), ! buf_size, buf_size, 1, info.elt_mode); if (lane_count == 1) snprintf (templ, sizeof (templ), "fmov\t%%d0, %s", float_buf); *************** aarch64_output_simd_mov_immediate (rtx c *** 13293,13300 **** if (which == AARCH64_CHECK_MOV) { ! mnemonic = info.mvn ? "mvni" : "movi"; ! shift_op = info.msl ? 
"msl" : "lsl"; if (lane_count == 1) snprintf (templ, sizeof (templ), "%s\t%%d0, " HOST_WIDE_INT_PRINT_HEX, mnemonic, UINTVAL (info.value)); --- 13312,13319 ---- if (which == AARCH64_CHECK_MOV) { ! mnemonic = info.insn == simd_immediate_info::MVN ? "mvni" : "movi"; ! shift_op = info.modifier == simd_immediate_info::MSL ? "msl" : "lsl"; if (lane_count == 1) snprintf (templ, sizeof (templ), "%s\t%%d0, " HOST_WIDE_INT_PRINT_HEX, mnemonic, UINTVAL (info.value)); *************** aarch64_output_simd_mov_immediate (rtx c *** 13310,13316 **** else { /* For AARCH64_CHECK_BIC and AARCH64_CHECK_ORR. */ ! mnemonic = info.mvn ? "bic" : "orr"; if (info.shift) snprintf (templ, sizeof (templ), "%s\t%%0.%d%c, #" HOST_WIDE_INT_PRINT_DEC ", %s #%d", mnemonic, lane_count, --- 13329,13335 ---- else { /* For AARCH64_CHECK_BIC and AARCH64_CHECK_ORR. */ ! mnemonic = info.insn == simd_immediate_info::MVN ? "bic" : "orr"; if (info.shift) snprintf (templ, sizeof (templ), "%s\t%%0.%d%c, #" HOST_WIDE_INT_PRINT_DEC ", %s #%d", mnemonic, lane_count, *************** aarch64_output_scalar_simd_mov_immediate *** 13344,13350 **** vmode = aarch64_simd_container_mode (mode, width); rtx v_op = aarch64_simd_gen_const_vector_dup (vmode, INTVAL (immediate)); ! return aarch64_output_simd_mov_immediate (v_op, vmode, width); } /* Split operands into moves from op[1] + op[2] into op[0]. */ --- 13363,13369 ---- vmode = aarch64_simd_container_mode (mode, width); rtx v_op = aarch64_simd_gen_const_vector_dup (vmode, INTVAL (immediate)); ! return aarch64_output_simd_mov_immediate (v_op, width); } /* Split operands into moves from op[1] + op[2] into op[0]. 
*/ Index: gcc/testsuite/gcc.target/aarch64/vect-movi.c =================================================================== *** gcc/testsuite/gcc.target/aarch64/vect-movi.c 2017-10-27 14:05:38.185854661 +0100 --- gcc/testsuite/gcc.target/aarch64/vect-movi.c 2017-10-27 14:11:56.995515870 +0100 *************** mvni_msl16 (int *__restrict a) *** 45,54 **** --- 45,65 ---- a[i] = 0xff540000; } + static void + movi_float_lsl24 (float * a) + { + int i; + + /* { dg-final { scan-assembler {\tmovi\tv[0-9]+\.[42]s, 0x43, lsl 24\n} } } */ + for (i = 0; i < N; i++) + a[i] = 128.0; + } + int main (void) { int a[N] = { 0 }; + float b[N] = { 0 }; int i; #define CHECK_ARRAY(a, val) \ *************** #define CHECK_ARRAY(a, val) \ *** 68,73 **** --- 79,87 ---- mvni_msl16 (a); CHECK_ARRAY (a, 0xff540000); + movi_float_lsl24 (b); + CHECK_ARRAY (b, 128.0); + return 0; }
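
For anyone who wants to experiment with the new 2-byte/4-byte check outside
GCC, the logic of aarch64_advsimd_valid_immediate_hs can be sketched as
standalone C++. This is only an illustration under stated assumptions: the
Imm struct, the Modifier enum, classify_hs and its allow_msl flag are
hypothetical stand-ins invented here (allow_msl plays the role of
"which == AARCH64_CHECK_MOV", and Imm loosely mirrors the integer fields of
simd_immediate_info). None of these names are part of the patch.

```cpp
#include <cstdint>

// Hypothetical stand-ins for parts of simd_immediate_info; not GCC types.
enum class Modifier { LSL, MSL };

struct Imm {
  unsigned elt_bits;   // element width in bits: 16 or 32
  uint32_t imm8;       // the 8-bit payload encoded in the instruction
  Modifier modifier;   // LSL (shift in zeros) or MSL (shift in ones)
  unsigned shift;      // shift amount in bits
};

// Return true if replicating VAL32 fits one of the 2-byte or 4-byte
// Advanced SIMD immediate forms.  If INFO is nonnull, describe the form.
// ALLOW_MSL is an assumption of this sketch: it enables the "shifting
// ones" forms, which the patch only tries for AARCH64_CHECK_MOV.
bool classify_hs(uint32_t val32, bool allow_msl, Imm *info) {
  // 4-byte element with LSL: exactly one byte position may be nonzero.
  for (unsigned shift = 0; shift < 32; shift += 8)
    if ((val32 & (UINT32_C(0xff) << shift)) == val32) {
      if (info)
        *info = Imm{32, val32 >> shift, Modifier::LSL, shift};
      return true;
    }

  // 2-byte element with LSL: both halves equal, one nonzero byte each.
  uint32_t imm16 = val32 & 0xffff;
  if (imm16 == (val32 >> 16))
    for (unsigned shift = 0; shift < 16; shift += 8)
      if ((imm16 & (UINT32_C(0xff) << shift)) == imm16) {
        if (info)
          *info = Imm{16, imm16 >> shift, Modifier::LSL, shift};
        return true;
      }

  // 4-byte element with MSL: payload byte followed by all-ones low bits.
  if (allow_msl)
    for (unsigned shift = 8; shift < 24; shift += 8) {
      uint32_t low = (UINT32_C(1) << shift) - 1;
      if (((val32 & (UINT32_C(0xff) << shift)) | low) == val32) {
        if (info)
          *info = Imm{32, val32 >> shift, Modifier::MSL, shift};
        return true;
      }
    }

  return false;
}
```

With this sketch, the bit pattern of 128.0f (0x43000000) classifies as a
32-bit element with payload 0x43 and LSL 24, which lines up with the
"movi ... 0x43, lsl 24" pattern that movi_float_lsl24 scans for. For the
MVN case the caller passes the inverted value, as the patch does with
~val32: 0xff540000 inverts to 0x00abffff, which classifies as payload 0xab
with MSL 16.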