From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx07-00178001.pphosted.com (mx07-00178001.pphosted.com [185.132.182.106]) by sourceware.org (Postfix) with ESMTPS id 5D6AC385F01C for ; Fri, 16 Jul 2021 14:32:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5D6AC385F01C Received: from pps.filterd (m0241204.ppops.net [127.0.0.1]) by mx07-00178001.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 16GEQVY8002240; Fri, 16 Jul 2021 16:32:34 +0200 Received: from beta.dmz-eu.st.com (beta.dmz-eu.st.com [164.129.1.35]) by mx07-00178001.pphosted.com with ESMTP id 39tw1gvbh6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 16 Jul 2021 16:32:34 +0200 Received: from euls16034.sgp.st.com (euls16034.sgp.st.com [10.75.44.20]) by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 95CFE10002A; Fri, 16 Jul 2021 16:32:33 +0200 (CEST) Received: from Webmail-eu.st.com (sfhdag2node3.st.com [10.75.127.6]) by euls16034.sgp.st.com (STMicroelectronics) with ESMTP id 6A8B722F7D1; Fri, 16 Jul 2021 16:32:33 +0200 (CEST) Received: from [10.131.138.188] (10.75.127.44) by SFHDAG2NODE3.st.com (10.75.127.6) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 16 Jul 2021 16:32:32 +0200 Subject: Re: [PATCH 1/2] arm: Fix vcond_mask expander for MVE (PR target/100757) To: gcc Patches , References: <20210608213844.6218-1-christophe.lyon@linaro.org> From: Christophe LYON Message-ID: Date: Fri, 16 Jul 2021 16:32:32 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-Originating-IP: [10.75.127.44] X-ClientProxiedBy: SFHDAG1NODE3.st.com (10.75.127.3) To SFHDAG2NODE3.st.com (10.75.127.6) X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391, 18.0.790 definitions=2021-07-16_05:2021-07-16, 2021-07-16 signatures=0 X-Spam-Status: No, score=-8.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Jul 2021 14:32:39 -0000 On 16/07/2021 16:06, Richard Sandiford via Gcc-patches wrote: > Hi, > > Sorry for the slow review. I'd initially held off from reviewing > because it sounded like you were trying to treat predicates as > MODE_VECTOR_BOOL instead. Is that right? If so, how did that go? Yes, that's part of PR 101325. It's still WIP as I wrote in the PR, I'm not sure I got it right yet. At the moment it seems it would imply a lot of changes, I'll have to look at AArch64's implementation in more details. I hoped this fix could be merged before switching to MODE_VECTOR_BOOL. > It does feel like the right long-term direction. Treating masks as > integers for AVX seems to make some things more difficult than they > should be. Also, RTL like: OK, I see, good to know. > >> +(define_expand "vec_cmphi" >> + [(set (match_operand:HI 0 "s_register_operand") >> + (match_operator:HI 1 "comparison_operator" >> + [(match_operand:MVE_VLD_ST 2 "s_register_operand") >> + (match_operand:MVE_VLD_ST 3 "reg_or_zero_operand")]))] >> + "TARGET_HAVE_MVE >> + && (! || flag_unsafe_math_optimizations)" >> +{ >> + arm_expand_vector_compare (operands[0], GET_CODE (operands[1]), >> + operands[2], operands[3], false, false); >> + DONE; >> +}) > seems kind-of suspect, since (as I think you said in a PR), > (eq:HI X Y) would normally be a single boolean. Having MODE_VECTOR_BOOL > means that we can represent the comparisons “properly”, even in > define_insns. > > But since this usage is confined to define_expands, I guess it > doesn't matter much for the purposes of this patch. Any switch > to MODE_VECTOR_BOOL would leave most of the patch in tact (contrary > to what I'd initially assumed). > >> @@ -31061,13 +31065,7 @@ arm_expand_vector_compare (rtx target, rtx_code code, rtx op0, rtx op1, >> >> /* If we are not expanding a vcond, build the result here. */ >> if (!vcond_mve) >> - { >> - rtx zero = gen_reg_rtx (cmp_result_mode); >> - rtx one = gen_reg_rtx (cmp_result_mode); >> - emit_move_insn (zero, CONST0_RTX (cmp_result_mode)); >> - emit_move_insn (one, CONST1_RTX (cmp_result_mode)); >> - emit_insn (gen_mve_vpselq (VPSELQ_S, cmp_result_mode, target, one, zero, vpr_p0)); >> - } >> + emit_move_insn (target, vpr_p0); > The code above this is: > > if (vcond_mve) > vpr_p0 = target; > else > vpr_p0 = gen_reg_rtx (HImode); > > so couldn't we simply use vpr_p0 = target unconditionally (or drop > vpr_p0 altogether)? Same for the other cases. Probably, I'll check that. >> @@ -31178,20 +31164,21 @@ arm_expand_vcond (rtx *operands, machine_mode cmp_result_mode) >> mask, operands[1], operands[2])); >> else >> { >> - machine_mode cmp_mode = GET_MODE (operands[4]); >> + machine_mode cmp_mode = GET_MODE (operands[0]); >> rtx vpr_p0 = mask; >> - rtx zero = gen_reg_rtx (cmp_mode); >> - rtx one = gen_reg_rtx (cmp_mode); >> - emit_move_insn (zero, CONST0_RTX (cmp_mode)); >> - emit_move_insn (one, CONST1_RTX (cmp_mode)); >> + >> switch (GET_MODE_CLASS (cmp_mode)) >> { >> case MODE_VECTOR_INT: >> - emit_insn (gen_mve_vpselq (VPSELQ_S, cmp_result_mode, operands[0], one, zero, vpr_p0)); >> + emit_insn (gen_mve_vpselq (VPSELQ_S, cmp_mode, operands[0], >> + operands[1], operands[2], vpr_p0)); >> break; >> case MODE_VECTOR_FLOAT: >> if (TARGET_HAVE_MVE_FLOAT) >> - emit_insn (gen_mve_vpselq_f (cmp_mode, operands[0], one, zero, vpr_p0)); >> + emit_insn (gen_mve_vpselq_f (cmp_mode, operands[0], >> + operands[1], operands[2], vpr_p0)); >> + else >> + gcc_unreachable (); >> break; >> default: >> gcc_unreachable (); > Here too vpr_p0 feels a bit redundant now. Indeed, but it seemed clearer to me :-) >> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md >> index e393518ea88..a9840408bdd 100644 >> --- a/gcc/config/arm/mve.md >> +++ b/gcc/config/arm/mve.md >> @@ -10516,3 +10516,58 @@ (define_insn "*movmisalign_mve_load" >> "vldr.\t%q0, %E1" >> [(set_attr "type" "mve_load")] >> ) >> + >> +;; Expanders for vec_cmp and vcond >> + >> +(define_expand "vec_cmphi" >> + [(set (match_operand:HI 0 "s_register_operand") >> + (match_operator:HI 1 "comparison_operator" >> + [(match_operand:MVE_VLD_ST 2 "s_register_operand") >> + (match_operand:MVE_VLD_ST 3 "reg_or_zero_operand")]))] >> + "TARGET_HAVE_MVE >> + && (! || flag_unsafe_math_optimizations)" > Is flag_unsafe_math_optimizations needed for MVE? For Neon we had > it because of flush to zero (at least, I think that was the only reason). Right, I inherited this from the vec_cmp in neon.md. I do not have the ARM ARM at hand right now to check. However, your question makes me wonder about the other vec_cmp pattern in vec-common.md, which is common to Neon and MVE. It might need to be adjusted too. >> +{ >> + arm_expand_vector_compare (operands[0], GET_CODE (operands[1]), >> + operands[2], operands[3], false, false); >> + DONE; >> +}) > The (snipped) tests look good, but if we do support > !flag_unsafe_math_optimizations, it would be good to have some tests > for unordered comparisons too. E.g.: > > #define ordered(A, B) (!__builtin_isunordered (A, B)) > #define unordered(A, B) (__builtin_isunordered (A, B)) > #define ueq(A, B) (!__builtin_islessgreater (A, B)) > #define ult(A, B) (__builtin_isless (A, B)) > #define ule(A, B) (__builtin_islessequal (A, B)) > #define uge(A, B) (__builtin_isgreaterequal (A, B)) > #define ugt(A, B) (__builtin_isgreater (A, B)) > #define nueq(A, B) (__builtin_islessgreater (A, B)) > #define nult(A, B) (!__builtin_isless (A, B)) > #define nule(A, B) (!__builtin_islessequal (A, B)) > #define nuge(A, B) (!__builtin_isgreaterequal (A, B)) > #define nugt(A, B) (!__builtin_isgreater (A, B)) > > The main thing is just to check that they don't ICE. It might be > difficult to check the assembly for correctness. Thanks I'll check that in August, when I'm back from holidays. Christophe > > Thanks, > Richard