From: Segher Boessenkool
To: "Kewen.Lin"
Cc: wschmidt@linux.ibm.com, David Edelsohn, GCC Patches
Subject: Re: [PATCH v2] rs6000: Add vec_unpacku_{hi,lo}_v4si
Date: Tue, 24 Aug 2021 08:02:06 -0500
Message-ID: <20210824130206.GE1583@gate.crashing.org>
References: <0068d8dd-0e30-e78f-8893-dd24f0f5250a@linux.ibm.com> <48675437-01a4-af04-3cb0-8f68c675b9eb@linux.ibm.com>

Hi Ke Wen,

On Mon, Aug 09, 2021 at 10:53:00AM +0800, Kewen.Lin wrote:
> on 2021/8/6 9:10 PM, Bill Schmidt wrote:
> > On 8/4/21 9:06 PM, Kewen.Lin wrote:
> >> The existing vec_unpacku_{hi,lo} supports emulated unsigned
> >> unpacking for short and char but misses the support for int.
> >> This patch adds the support for vec_unpacku_{hi,lo}_v4si.

> 	* config/rs6000/altivec.md (vec_unpacku_hi_v16qi): Remove.
> 	(vec_unpacku_hi_v8hi): Likewise.
> 	(vec_unpacku_lo_v16qi): Likewise.
> 	(vec_unpacku_lo_v8hi): Likewise.
> 	(vec_unpacku_hi_): New define_expand.
> 	(vec_unpacku_lo_): Likewise.

> -(define_expand "vec_unpacku_hi_v16qi"
> -  [(set (match_operand:V8HI 0 "register_operand" "=v")
> -        (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v")]
> -                     UNSPEC_VUPKHUB))]
> -  "TARGET_ALTIVEC"
> -{
> -  rtx vzero = gen_reg_rtx (V8HImode);
> -  rtx mask = gen_reg_rtx (V16QImode);
> -  rtvec v = rtvec_alloc (16);
> -  bool be = BYTES_BIG_ENDIAN;
> -
> -  emit_insn (gen_altivec_vspltish (vzero, const0_rtx));
> -
> -  RTVEC_ELT (v,  0) = gen_rtx_CONST_INT (QImode, be ? 16 :  7);
> -  RTVEC_ELT (v,  1) = gen_rtx_CONST_INT (QImode, be ?  0 : 16);
> -  RTVEC_ELT (v,  2) = gen_rtx_CONST_INT (QImode, be ? 16 :  6);
> -  RTVEC_ELT (v,  3) = gen_rtx_CONST_INT (QImode, be ?  1 : 16);
> -  RTVEC_ELT (v,  4) = gen_rtx_CONST_INT (QImode, be ? 16 :  5);
> -  RTVEC_ELT (v,  5) = gen_rtx_CONST_INT (QImode, be ?  2 : 16);
> -  RTVEC_ELT (v,  6) = gen_rtx_CONST_INT (QImode, be ? 16 :  4);
> -  RTVEC_ELT (v,  7) = gen_rtx_CONST_INT (QImode, be ?  3 : 16);
> -  RTVEC_ELT (v,  8) = gen_rtx_CONST_INT (QImode, be ? 16 :  3);
> -  RTVEC_ELT (v,  9) = gen_rtx_CONST_INT (QImode, be ?  4 : 16);
> -  RTVEC_ELT (v, 10) = gen_rtx_CONST_INT (QImode, be ? 16 :  2);
> -  RTVEC_ELT (v, 11) = gen_rtx_CONST_INT (QImode, be ?  5 : 16);
> -  RTVEC_ELT (v, 12) = gen_rtx_CONST_INT (QImode, be ? 16 :  1);
> -  RTVEC_ELT (v, 13) = gen_rtx_CONST_INT (QImode, be ?  6 : 16);
> -  RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  0);
> -  RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
> -
> -  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> -  emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
> -  DONE;
> -})

So I wonder if all this still generates good code.  The unspecs cannot
be optimised properly; the RTL can, in principle anyway.  It is possible
that it hides more opportunities to use unpack etc. insns than it gains
over the unspec.  This needs to be tested, and the usual idioms need
testcases.  Is that what you add here?  (/me reads on...)

> +  if (BYTES_BIG_ENDIAN)
> +    emit_insn (gen_altivec_vmrgh (res, vzero, op1));
> +  else
> +    emit_insn (gen_altivec_vmrgl (res, op1, vzero));

Ah, so it is *not* using unspecs?  Excellent.

Okay for trunk.  Thank you!


Segher
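
For readers following the thread, here is a rough scalar model of why
merging with a zero vector gives an unsigned unpack.  It is only a
sketch and covers the big-endian branch only: it assumes "merge high"
interleaves the first two word elements of its two inputs, and that a
doubleword is read with its first word as the most-significant half.
All function names below are made up for illustration; none of them
are GCC or rs6000 interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a big-endian "merge high words":
   the result is a0 b0 a1 b1.  */
static void merge_high_words (const uint32_t a[4], const uint32_t b[4],
                              uint32_t out[4])
{
  out[0] = a[0];
  out[1] = b[0];
  out[2] = a[1];
  out[3] = b[1];
}

/* View four words as two big-endian doublewords: the first word of
   each pair is the most-significant half.  */
static void words_to_doublewords_be (const uint32_t w[4], uint64_t out[2])
{
  out[0] = ((uint64_t) w[0] << 32) | w[1];
  out[1] = ((uint64_t) w[2] << 32) | w[3];
}

/* Model of the big-endian branch quoted above: merge (vzero, op1).
   The zero words land in the most-significant halves, so each
   doubleword is a zero-extended input word.  */
void unpacku_hi_v4si_model (const uint32_t op1[4], uint64_t res[2])
{
  const uint32_t vzero[4] = { 0, 0, 0, 0 };
  uint32_t merged[4];

  merge_high_words (vzero, op1, merged);
  words_to_doublewords_be (merged, res);
}

/* Small self-check: words with the top bit set must not be
   sign-extended.  */
void unpacku_hi_v4si_selfcheck (void)
{
  const uint32_t in[4] = { 0xfffffffeu, 0x80000000u, 3u, 4u };
  uint64_t out[2];

  unpacku_hi_v4si_model (in, out);
  assert (out[0] == UINT64_C (0xfffffffe));
  assert (out[1] == UINT64_C (0x80000000));
}
```

Calling unpacku_hi_v4si_selfcheck () from a test driver exercises the
model.  The little-endian branch swaps the operand order and uses the
other merge; the model above does not try to cover it.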