From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21607 invoked by alias); 29 Sep 2014 22:26:22 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 21596 invoked by uid 89); 29 Sep 2014 22:26:21 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.9 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: e38.co.us.ibm.com Received: from e38.co.us.ibm.com (HELO e38.co.us.ibm.com) (32.97.110.159) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Mon, 29 Sep 2014 22:26:20 +0000 Received: from /spool/local by e38.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 29 Sep 2014 16:26:17 -0600 Received: from d03dlp03.boulder.ibm.com (9.17.202.179) by e38.co.us.ibm.com (192.168.1.138) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 29 Sep 2014 16:26:15 -0600 Received: from b03cxnp08026.gho.boulder.ibm.com (b03cxnp08026.gho.boulder.ibm.com [9.17.130.18]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id 845BD19D803E for ; Mon, 29 Sep 2014 16:14:59 -0600 (MDT) Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by b03cxnp08026.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id s8TMQF1q49414348 for ; Tue, 30 Sep 2014 00:26:15 +0200 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id s8TMQEfG028738 for ; Mon, 29 Sep 2014 16:26:14 -0600 Received: from [9.80.9.80] ([9.80.9.80]) by d03av04.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id s8TMQD5N028675; Mon, 29 Sep 2014 16:26:14 -0600 Message-ID: <1412029574.2986.42.camel@gnopaine> Subject: [PATCH, rs6000] Generate LE code for vec_lvsl and vec_lvsr that is compatible with BE code From: Bill Schmidt To: gcc-patches@gcc.gnu.org Cc: dje.gcc@gmail.com Date: Mon, 29 Sep 2014 22:26:00 -0000 Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14092922-1344-0000-0000-0000048EC960 X-IsSubscribed: yes X-SW-Source: 2014-09/txt/msg02588.txt.bz2 Hi, Up till now we have not attempted to generate code for LE usage of vec_lvsl and vec_lvsr that is compatible with expected BE usage. The LE code sequence corresponding to lvsl/vperm is not good, and we encourage programmers to convert those sequences to use direct assignment and the type system for unaligned loads. However, the issue comes up frequently enough that it seems best to provide this sequence together with a warning message (in a previous patch submission) to avoid confusion. The method used in this patch is to perform a byte-reversal of the result of the lvsl/lvsr. This is accomplished by loading the vector char constant {0,1,...,15}, which will appear in the register from left to right as {15,...,1,0}. A vperm instruction (which uses BE element ordering) is applied to the result of the lvsl/lvsr using the loaded constant as the permute control vector. Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions. Is this ok for trunk? Thanks, Bill [gcc] 2014-09-29 Bill Schmidt * altivec.md (altivec_lvsl): New define_expand. (altivec_lvsl_direct): Rename define_insn from altivec_lvsl. (altivec_lvsr): New define_expand. (altivec_lvsr_direct): Rename define_insn from altivec_lvsr. * rs6000.c (rs6000_expand_builtin): Change to use altivec_lvs[lr]_direct; remove commented-out code. [gcc/testsuite] 2014-09-29 Bill Schmidt * gcc.target/powerpc/lvsl-lvsr.c: New test. Index: gcc/config/rs6000/altivec.md =================================================================== --- gcc/config/rs6000/altivec.md (revision 215689) +++ gcc/config/rs6000/altivec.md (working copy) @@ -2297,7 +2297,32 @@ "dststt %0,%1,%2" [(set_attr "type" "vecsimple")]) -(define_insn "altivec_lvsl" +(define_expand "altivec_lvsl" + [(use (match_operand:V16QI 0 "register_operand" "")) + (use (match_operand:V16QI 1 "memory_operand" "Z"))] + "TARGET_ALTIVEC" + " +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_lvsl_direct (operands[0], operands[1])); + else + { + int i; + rtx mask, perm[16], constv, vperm; + mask = gen_reg_rtx (V16QImode); + emit_insn (gen_altivec_lvsl_direct (mask, operands[1])); + for (i = 0; i < 16; ++i) + perm[i] = GEN_INT (i); + constv = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, perm)); + constv = force_reg (V16QImode, constv); + vperm = gen_rtx_UNSPEC (V16QImode, gen_rtvec (3, mask, mask, constv), + UNSPEC_VPERM); + emit_insn (gen_rtx_SET (VOIDmode, operands[0], vperm)); + } + DONE; +}") + +(define_insn "altivec_lvsl_direct" [(set (match_operand:V16QI 0 "register_operand" "=v") (unspec:V16QI [(match_operand:V16QI 1 "memory_operand" "Z")] UNSPEC_LVSL))] @@ -2305,7 +2330,32 @@ "lvsl %0,%y1" [(set_attr "type" "vecload")]) -(define_insn "altivec_lvsr" +(define_expand "altivec_lvsr" + [(use (match_operand:V16QI 0 "register_operand" "")) + (use (match_operand:V16QI 1 "memory_operand" "Z"))] + "TARGET_ALTIVEC" + " +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_lvsr_direct (operands[0], operands[1])); + else + { + int i; + rtx mask, perm[16], constv, vperm; + mask = gen_reg_rtx (V16QImode); + emit_insn (gen_altivec_lvsr_direct (mask, operands[1])); + for (i = 0; i < 16; ++i) + perm[i] = GEN_INT (i); + constv = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, perm)); + constv = force_reg (V16QImode, constv); + vperm = gen_rtx_UNSPEC (V16QImode, gen_rtvec (3, mask, mask, constv), + UNSPEC_VPERM); + emit_insn (gen_rtx_SET (VOIDmode, operands[0], vperm)); + } + DONE; +}") + +(define_insn "altivec_lvsr_direct" [(set (match_operand:V16QI 0 "register_operand" "=v") (unspec:V16QI [(match_operand:V16QI 1 "memory_operand" "Z")] UNSPEC_LVSR))] Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 215689) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -13898,8 +13898,8 @@ rs6000_expand_builtin (tree exp, rtx target, rtx s case ALTIVEC_BUILTIN_MASK_FOR_LOAD: case ALTIVEC_BUILTIN_MASK_FOR_STORE: { - int icode = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr - : (int) CODE_FOR_altivec_lvsl); + int icode = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr_direct + : (int) CODE_FOR_altivec_lvsl_direct); enum machine_mode tmode = insn_data[icode].operand[0].mode; enum machine_mode mode = insn_data[icode].operand[1].mode; tree arg; @@ -13927,7 +13927,6 @@ rs6000_expand_builtin (tree exp, rtx target, rtx s || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) target = gen_reg_rtx (tmode); - /*pat = gen_altivec_lvsr (target, op);*/ pat = GEN_FCN (icode) (target, op); if (!pat) return 0; Index: gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c (working copy) @@ -0,0 +1,19 @@ +/* Test expected code generation for lvsl and lvsr on little endian. */ + +/* { dg-do compile { target { powerpc64le-*-* } } } */ +/* { dg-options "-O0 -Wno-deprecated" } */ +/* { dg-final { scan-assembler-times "lvsl" 1 } } */ +/* { dg-final { scan-assembler-times "lvsr" 1 } } */ +/* { dg-final { scan-assembler-times "lxvd2x" 2 } } */ +/* { dg-final { scan-assembler-times "vperm" 2 } } */ + + +#include + +float f[20]; + +void foo () +{ + vector unsigned char a = vec_lvsl (4, f); + vector unsigned char b = vec_lvsr (8, f); +}