From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH, rs6000] Fix endianness issue with vmrgew and vmrgow permute constant recognition
From: Bill Schmidt
Date: Tue, 15 Aug 2017 22:09:00 -0000
To: GCC Patches
Cc: Segher Boessenkool, David Edelsohn, cel@linux.vnet.ibm.com
X-SW-Source: 2017-08/txt/msg00969.txt.bz2

Hi,

I forgot to mention that I didn't
include a test case.  Carl's upcoming patch will cause this to be well
tested with the existing test suite, so I think that's not needed.  Let
me know if you disagree.

Thanks,
Bill

> On Aug 15, 2017, at 4:14 PM, Bill Schmidt wrote:
> 
> Hi,
> 
> One of Carl Love's proposed built-in function patches exposed a bug in the Power
> code that recognizes specific permute control vector patterns for a permute, and
> changes the permute to a more specific and more efficient instruction.  The
> patterns for p8_vmrgew_v4si and p8_vmrgow are generated regardless of endianness,
> leading to problems on the little-endian port.
> 
> The normal way that would cause us to generate these patterns is via the
> vec_widen_[su]mult_{even,odd}_<mode> interfaces, which are not yet instantiated
> for Power; hence it appears that we've gotten lucky not to run into this before.
> Carl's proposed patch instantiated these interfaces, triggering the discovery of
> the problem.
> 
> This patch simply changes the handling for p8_vmrg[eo]w to match how it's done
> for all of the other common pack/merge/etc. patterns.
> 
> In altivec.md, we already had a p8_vmrgew_v4sf_direct insn that does what we want.
> I generalized this for both V4SF and V4SI modes.  I then added a similar
> p8_vmrgow_<mode>_direct define_insn.
> 
> The use in rs6000.c of p8_vmrgew_v4sf_direct, rather than p8_vmrgew_v4si_direct,
> is arbitrary.  The existing code already handles converting (for free) a V4SI
> operand to a V4SF one, so there's no need to specify the mode directly; and it
> would actually complicate the code to extract the mode so the "proper" pattern
> would match.  I think what I have here is better, but if you disagree I can
> change it.
> 
> Bootstrapped and tested on powerpc64le-linux-gnu (P8 64-bit) and on
> powerpc64-linux-gnu (P7 32- and 64-bit) with no regressions.  Is this okay for
> trunk?
> 
> Thanks,
> Bill
> 
> 
> 2017-08-15  Bill Schmidt
> 
> 	* config/rs6000/altivec.md (UNSPEC_VMRGOW_DIRECT): New constant.
> 	(p8_vmrgew_v4sf_direct): Generalize to p8_vmrgew_<mode>_direct.
> 	(p8_vmrgow_<mode>_direct): New define_insn.
> 	* config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Properly
> 	handle endianness for vmrgew and vmrgow permute patterns.
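For readers of the archive, here is a small standalone sketch of the idea
behind the rs6000.c change quoted below.  This is not GCC source; the
struct, enum, and function names are invented for illustration.  It only
shows that each recognized permute control vector now selects one insn
code on big-endian targets and the even/odd-swapped insn code on
little-endian targets.

/* Illustrative sketch only; not the GCC implementation.  */

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum merge_insn { INSN_VMRGEW, INSN_VMRGOW };

struct perm_entry
{
  enum merge_insn be_insn;     /* insn to emit when BYTES_BIG_ENDIAN */
  enum merge_insn le_insn;     /* even/odd-swapped insn for little endian */
  unsigned char pattern[16];   /* byte-level permute control vector */
};

static const struct perm_entry merge_word_entries[] =
{
  /* "Merge even words" byte pattern from the patch.  */
  { INSN_VMRGEW, INSN_VMRGOW,
    { 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27 } },
  /* "Merge odd words" byte pattern from the patch.  */
  { INSN_VMRGOW, INSN_VMRGEW,
    { 4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31 } }
};

/* Return the insn to emit for PERM, or -1 if PERM is not one of the
   recognized merge-word patterns.  */
static int
select_merge_insn (const unsigned char perm[16], bool bytes_big_endian)
{
  for (size_t i = 0;
       i < sizeof merge_word_entries / sizeof merge_word_entries[0]; i++)
    if (memcmp (perm, merge_word_entries[i].pattern, 16) == 0)
      return bytes_big_endian ? (int) merge_word_entries[i].be_insn
			      : (int) merge_word_entries[i].le_insn;
  return -1;
}

int
main (void)
{
  static const unsigned char even_perm[16]
    = { 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27 };
  /* Prints 0 (vmrgew) for big endian and 1 (vmrgow) for little endian.  */
  printf ("BE: %d  LE: %d\n",
	  select_merge_insn (even_perm, true),
	  select_merge_insn (even_perm, false));
  return 0;
}

The real table in altivec_expand_vec_perm_const stores insn_code values
and many more patterns; the sketch keeps only the two entries the patch
touches.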
> 
> Index: gcc/config/rs6000/altivec.md
> ===================================================================
> --- gcc/config/rs6000/altivec.md	(revision 250965)
> +++ gcc/config/rs6000/altivec.md	(working copy)
> @@ -148,6 +148,7 @@
>     UNSPEC_VMRGL_DIRECT
>     UNSPEC_VSPLT_DIRECT
>     UNSPEC_VMRGEW_DIRECT
> +   UNSPEC_VMRGOW_DIRECT
>     UNSPEC_VSUMSWS_DIRECT
>     UNSPEC_VADDCUQ
>     UNSPEC_VADDEUQM
> @@ -1357,15 +1358,24 @@
>  }
>    [(set_attr "type" "vecperm")])
>  
> -(define_insn "p8_vmrgew_v4sf_direct"
> -  [(set (match_operand:V4SF 0 "register_operand" "=v")
> -	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")
> -		      (match_operand:V4SF 2 "register_operand" "v")]
> +(define_insn "p8_vmrgew_<mode>_direct"
> +  [(set (match_operand:VSX_W 0 "register_operand" "=v")
> +	(unspec:VSX_W [(match_operand:VSX_W 1 "register_operand" "v")
> +		       (match_operand:VSX_W 2 "register_operand" "v")]
>  		     UNSPEC_VMRGEW_DIRECT))]
>    "TARGET_P8_VECTOR"
>    "vmrgew %0,%1,%2"
>    [(set_attr "type" "vecperm")])
>  
> +(define_insn "p8_vmrgow_<mode>_direct"
> +  [(set (match_operand:VSX_W 0 "register_operand" "=v")
> +	(unspec:VSX_W [(match_operand:VSX_W 1 "register_operand" "v")
> +		       (match_operand:VSX_W 2 "register_operand" "v")]
> +		      UNSPEC_VMRGOW_DIRECT))]
> +  "TARGET_P8_VECTOR"
> +  "vmrgow %0,%1,%2"
> +  [(set_attr "type" "vecperm")])
> +
>  (define_expand "vec_widen_umult_even_v16qi"
>    [(use (match_operand:V8HI 0 "register_operand" ""))
>     (use (match_operand:V16QI 1 "register_operand" ""))
> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c	(revision 250965)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -35209,9 +35209,13 @@ altivec_expand_vec_perm_const (rtx operands[4])
>        (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw_direct
>         : CODE_FOR_altivec_vmrghw_direct),
>        { 8, 9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
> -    { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew_v4si,
> +    { OPTION_MASK_P8_VECTOR,
> +      (BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgew_v4sf_direct
> +       : CODE_FOR_p8_vmrgow_v4sf_direct),
>        { 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27 } },
> -    { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgow,
> +    { OPTION_MASK_P8_VECTOR,
> +      (BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgow_v4sf_direct
> +       : CODE_FOR_p8_vmrgew_v4sf_direct),
>        { 4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31 } }
>    };
> 
> 
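As a closing illustration, the following sketch shows what the
merge-even-word and merge-odd-word operations compute on 32-bit lanes,
using the ISA's big-endian element numbering.  Again, this is plain C
written for this note rather than anything in GCC, and the helper names
are made up.  Because the instructions only move 32-bit lanes, it does
not matter whether those lanes hold integer or float data, which matches
the point above that the choice of the _v4sf_direct patterns in the
rs6000.c hunk is arbitrary.

/* Standalone sketch of vmrgew/vmrgow semantics on 32-bit word lanes,
   element 0 leftmost as in the ISA's big-endian numbering.  Not GCC
   code; helper names are invented for illustration.  */

#include <stdint.h>
#include <stdio.h>

/* Merge even words: result is { a[0], b[0], a[2], b[2] }.  */
static void
merge_even_words (const uint32_t a[4], const uint32_t b[4], uint32_t r[4])
{
  r[0] = a[0]; r[1] = b[0]; r[2] = a[2]; r[3] = b[2];
}

/* Merge odd words: result is { a[1], b[1], a[3], b[3] }.  */
static void
merge_odd_words (const uint32_t a[4], const uint32_t b[4], uint32_t r[4])
{
  r[0] = a[1]; r[1] = b[1]; r[2] = a[3]; r[3] = b[3];
}

int
main (void)
{
  uint32_t a[4] = { 0xA0, 0xA1, 0xA2, 0xA3 };
  uint32_t b[4] = { 0xB0, 0xB1, 0xB2, 0xB3 };
  uint32_t even[4], odd[4];

  merge_even_words (a, b, even);
  merge_odd_words (a, b, odd);

  /* On a 4-element vector, reversing the element numbering (0..3
     becomes 3..0) flips the parity of every index, so a permute that
     is "merge even" under one numbering is "merge odd" under the
     other; that is the swap the patch selects via BYTES_BIG_ENDIAN.  */
  for (int i = 0; i < 4; i++)
    printf ("even[%d]=0x%X  odd[%d]=0x%X\n", i, even[i], i, odd[i]);
  return 0;
}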