Subject: Re: [PATCH][rs6000] avoid using unaligned vsx or lxvd2x/stxvd2x for memcpy/memmove inline expansion
From: Aaron Sawdey
To: Segher Boessenkool
Cc: GCC Patches, David Edelsohn, Bill Schmidt
Date: Fri, 21 Dec 2018 01:00:00 -0000
In-Reply-To: <20181220234402.GX3803@gate.crashing.org>

On 12/20/18 5:44 PM, Segher Boessenkool wrote:
> On Thu, Dec 20, 2018 at 05:34:54PM -0600, Aaron Sawdey wrote:
>> On 12/20/18 3:51 AM, Segher Boessenkool wrote:
>>> On Wed, Dec 19, 2018 at 01:53:05PM -0600, Aaron Sawdey wrote:
>>>> Because of POWER9 dd2.1 issues with certain unaligned vsx instructions
>>>> to cache inhibited memory, here is a patch that keeps memmove (and memcpy)
>>>> inline expansion from doing unaligned vector or using vector load/store
>>>> other than lvx/stvx. More description of the issue is here:
>>>>
>>>> https://patchwork.ozlabs.org/patch/814059/
>>>>
>>>> OK for trunk if bootstrap/regtest ok?
>>>
>>> Okay, but see below.
>>>
>> [snip]
>>>
>>> This is extraordinarily clumsy :-)  Maybe something like:
>>>
>>> static rtx
>>> gen_lvx_v4si_move (rtx dest, rtx src)
>>> {
>>>   gcc_assert (!(MEM_P (dest) && MEM_P (src)));
>>>   gcc_assert (GET_MODE (dest) == V4SImode && GET_MODE (src) == V4SImode);
>>>   if (MEM_P (dest))
>>>     return gen_altivec_stvx_v4si_internal (dest, src);
>>>   else if (MEM_P (src))
>>>     return gen_altivec_lvx_v4si_internal (dest, src);
>>>   else
>>>     gcc_unreachable ();
>>> }
>>>
>>> (Or do you allow VOIDmode for src as well?)  Anyway, at least get rid of
>>> the useless extra variable.
>>
>> I think this should be better:
>
> The gcc_unreachable at the end catches the non-mem to non-mem case.
>
>> static rtx
>> gen_lvx_v4si_move (rtx dest, rtx src)
>> {
>>   gcc_assert ((MEM_P (dest) && !MEM_P (src)) || (MEM_P (src) && !MEM_P (dest)));
>
> But if you prefer this, how about
>
> {
>   gcc_assert (MEM_P (dest) ^ MEM_P (src));
>   gcc_assert (GET_MODE (dest) == V4SImode && GET_MODE (src) == V4SImode);
>
>   if (MEM_P (dest))
>     return gen_altivec_stvx_v4si_internal (dest, src);
>   else
>     return gen_altivec_lvx_v4si_internal (dest, src);
> }
>
> :-)
>
>
> Segher
>

I like that even better, thanks!

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
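
For anyone reading this thread in the archive, here is a small standalone sketch of why the XOR form of the assert is equivalent to the longer "exactly one of dest/src is memory" test. It does not use GCC's real rtx machinery: mock_rtx, MOCK_MEM_P, exactly_one_mem and mock_lvx_move are hypothetical stand-ins invented for illustration. The one property it relies on is that the predicate yields exactly 0 or 1 (as a C equality comparison does), so ^ behaves as a logical exclusive-or.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's rtx and MEM_P; NOT the real definitions.  */
enum mock_code { MOCK_REG, MOCK_MEM };
struct mock_rtx { enum mock_code code; };

/* An equality comparison yields exactly 0 or 1, which is what makes ^ safe
   to use as a logical exclusive-or below.  */
#define MOCK_MEM_P(x) ((x)->code == MOCK_MEM)

/* Hypothetical tags for the two instruction patterns in the thread.  */
enum mock_insn { MOCK_LVX, MOCK_STVX };

/* True when exactly one operand is memory: the condition the assert checks.  */
static bool
exactly_one_mem (const struct mock_rtx *dest, const struct mock_rtx *src)
{
  return MOCK_MEM_P (dest) ^ MOCK_MEM_P (src);
}

/* Once the XOR holds, "dest is not memory" implies "src is memory", so a
   plain two-way choice covers both cases and no unreachable arm is needed.  */
static enum mock_insn
mock_lvx_move (const struct mock_rtx *dest, const struct mock_rtx *src)
{
  assert (exactly_one_mem (dest, src));
  return MOCK_MEM_P (dest) ? MOCK_STVX : MOCK_LVX;
}
```

In this sketch a memory destination with a register source selects the store tag, the reverse selects the load tag, and reg-to-reg or mem-to-mem trips the assert, which matches the cases discussed above.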