From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-492837-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 51655 invoked by alias); 19 Dec 2018 19:53:23 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 51634 invoked by uid 89); 19 Dec 2018 19:53:22 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-12.6 required=5.0 tests=BAYES_00,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy=PhD, phd, Aaron, aaron
X-HELO: mx0a-001b2d01.pphosted.com
Received: from mx0a-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.156.1) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 19 Dec 2018 19:53:20 +0000
Received: from pps.filterd (m0098410.ppops.net [127.0.0.1])	by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id wBJJnIId023911	for <gcc-patches@gcc.gnu.org>; Wed, 19 Dec 2018 14:53:19 -0500
Received: from e36.co.us.ibm.com (e36.co.us.ibm.com [32.97.110.154])	by mx0a-001b2d01.pphosted.com with ESMTP id 2pfsqnhrbb-1	(version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT)	for <gcc-patches@gcc.gnu.org>; Wed, 19 Dec 2018 14:53:19 -0500
Received: from localhost	by e36.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted	for <gcc-patches@gcc.gnu.org> from <acsawdey@linux.ibm.com>;	Wed, 19 Dec 2018 19:53:18 -0000
Received: from b03cxnp07029.gho.boulder.ibm.com (9.17.130.16)	by e36.co.us.ibm.com (192.168.1.136) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;	(version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256)	Wed, 19 Dec 2018 19:53:17 -0000
Received: from b03ledav006.gho.boulder.ibm.com (b03ledav006.gho.boulder.ibm.com [9.17.130.237])	by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id wBJJrG0e20185306	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL);	Wed, 19 Dec 2018 19:53:16 GMT
Received: from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1])	by IMSVA (Postfix) with ESMTP id 78DD2C6065;	Wed, 19 Dec 2018 19:53:16 +0000 (GMT)
Received: from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1])	by IMSVA (Postfix) with ESMTP id 48F82C6055;	Wed, 19 Dec 2018 19:53:06 +0000 (GMT)
Received: from ragesh4.local (unknown [9.211.108.112])	by b03ledav006.gho.boulder.ibm.com (Postfix) with ESMTP;	Wed, 19 Dec 2018 19:53:06 +0000 (GMT)
To: GCC Patches <gcc-patches@gcc.gnu.org>,        Segher Boessenkool <segher@kernel.crashing.org>,        David Edelsohn <dje.gcc@gmail.com>,        Bill Schmidt <wschmidt@linux.ibm.com>
From: Aaron Sawdey <acsawdey@linux.ibm.com>
Subject: [PATCH][rs6000] avoid using unaligned vsx or lxvd2x/stxvd2x for memcpy/memmove inline expansion
Date: Wed, 19 Dec 2018 19:53:00 -0000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:60.0) Gecko/20100101 Thunderbird/60.3.3
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
x-cbid: 18121919-0020-0000-0000-00000E9CCC12
X-IBM-SpamModules-Scores:
X-IBM-SpamModules-Versions: BY=3.00010250; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000271; SDB=6.01134127; UDB=6.00589643; IPR=6.00914301; MB=3.00024755; MTD=3.00000008; XFM=3.00000015; UTC=2018-12-19 19:53:18
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 18121919-0021-0000-0000-0000641C722E
Message-Id: <0a17416b-57a0-99e7-2e7e-90a63da66fe6@linux.ibm.com>
X-IsSubscribed: yes
X-SW-Source: 2018-12/txt/msg01411.txt.bz2

POWER9 DD2.1 has issues with certain unaligned VSX instructions that
access cache-inhibited memory. This patch keeps memmove (and memcpy)
inline expansion from generating unaligned vector accesses or any vector
load/store other than lvx/stvx. The issue is described in more detail here:

https://patchwork.ozlabs.org/patch/814059/
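
To make the effect of the condition change concrete, here is a small
standalone sketch of the before/after test for choosing the 16-byte
vector move. It is not part of the patch. The struct and function names
are hypothetical stand-ins for TARGET_ALTIVEC and
TARGET_EFFICIENT_UNALIGNED_VSX, and it assumes align is measured in
bits, as the align >= 128 test suggests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the target flags that the real code reads
   through TARGET_ALTIVEC and TARGET_EFFICIENT_UNALIGNED_VSX.  */
struct target_flags
{
  bool altivec;
  bool efficient_unaligned_vsx;
};

/* Condition before the patch.  ALIGN_BITS is the known alignment in
   bits (assumption: same unit as "align" in expand_block_move).  */
static bool
use_vector_move_old (const struct target_flags *t, size_t bytes,
		     unsigned int align_bits)
{
  return t->altivec && bytes >= 16
	 && (t->efficient_unaligned_vsx || align_bits >= 128);
}

/* Condition after the patch: only 16-byte-aligned blocks qualify,
   regardless of whether unaligned VSX is efficient on the target.  */
static bool
use_vector_move_new (const struct target_flags *t, size_t bytes,
		     unsigned int align_bits)
{
  return t->altivec && bytes >= 16 && align_bits >= 128;
}
```

With both flags set, a 32-byte copy known only to be 8-byte aligned
(align_bits == 64) passes the old test but fails the new one, so it
falls through to the scalar cases that follow.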

OK for trunk if bootstrap/regtest passes?

Thanks!
   Aaron

2018-12-19  Aaron Sawdey  <acsawdey@linux.ibm.com>

	* config/rs6000/rs6000-string.c (expand_block_move): Don't use
	unaligned vsx and avoid lxvd2x/stxvd2x.
	(gen_lvx_v4si_move): New function.


Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c	(revision 267055)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -2669,6 +2669,35 @@
   return true;
 }

+/* Generate loads and stores for a move of v4si mode using lvx/stvx.
+   This uses altivec_{l,st}vx_<mode>_internal which use unspecs to
+   keep combine from changing what instruction gets used.
+
+   DEST is the destination for the data.
+   SRC is the source of the data for the move.  */
+
+static rtx
+gen_lvx_v4si_move (rtx dest, rtx src)
+{
+  rtx rv = NULL;
+  if (MEM_P (dest))
+    {
+      gcc_assert (!MEM_P (src));
+      gcc_assert (GET_MODE (src) == V4SImode);
+      rv = gen_altivec_stvx_v4si_internal (dest, src);
+    }
+  else if (MEM_P (src))
+    {
+      gcc_assert (!MEM_P (dest));
+      gcc_assert (GET_MODE (dest) == V4SImode);
+      rv = gen_altivec_lvx_v4si_internal (dest, src);
+    }
+  else
+    gcc_unreachable ();
+
+  return rv;
+}
+
 /* Expand a block move operation, and return 1 if successful.  Return 0
    if we should let the compiler generate normal code.

@@ -2721,11 +2750,11 @@

       /* Altivec first, since it will be faster than a string move
 	 when it applies, and usually not significantly larger.  */
-      if (TARGET_ALTIVEC && bytes >= 16 && (TARGET_EFFICIENT_UNALIGNED_VSX || align >= 128))
+      if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
 	{
 	  move_bytes = 16;
 	  mode = V4SImode;
-	  gen_func.mov = gen_movv4si;
+	  gen_func.mov = gen_lvx_v4si_move;
 	}
       else if (bytes >= 8 && TARGET_POWERPC64
 	       && (align >= 64 || !STRICT_ALIGNMENT))


-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain