From: Richard Earnshaw <rearnsha@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw <rearnsha@arm.com>
Subject: [PATCH v3 3/3] gimple: allow more folding of memcpy [PR102125]
Date: Fri, 10 Sep 2021 15:48:41 +0100
Message-Id: <20210910144841.3139174-4-rearnsha@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210910144841.3139174-1-rearnsha@arm.com>
References: <20210910144841.3139174-1-rearnsha@arm.com>

The current restriction on folding memcpy to a single element of size
MOVE_MAX is excessively cautious on most machines and limits some
significant further optimizations.  So relax the restriction provided
that the copy size does not exceed MOVE_MAX * MOVE_RATIO and that a SET
insn exists for moving the value into machine registers.

Note that checks were already in place requiring support for misaligned
move operations when one or more of the operands is unaligned.

On Arm this now permits optimizing

uint64_t bar64(const uint8_t *rData1)
{
    uint64_t buffer;
    memcpy(&buffer, rData1, sizeof(buffer));
    return buffer;
}

from
        ldr     r2, [r0]        @ unaligned
        sub     sp, sp, #8
        ldr     r3, [r0, #4]    @ unaligned
        strd    r2, [sp]
        ldrd    r0, [sp]
        add     sp, sp, #8

to
        mov     r3, r0
        ldr     r0, [r0]        @ unaligned
        ldr     r1, [r3, #4]    @ unaligned

PR target/102125 - (ARM Cortex-M3 and newer) missed optimization.
memcpy not needed operations

gcc/ChangeLog:

	PR target/102125
	* gimple-fold.c (gimple_fold_builtin_memory_op): Allow folding
	memcpy if the size is not more than MOVE_MAX * MOVE_RATIO.
---
 gcc/gimple-fold.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)
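(Not part of the patch, just an illustrative sketch for reviewers: even
under the new size bound the fold still requires the whole copy to fit
a single scalar integer mode that has a SET insn, so a size matching no
machine integer mode is not folded by this path.  bar192 below is a
hypothetical example, not from the testsuite.)

#include <stdint.h>
#include <string.h>

void bar192(uint8_t *dst, const uint8_t *src)
{
    /* 24 bytes: no scalar integer mode is 192 bits wide, so the
       single load/store fold does not apply even if 24 is within
       MOVE_MAX * MOVE_RATIO; the call is left for later (RTL)
       expansion instead.  */
    memcpy(dst, src, 24);
}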
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 3f2c176cff6..d9ffb5006f5 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -67,6 +67,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-vector-builder.h"
 #include "tree-ssa-strlen.h"
 #include "varasm.h"
+#include "memmodel.h"
+#include "optabs.h"
 
 enum strlen_range_kind {
   /* Compute the exact constant string length.  */
@@ -957,14 +959,17 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	    = build_int_cst (build_pointer_type_for_mode (char_type_node,
 							  ptr_mode, true), 0);
 
-      /* If we can perform the copy efficiently with first doing all loads
-         and then all stores inline it that way.  Currently efficiently
-	 means that we can load all the memory into a single integer
-	 register which is what MOVE_MAX gives us.  */
+      /* If we can perform the copy efficiently with first doing all loads and
+	 then all stores inline it that way.  Currently efficiently means that
+	 we can load all the memory with a single set operation and that the
+	 total size is less than MOVE_MAX * MOVE_RATIO.  */
       src_align = get_pointer_alignment (src);
       dest_align = get_pointer_alignment (dest);
       if (tree_fits_uhwi_p (len)
-	  && compare_tree_int (len, MOVE_MAX) <= 0
+	  && (compare_tree_int
+	      (len, (MOVE_MAX
+		     * MOVE_RATIO (optimize_function_for_size_p (cfun))))
+	      <= 0)
 	  /* FIXME: Don't transform copies from strings with known length.
 	     Until GCC 9 this prevented a case in gcc.dg/strlenopt-8.c
 	     from being handled, and the case was XFAILed for that reason.
@@ -1000,6 +1005,7 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	      if (type
 		  && is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
 		  && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
+		  && have_insn_for (SET, mode)
 		  /* If the destination pointer is not aligned we must be able
 		     to emit an unaligned store.  */
 		  && (dest_align >= GET_MODE_ALIGNMENT (mode)
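(Also for reference: for bar64 above the fold emits a single load/store
pair at the GIMPLE level, roughly the following -- a sketch only, exact
SSA names and type spellings vary between targets and dump formats:

  _1 = MEM <long long unsigned int> [(char * {ref-all})rData1_2(D)];
  MEM <long long unsigned int> [(char * {ref-all})&buffer] = _1;

which later passes then reduce to the two unaligned ldr instructions
shown above.)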