From mboxrd@z Thu Jan  1 00:00:00 1970
From: Richard Earnshaw <rearnsha@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw <rearnsha@arm.com>, richard.guenther@gmail.com
Subject: [PATCH v3 0/3] lower more cases of memcpy [PR102125]
Date: Fri, 10 Sep 2021 15:48:38 +0100
Message-Id: <20210910144841.3139174-1-rearnsha@arm.com>

Changes since version 2:
  patch 1 is reworked again.
  patch 2 is unchanged from v2; it's included here only for completeness.
  patch 3 is unchanged; it's included here only for completeness.

-------------

This short patch series is designed to address some more cases where
we can usefully lower memcpy operations during gimple fold.  The
current code restricts this lowering to a maximum size of MOVE_MAX,
i.e. the size of a single integer register on the machine, but with
modern architectures this is likely too restrictive.
The motivating example is

  uint64_t bar64(const uint8_t *rData1)
  {
      uint64_t buffer;
      __builtin_memcpy(&buffer, rData1, sizeof(buffer));
      return buffer;
  }

which on a 32-bit machine ends up with an inlined memcpy followed by a
load from the copied buffer.

The patch series is in three parts, although the middle patch is an
arm-specific tweak to handle unaligned 64-bit moves on more versions
of the Arm architecture.

Patch 1 changes gen_highpart to handle forming the highpart of a MEM
directly by calling adjust_address; this removes the need to validate
a MEM on return from simplify_gen_subreg, so we replace that
validation with an assert.

Patch 2 addresses an issue in the arm backend.  Currently
movmisaligndi only supports vector targets.  This patch reworks the
code so that the pattern can work on any architecture version that
supports misaligned accesses.

Patch 3 then relaxes the gimple fold simplification of memcpy to allow
larger memcpy operations to be folded away, provided that the total
size is less than MOVE_MAX * MOVE_RATIO and that the machine has a
suitable SET insn for the appropriate integer mode.

With these three changes, the testcase above now optimizes to

        mov     r3, r0
        ldr     r0, [r0]        @ unaligned
        ldr     r1, [r3, #4]    @ unaligned
        bx      lr

R.

Richard Earnshaw (3):
  rtl: directly handle MEM in gen_highpart [PR102125]
  arm: expand handling of movmisalign for DImode [PR102125]
  gimple: allow more folding of memcpy [PR102125]

 gcc/config/arm/arm.md        | 16 ++++++++++++++++
 gcc/config/arm/vec-common.md |  4 ++--
 gcc/emit-rtl.c               | 23 +++++++++++++----------
 gcc/gimple-fold.c            | 16 +++++++++++-----
 4 files changed, 42 insertions(+), 17 deletions(-)

-- 
2.25.1