From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-alpha-return-92019-listarch-libc-alpha=sources.redhat.com@sourceware.org>
Received: (qmail 3980 invoked by alias); 3 May 2018 17:52:29 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Received: (qmail 3945 invoked by uid 89); 3 May 2018 17:52:28 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-6.1 required=5.0 tests=BAYES_00,GIT_PATCH_2,RCVD_IN_DNSWL_NONE,SPF_NEUTRAL autolearn=ham version=3.3.2 spammy=train, Hx-languages-length:902, hints, poyarekar
X-HELO: homiemail-a56.g.dreamhost.com
From: Siddhesh Poyarekar <siddhesh@sourceware.org>
To: libc-alpha@sourceware.org
Subject: [PATCH 0/2] aarch64,falkor: memcpy/memmove performance improvements
Date: Thu, 03 May 2018 17:52:00 -0000
Message-Id: <20180503175209.2943-1-siddhesh@sourceware.org>
X-SW-Source: 2018-05/txt/msg00074.txt.bz2

Hi,

Here are a couple of patches to improve the performance of the falkor
memcpy and memmove implementations, based on testing on the latest
hardware.  The theme of the optimization is to avoid trying to train the
hardware prefetcher for smaller sizes and in the loop tail, since that
just mis-trains the prefetcher.  Instead, use multiple registers to aid
reordering wherever possible.  Testing showed that the regressions at
these sizes compared to generic memcpy are resolved with these patches.
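
To illustrate the "multiple registers" idea, here is a rough C sketch; it
is not taken from the patches, which are aarch64 assembly.  The helper
name copy32_multi_reg and the fixed 32-byte size are assumptions made for
the example only.  It loads a small block into independent temporaries
before storing any of it, so no load depends on an earlier store and the
hardware is free to reorder them.  Whether a compiler keeps each
temporary in its own register is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patches): copy exactly 32 bytes.  */
static void copy32_multi_reg(void *dst, const void *src)
{
    assert(dst != NULL && src != NULL);

    const unsigned char *s = src;
    unsigned char *d = dst;
    uint64_t a, b, c, e;

    /* Issue all four loads first; each goes to its own temporary,
       so none of them has to wait on another.  */
    memcpy(&a, s, 8);
    memcpy(&b, s + 8, 8);
    memcpy(&c, s + 16, 8);
    memcpy(&e, s + 24, 8);

    /* Only then issue the stores.  Since every byte was read before
       any byte is written, overlapping buffers are handled too.  */
    memcpy(d, &a, 8);
    memcpy(d + 8, &b, 8);
    memcpy(d + 16, &c, 8);
    memcpy(d + 24, &e, 8);
}
```

The load-everything-then-store-everything shape is also what lets the
same sequence serve a memmove-style tail, where source and destination
may overlap.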

Siddhesh

Siddhesh Poyarekar (2):
  aarch64,falkor: Ignore prefetcher hints for memmove tail
  Ignore prefetcher tagging for smaller copies

 sysdeps/aarch64/multiarch/memcpy_falkor.S  | 68 ++++++++++++++++++------------
 sysdeps/aarch64/multiarch/memmove_falkor.S | 48 ++++++++++++---------
 2 files changed, 70 insertions(+), 46 deletions(-)

-- 
2.14.3