From: Will Newton
To: "Joseph S. Myers"
Cc: "libc-ports@sourceware.org", Patch Tracking
Date: Mon, 09 Sep 2013 16:06:00 -0000
Subject: Re: [PATCH v3] ARM: Improve armv7 memcpy performance.
References: <522D977E.2000906@linaro.org>

On 9 September 2013 14:39, Joseph S. Myers wrote:
> On Mon, 9 Sep 2013, Will Newton wrote:
>
>> Only enter the aligned copy loop with buffers that can be 8-byte
>> aligned. This improves performance slightly on Cortex-A9 and
>> Cortex-A15 cores for large copies with buffers that are 4-byte
>> aligned but not 8-byte aligned.
>
> Did you conclude that the comment about needing unaligned word access for
> ldrd/strd is still accurate after this patch (and if so, for which uses)?

No, I overlooked that. I'll submit a new patch.

> There was a long discussion on benchmarking starting from this patch.
> Could you summarise the conclusions of that discussion as they relate to
> the appropriate benchmarks to apply to this patch, and give pointers to
> your before-and-after performance results?

I believe the glibc memcpy benchmark is not capable in its present form
of showing the difference between this version of the code and the
current one:

1. The variety of alignments benchmarked is not adequate
2. The variability of the benchmark results is quite high (more runs
   required and page allocation issue)
3. The output of the benchmark contains no measure of variance
4. There is no means of showing graphically the output of the
   benchmark (for subtle differences this is necessary IMO)

These are all surmountable problems but I would rather not gate
acceptance of this code on a satisfactory resolution of the above
issues. I can provide output from the cortex-strings benchmark instead
though.

-- 
Will Newton
Toolchain Working Group, Linaro
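
For illustration only, points 1 to 3 in the list above (alignment variety, run-to-run variability, and a missing variance figure) could be approached roughly as in the sketch below. It is not taken from the glibc benchtests or from cortex-strings. Every name in it (`bench_stats`, `summarize`, `bench_alignment`, `BASE_ALIGN`) is hypothetical, and the 64-byte base alignment is an assumption. It times one copy per sample, so a real harness would probably batch several copies per sample to get above the timer resolution.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* All names here are hypothetical; this is not glibc or cortex-strings code. */
enum { BASE_ALIGN = 64 };   /* assumed base alignment for both buffers */

struct bench_stats {
    double mean_ns;       /* mean time per copy, in nanoseconds */
    double variance_ns2;  /* sample variance of the per-copy times */
};

/* Mean and sample variance (n - 1 denominator) of n timings. */
static struct bench_stats summarize(const double *samples, size_t n)
{
    struct bench_stats s = { 0.0, 0.0 };
    double sum = 0.0, sq = 0.0;
    size_t i;

    if (n == 0)
        return s;
    for (i = 0; i < n; i++)
        sum += samples[i];
    s.mean_ns = sum / (double)n;
    for (i = 0; i < n; i++) {
        double d = samples[i] - s.mean_ns;
        sq += d * d;
    }
    if (n > 1)
        s.variance_ns2 = sq / (double)(n - 1);
    return s;
}

/* Round a pointer up to the next BASE_ALIGN boundary. */
static unsigned char *align_up(unsigned char *p)
{
    uintptr_t a = ((uintptr_t)p + BASE_ALIGN - 1) & ~(uintptr_t)(BASE_ALIGN - 1);
    return (unsigned char *)a;
}

/* Time `runs` copies of `len` bytes, with source and destination placed at
   the given byte offsets from an aligned base. Returns 0 on success and -1
   on bad arguments or allocation failure. */
static int bench_alignment(size_t dst_off, size_t src_off, size_t len,
                           size_t runs, struct bench_stats *out)
{
    unsigned char *src_raw, *dst_raw, *src, *dst;
    double *samples;
    volatile unsigned char sink = 0;
    size_t i;

    assert(out != NULL);
    if (len == 0 || runs == 0 || dst_off >= BASE_ALIGN || src_off >= BASE_ALIGN)
        return -1;

    src_raw = malloc(len + src_off + BASE_ALIGN);
    dst_raw = malloc(len + dst_off + BASE_ALIGN);
    samples = malloc(runs * sizeof *samples);
    if (!src_raw || !dst_raw || !samples) {
        free(src_raw);
        free(dst_raw);
        free(samples);
        return -1;
    }
    src = align_up(src_raw) + src_off;
    dst = align_up(dst_raw) + dst_off;

    /* Touch both buffers first so page faults are not counted in the timings. */
    memset(src, 0xA5, len);
    memset(dst, 0, len);

    for (i = 0; i < runs; i++) {
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        memcpy(dst, src, len);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        sink = dst[len - 1];  /* read the result so the copy is not discarded */
        samples[i] = (double)(t1.tv_sec - t0.tv_sec) * 1e9
                   + (double)(t1.tv_nsec - t0.tv_nsec);
    }
    (void)sink;

    *out = summarize(samples, runs);
    free(src_raw);
    free(dst_raw);
    free(samples);
    return 0;
}
```

A driver would call `bench_alignment` for each pair of offsets of interest, for example (4, 0) and (0, 4) for buffers that are 4-byte but not 8-byte aligned, and print the mean alongside the variance so that small differences between two memcpy versions can be judged against the noise.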