Mailing-List: contact libc-ports-help@sourceware.org; run by ezmlm
References: <520894D5.7060207@linaro.org>
Date: Fri, 30 Aug 2013 14:56:00 -0000
Subject: Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
From: Will Newton
To: "Joseph S. Myers"
Cc: "libc-ports@sourceware.org", Patch Tracking
Content-Type: text/plain; charset=ISO-8859-1
X-SW-Source: 2013-08/txt/msg00081.txt.bz2

On 30 August 2013 00:58, Joseph S. Myers wrote:

Hi Joseph,

>> A small change to the entry to the aligned copy loop improves
>> performance slightly on A9 and A15 cores for certain copies.
>
> Could you clarify what you mean by "certain copies"?

Large copies (> 16kB) where the buffers are 4-byte aligned but not
8-byte aligned. I'll respin the patch with an improved description.

> In particular, have you verified that for all three choices in this code
> (NEON, VFP or neither), the code for unaligned copies is at least as fast
> in this case (common 32-bit alignment, but not common 64-bit alignment) as
> the code that would previously have been used in those cases?

Yes, the performance is very similar but slightly better in the NEON
case and approximately unchanged in the others.

> There are various comments regarding alignment, whether stating "LDRD/STRD
> support unaligned word accesses" or referring to the mutual alignment that
> applies for particular code. Does this patch make any of them out of
> date? (If code can now only be reached with common 64-bit alignment, but
> in fact requires only 32-bit alignment, the comment should probably state
> both those things explicitly.)

I've reviewed the comments and they all look ok as far as I can tell.

-- 
Will Newton
Toolchain Working Group, Linaro