Mailing-List: contact libc-ports-help@sourceware.org; run by ezmlm
Sender: libc-ports-owner@sourceware.org
MIME-Version: 1.0
In-Reply-To: <5220F1F0.80501@redhat.com>
References: <520894D5.7060207@linaro.org> <5220D30B.9080306@redhat.com> <5220F1F0.80501@redhat.com>
Date: Mon, 02 Sep 2013 14:18:00 -0000
Subject: Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
From: Will Newton
To: "Carlos O'Donell"
Cc: "libc-ports@sourceware.org", Patch Tracking, Ondřej Bílka, Siddhesh Poyarekar
Content-Type: text/plain; charset=ISO-8859-1
X-IsSubscribed: yes
X-SW-Source: 2013-09/txt/msg00007.txt.bz2

On 30 August 2013 20:26, Carlos O'Donell wrote:
> On 08/30/2013 02:48 PM, Will Newton wrote:
>> On 30 August 2013 18:14, Carlos O'Donell wrote:
>>
>> Hi Carlos,
>>
>>>>> A small change to the entry to the aligned copy loop improves
>>>>> performance slightly on A9 and A15 cores for certain copies.
>>>>>
>>>>> ports/ChangeLog.arm:
>>>>>
>>>>> 2013-08-07 Will Newton
>>>>>
>>>>> * sysdeps/arm/armv7/multiarch/memcpy_impl.S: Tighten check
>>>>> on entry to aligned copy loop for improved performance.
>>>>> ---
>>>>> ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S | 4 ++--
>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> Ping?
>>>
>>> How did you test the performance?
>>>
>>> glibc has a performance microbenchmark, did you use that?
>>
>> No, I used the cortex-strings package developed by Linaro for
>> benchmarking various string functions against one another[1].
>>
>> I haven't checked the glibc benchmarks but I'll look into that. It's
>> quite a specific case that shows the problem so it may not be obvious
>> which one is better however.
>
> If it's not obvious how is someone supposed to review this patch? :-)
>
>> [1] https://launchpad.net/cortex-strings
>
> There are 2 benchmarks. One appears to be dhrystone 2.1, which isn't a string
> test in and of itself which should not be used for benchmarking or changing
> string functions. The other is called "multi" and appears to run some functions
> in a loop and take the time.
>
> e.g.
> http://bazaar.launchpad.net/~linaro-toolchain-dev/cortex-strings/trunk/view/head:/benchmarks/multi/harness.c
>
> I would not call `multi' exhaustive, and while neither is the glibc performance
> benchmark tests the glibc tests have received review from the glibc community
> and are our preferred way of demonstrating performance gains when posting
> performance patches.
>
> I would really really like to see you post the results of running your new
> implementation with this benchmark and show the numbers that claim this is
> faster. Is that possible?

The mailing list server does not seem to accept image attachments so I
have uploaded the performance graph here:

http://people.linaro.org/~will.newton/glibc_memcpy/sizes-memcpy-08-04-2.5.png

-- 
Will Newton
Toolchain Working Group, Linaro