Date: Thu, 29 Aug 2013 23:58:00 -0000
From: "Joseph S. Myers"
To: Will Newton
Subject: Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
In-Reply-To: <520894D5.7060207@linaro.org>

On Mon, 12 Aug 2013, Will Newton wrote:

> A small change to the entry to the aligned copy loop improves
> performance slightly on A9 and A15 cores for certain copies.

Could you clarify what you mean by "certain copies"?  In particular, have
you verified that for all three choices in this code (NEON, VFP or
neither), the code for unaligned copies is at least as fast in this case
(common 32-bit alignment, but not common 64-bit alignment) as the code
that would previously have been used in those cases?

There are various comments regarding alignment, whether stating
"LDRD/STRD support unaligned word accesses" or referring to the mutual
alignment that applies for particular code.  Does this patch make any of
them out of date?  (If code can now only be reached with common 64-bit
alignment, but in fact requires only 32-bit alignment, the comment should
probably state both those things explicitly.)

-- 
Joseph S. Myers
joseph@codesourcery.com
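
For illustration, the case asked about above (source and destination
sharing 32-bit alignment but not 64-bit alignment) can be expressed as a
small C check.  This is a sketch only: mutual_alignment is a hypothetical
helper, not part of glibc or of the patch under review, and it assumes
that "common N-bit alignment" means the two addresses agree in their low
address bits.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Returns the largest power-of-two byte alignment (capped at 8) that
   dst and src can reach together, i.e. after skipping the same number
   of leading bytes.  The two pointers share N-byte alignment exactly
   when their addresses agree in the low log2(N) bits.  */
static unsigned int
mutual_alignment (const void *dst, const void *src)
{
  uintptr_t diff = (uintptr_t) dst ^ (uintptr_t) src;

  if ((diff & 7) == 0)
    return 8;   /* Common 64-bit alignment.  */
  if ((diff & 3) == 0)
    return 4;   /* Common 32-bit alignment, but not 64-bit.  */
  if ((diff & 1) == 0)
    return 2;   /* Common 16-bit alignment only.  */
  return 1;     /* No common alignment beyond bytes.  */
}
```

Under that assumption, a copy from an address ending in 0x0 to one ending
in 0x4 yields 4, which is the case the question concerns: such copies
would no longer reach a path that requires common 64-bit alignment.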