Date: Fri, 19 Apr 2013 21:47:00 -0000
From: "Joseph S. Myers"
To: Will Newton
Subject: Re: [PATCH v2] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
In-Reply-To: <516D18F0.4060009@linaro.org>
References: <516D18F0.4060009@linaro.org>
X-SW-Source: 2013-04/txt/msg00096.txt.bz2

On Tue, 16 Apr 2013, Will Newton wrote:

> Add a high performance memcpy routine optimized for Cortex-A15 with
> variants for use in the presence of NEON and VFP hardware selected
> at runtime using indirect function support.

The functions __aeabi_memcpy, __aeabi_memcpy4 and __aeabi_memcpy8,
currently implemented to call memcpy, have their ABI defined to clobber
only the core registers permitted to be clobbered by AAPCS, and not the
normally call-clobbered VFP/NEON registers. This patch would cause those
functions to start clobbering some VFP/NEON registers. So you need to do
something to avoid that, whether making the __aeabi_* functions save and
restore registers in the affected case, making the new functions do so
or some other approach such as making __aeabi_* use a variant of the
code with an extra save/restore.

As I understand the code, memcpy within ld.so itself will always be a
version using the core registers only, so you shouldn't have the extra
issue of needing to avoid corrupting such registers when used for
argument passing in the VFP ABI variant. Though if you were to support
building a glibc version that requires VFP/NEON, where the new code is
used unconditionally rather than just through IFUNC - and such a glibc
is a perfectly reasonable thing to build, after all if you are building
for the VFP ABI then you may as well assume at least VFP to be present
everywhere - then you would need to deal with that issue. (Cf. .)

-- 
Joseph S. Myers
joseph@codesourcery.com
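
Editorial sketch (not part of the patch or of the message above): the
first option mentioned, having the __aeabi_* entry points save and
restore the affected registers around the call, can be modelled in
portable C. Everything here is an assumption made for illustration: the
vfp_model struct stands in for the call-clobbered registers d0-d7, and
the names memcpy_vfp_model, aeabi_memcpy_model and run_demo are
hypothetical. Real code would be ARM assembly, and the registers it
would need to preserve depend on what the new memcpy variants actually
use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the call-clobbered VFP registers d0-d7. */
typedef struct { uint64_t d[8]; } vfp_model;
static vfp_model vfp;

/* Models a VFP/NEON memcpy variant: 64-byte blocks travel through the
   modelled registers, which are left clobbered afterwards. */
static void *memcpy_vfp_model(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= sizeof vfp.d) {
        memcpy(vfp.d, s, sizeof vfp.d);   /* load block into "registers" */
        memcpy(d, vfp.d, sizeof vfp.d);   /* store block from "registers" */
        d += sizeof vfp.d;
        s += sizeof vfp.d;
        n -= sizeof vfp.d;
    }
    while (n--)                           /* byte-wise tail */
        *d++ = *s++;
    return dst;
}

/* Models an __aeabi_memcpy-style entry point: same copy, but the
   modelled VFP state is saved before and restored after the call. */
static void aeabi_memcpy_model(void *dst, const void *src, size_t n)
{
    vfp_model saved = vfp;                /* save */
    memcpy_vfp_model(dst, src, n);
    vfp = saved;                          /* restore */
}

/* Small self-check; returns 0 when all assertions hold. */
int run_demo(void)
{
    unsigned char src[100], dst[100];
    vfp_model before;

    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i + 1);
    for (int i = 0; i < 8; i++)
        vfp.d[i] = UINT64_C(0xA5A5A5A5A5A5A5A5) + (uint64_t)i;
    before = vfp;

    memset(dst, 0, sizeof dst);
    aeabi_memcpy_model(dst, src, sizeof src);
    assert(memcmp(dst, src, sizeof src) == 0);       /* data copied */
    assert(memcmp(&before, &vfp, sizeof vfp) == 0);  /* state preserved */

    memcpy_vfp_model(dst, src, sizeof src);
    assert(memcmp(&before, &vfp, sizeof vfp) != 0);  /* plain call clobbers */
    return 0;
}
```

The wrapper costs an extra save/restore on every __aeabi_memcpy call,
which is the trade-off against the other option raised above, a
separate variant of the copy code used only by the __aeabi_* entry
points.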