Message-ID: <516EC27E.8080502@twiddle.net>
Date: Wed, 17 Apr 2013 15:40:00 -0000
From: Richard Henderson
To: Will Newton
CC: libc-ports@sourceware.org, patches@linaro.org
Subject: Re: [PATCH v2] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
References: <516D18F0.4060009@linaro.org>
In-Reply-To: <516D18F0.4060009@linaro.org>

On 2013-04-16 11:25, Will Newton wrote:
>  ports/sysdeps/arm/armv7/multiarch/Makefile | 3 +

Does this really require v7?  From a brief read I didn't see anything in
the _arm version that hasn't worked since v5te (ldrd and pld).  Any reason
not to put this into armv6 instead?

> +ENTRY(memcpy)
> +       .type   memcpy, %gnu_indirect_function
> +       ldr     r1, .Lmemcpy_arm
> +       tst     r0, #HWCAP_ARM_NEON
> +       it      ne
> +       ldrne   r1, .Lmemcpy_neon
> +       bne     1f

Swap the vfp and neon tests and you don't need the branch.

> +.Lreturn:

Unused label?

> +       ldr     tmp1, [src, #-60]       /* 15 words to go.  */
> +       str     tmp1, [dst, #-60]

These negative offsets mean thumb2 doesn't work.  That's fine, but it
means that you need to take care of this in the _arm case.  You have two
choices: either do the switch to arm mode by hand in the impl file, or
force the entire memcpy.o to arm mode by using #define NO_THUMB at the
top, before the #include.

If you choose the latter, then you don't have to worry about thumb2's
restriction on rd=rn when rm=pc, and can avoid the extra move.  You can
also drop the then-unnecessary it markup.


r~
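The point about swapping the vfp and neon tests can be modelled in C.  This
is only a sketch of the selection order, not the patch's code: the names
select_memcpy, check_select_memcpy, memcpy_impl and the DEMO_HWCAP_* bit
values are hypothetical, and it assumes the intended priority is neon over
vfp over plain arm.  Testing the lower-priority feature first lets the
higher-priority test simply overwrite the result, so no early exit is
needed.

```c
#include <assert.h>

/* Illustrative bit values only (assumption); real code should take the
   HWCAP bits from the platform's hwcap header.  */
#define DEMO_HWCAP_VFP   (1UL << 6)
#define DEMO_HWCAP_NEON  (1UL << 12)

/* Hypothetical tags standing in for the three memcpy variants.  */
enum memcpy_impl { IMPL_ARM, IMPL_VFP, IMPL_NEON };

/* Start with the generic choice, then test features from lowest to
   highest priority.  Each later match overwrites the earlier one, which
   is what makes the branch after the first test unnecessary.  */
static enum memcpy_impl
select_memcpy (unsigned long hwcap)
{
  enum memcpy_impl impl = IMPL_ARM;

  if (hwcap & DEMO_HWCAP_VFP)
    impl = IMPL_VFP;
  if (hwcap & DEMO_HWCAP_NEON)
    impl = IMPL_NEON;

  return impl;
}

/* Small self-check covering all four feature combinations.  */
void
check_select_memcpy (void)
{
  assert (select_memcpy (0) == IMPL_ARM);
  assert (select_memcpy (DEMO_HWCAP_VFP) == IMPL_VFP);
  assert (select_memcpy (DEMO_HWCAP_NEON) == IMPL_NEON);
  assert (select_memcpy (DEMO_HWCAP_VFP | DEMO_HWCAP_NEON) == IMPL_NEON);
}
```

In the assembly, each "if" corresponds to a tst followed by a conditional
ldrne into the same register, with the neon pair placed last.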