From mboxrd@z Thu Jan  1 00:00:00 1970
Mailing-List: contact libc-ports-help@sourceware.org; run by ezmlm
Sender: libc-ports-owner@sourceware.org
MIME-Version: 1.0
In-Reply-To: <516EC27E.8080502@twiddle.net>
References: <516D18F0.4060009@linaro.org> <516EC27E.8080502@twiddle.net>
Date: Wed, 17 Apr 2013 15:53:00 -0000
Subject: Re: [PATCH v2] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
From: Will Newton
To: Richard Henderson
Cc: libc-ports@sourceware.org, Patch Tracking
Content-Type: text/plain; charset=ISO-8859-1
X-SW-Source: 2013-04/txt/msg00079.txt.bz2

On 17 April 2013 16:40, Richard Henderson wrote:

Hi Richard,

Thanks for the review!

> On 2013-04-16 11:25, Will Newton wrote:
>>
>>  ports/sysdeps/arm/armv7/multiarch/Makefile | 3 +
>
>
> Does this really require v7?  From a brief read I didn't see anything in the
> _arm version that didn't work since v5te (ldrd and pld).  Any reason not to
> put this into armv6 instead?

From reading the comments in the code, v7 is required for NEON, v6 is
required for VFP, and unaligned access is required. The unaligned
access requirement may be a problem on v5; I'm not sure. NB: I did not
write the memcpy code, so I have not looked at it in great detail.

I also had trouble building an armv6 glibc. I only have armv7 systems
to test on, and as far as I can tell it is not possible to build for
armv6 on an armv7 system.

>> +ENTRY(memcpy)
>> +       .type memcpy, %gnu_indirect_function
>> +       ldr     r1, .Lmemcpy_arm
>> +       tst     r0, #HWCAP_ARM_NEON
>> +       it      ne
>> +       ldrne   r1, .Lmemcpy_neon
>> +       bne     1f
>
>
> Swap vfp and neon tests and you don't need the branch.

True, I'll do that.

>> +.Lreturn:
>
>
> Unused label?

Yes, thanks, will fix.

>> +       ldr     tmp1, [src, #-60]       /* 15 words to go.  */
>> +       str     tmp1, [dst, #-60]
>
>
> These negative offsets mean thumb2 doesn't work.  That's fine, but it means
> that you need care for this in the _arm case.
>
> You have two choices: either do the swapping to arm mode by hand in the impl
> file, or force the entire memcpy.o to arm mode by using #define NO_THUMB at
> the top, before the #include .

It sounds like switching it all to arm mode is the best option, so I'll
do that.

-- 
Will Newton
Toolchain Working Group, Linaro
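
For reference, the test reordering discussed above (check VFP first, then
NEON, so that the later test simply overrides the earlier result and no
early-exit branch is needed) can be sketched in C. This is only an
illustration of the selection logic, not the resolver from the patch: the
SKETCH_HWCAP_* bit values, the enum, and the function names are all
hypothetical, and a real IFUNC resolver would return the address of the
chosen routine rather than an enum.

```c
#include <assert.h>

/* Placeholder capability bits for this sketch only.  Real code would use
   the HWCAP_ARM_* definitions supplied by the platform headers.  */
#define SKETCH_HWCAP_VFP   (1UL << 0)
#define SKETCH_HWCAP_NEON  (1UL << 1)

/* Hypothetical tags standing in for the three memcpy variants.  */
enum memcpy_impl { IMPL_ARM, IMPL_VFP, IMPL_NEON };

/* Start from the generic routine and let each later test override the
   earlier choice.  NEON is tested last, so it wins whenever its bit is
   set, and the code falls straight through without a branch.  */
enum memcpy_impl
select_memcpy_impl (unsigned long hwcap)
{
  enum memcpy_impl impl = IMPL_ARM;

  if (hwcap & SKETCH_HWCAP_VFP)
    impl = IMPL_VFP;
  if (hwcap & SKETCH_HWCAP_NEON)
    impl = IMPL_NEON;

  return impl;
}

/* Small self-check showing the expected priority order.  */
void
check_selection (void)
{
  assert (select_memcpy_impl (0) == IMPL_ARM);
  assert (select_memcpy_impl (SKETCH_HWCAP_VFP) == IMPL_VFP);
  assert (select_memcpy_impl (SKETCH_HWCAP_VFP | SKETCH_HWCAP_NEON)
          == IMPL_NEON);
}
```

In the assembly version the same ordering maps onto two conditional loads
in sequence: with the NEON test placed last, its conditional load
overwrites whatever the VFP test selected.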