Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
From: "Wilco Dijkstra"
To: 'Ondřej Bílka'
Cc: 'Rich Felker', "Florian Weimer"
Subject: RE: [PATCH] Improve performance of strncpy
Date: Fri, 12 Sep 2014 11:04:00 -0000
Message-ID: <002601cfce79$461ffd60$d25ff820$@com>
In-Reply-To: <20140912062203.GB19287@domone>

> Ondřej Bílka wrote:
> On Thu, Sep 11, 2014 at 08:37:17PM +0100, Wilco Dijkstra wrote:
> > I did a quick experiment with strcpy as it's simpler. Replacing it
> > with memcpy (d, s, strlen (s) + 1) is 3 times faster even on strings
> > of 16Mbytes! Perhaps more surprisingly, it has similar performance on
> > these huge strings as an optimized strcpy.
> >
> What architecture? This could also happen because memcpy has special
> case to handle large strings that speeds this up. It's something that I
> tried in one-pass strcpy but it harms performance as overhead of checking
> size is bigger than benefit of larger size.

The 3x happens on all 3 ISAs I tried.
On ARM the memcpy/strlen variant even beats the optimized strcpy case for
most sizes; on x64 it runs at about 80% of the speed of the optimized
strcpy for sizes above 4KB.

> > So the results are pretty clear, if you don't have a super optimized
> > strcpy, then strlen+memcpy is the best way to do it.
> >
> It is not that clear as you spend considerable amount of time on small
> lengths, what is important is constant overhead of strcpy startup.
> However this needs platform specific tricks to decide which alternative
> is fastest.

The overheads are relatively small on modern cores. The memcpy/strlen
variant is always faster than the single loop for lengths larger than
8-16 bytes.

Wilco
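
For reference, a minimal sketch of the two strcpy variants compared in
this thread, plus the same two-pass idea applied to strncpy semantics.
The function names (strcpy_loop, strcpy_two_pass, strncpy_two_pass) are
made up for illustration and are not the code from the patch; the
strncpy variant in particular is only an assumption about how the idea
carries over.

```c
#include <stddef.h>
#include <string.h>

/* Single-loop reference: copy byte by byte until the terminating NUL.
   (Hypothetical name, for illustration only.) */
char *strcpy_loop(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}

/* Two-pass variant discussed above: find the length first, then do one
   bulk copy that includes the terminating NUL. */
char *strcpy_two_pass(char *dst, const char *src)
{
    return memcpy(dst, src, strlen(src) + 1);
}

/* The same idea with strncpy semantics (an assumption, not the patch):
   copy at most n bytes and zero-fill whatever remains of dst. */
char *strncpy_two_pass(char *dst, const char *src, size_t n)
{
    /* memchr stops at the first NUL or after n bytes, whichever is first. */
    const char *end = memchr(src, '\0', n);
    size_t len = end ? (size_t)(end - src) : n;

    memcpy(dst, src, len);
    memset(dst + len, 0, n - len);
    return dst;
}
```

The two-pass versions read the source twice, so whether they win depends
on how well strlen, memchr and memcpy are optimized on the target; the
numbers quoted above are the only measurements here.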