Subject: Re: [PATCH v2] x86-64: Optimize strcmp/wcscmp with AVX2
From: Leonardo Sandoval
To: Alexander Monakov, "H.J. Lu"
Cc: GNU C Library
Date: Fri, 01 Jun 2018 20:23:00 -0000
Message-ID: <03bdf89c47880fd0734fc5b82213fc3c98eab372.camel@linux.intel.com>
References: <20180529185339.11541-1-leonardo.sandoval.gonzalez@linux.intel.com>

On Fri, 2018-06-01 at 18:28 +0300, Alexander Monakov wrote:
> On Fri, 1 Jun 2018, H.J. Lu wrote:
> > Please mention strncmp and wcsncmp in commit subject. OK with this
> > change.
>
> Many Intel CPUs reduce operating frequency upon encountering AVX code,
> and some have a "spin-up" period when frequency is not yet changed and
> AVX code runs at reduced throughput. Thus, why is this change not
> detrimental in practice, doesn't it slow down all code (including other
> programs running on the same core) as soon as a program makes a call
> to strcmp?

This is partially true for AVX2 FMA and AVX512. What I am proposing
contains none of those instructions, just AVX2 without FMA instructions.
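As a rough illustration of what "AVX2 without FMA" can look like, here is a minimal C sketch. It is not the code from the patch, which is written in assembly; it only shows one 32-byte comparison step built from integer AVX2 instructions (vpcmpeqb, vpminub, vpmovmskb). The helper names are hypothetical, and the code assumes GCC or Clang; on other targets, or on CPUs without AVX2, it falls back to a scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX2_SKETCH 1

/* Hypothetical helper: index of the first byte at which the 32-byte
   blocks A and B differ or A holds a NUL terminator, or 32 if neither
   happens.  Only integer AVX2 instructions are used, no FMA.  */
__attribute__ ((target ("avx2")))
static unsigned
first_diff_or_nul_avx2 (const unsigned char *a, const unsigned char *b)
{
  __m256i va = _mm256_loadu_si256 ((const __m256i *) a);
  __m256i vb = _mm256_loadu_si256 ((const __m256i *) b);
  /* 0xff where the bytes are equal, 0x00 where they differ.  */
  __m256i eq = _mm256_cmpeq_epi8 (va, vb);
  /* Unsigned minimum is zero exactly where bytes differ or A is NUL.  */
  __m256i m = _mm256_min_epu8 (va, eq);
  __m256i hit = _mm256_cmpeq_epi8 (m, _mm256_setzero_si256 ());
  unsigned mask = (unsigned) _mm256_movemask_epi8 (hit);
  return mask ? (unsigned) __builtin_ctz (mask) : 32u;
}
#endif

/* Scalar reference with the same contract.  */
static unsigned
first_diff_or_nul_scalar (const unsigned char *a, const unsigned char *b)
{
  unsigned i;
  for (i = 0; i < 32; i++)
    if (a[i] != b[i] || a[i] == 0)
      break;
  return i;
}

/* Pick the AVX2 version at run time when the CPU supports it.  */
static unsigned
first_diff_or_nul (const unsigned char *a, const unsigned char *b)
{
#ifdef HAVE_AVX2_SKETCH
  if (__builtin_cpu_supports ("avx2"))
    return first_diff_or_nul_avx2 (a, b);
#endif
  return first_diff_or_nul_scalar (a, b);
}

/* Usage sketch: both blocks agree up to the NUL at index 5.  */
void
usage_example (void)
{
  unsigned char a[32] = "hello";
  unsigned char b[32] = "hello";
  assert (first_diff_or_nul (a, b) == 5);
}
```

A real strcmp would also have to handle alignment, page boundaries and strings longer than one block; none of that is attempted here.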
On the other hand, some microbenchmarks were run to measure the benefit
of this effort. The results are summarized in the commit description,
but the complete picture is here:

https://github.com/lsandoval/strcmp-avx2-benchmark/blob/master/string-comparison-avx2.png

The above numbers are based on a Skylake platform.

>
> Alexander