To: "H.J. Lu", libc-alpha
Cc: Florian Weimer
From: Carlos O'Donell
Subject: Commit 27d3ce1467990f89126e228559dec8f84b96c60e?
Message-ID: <8e209ae2-969e-d4d5-bc00-0111c85198a6@redhat.com>
Date: Tue, 26 Nov 2019 15:01:00 -0000

HJ,

In commit 27d3ce1467990f89126e228559dec8f84b96c60e we stop setting
bit_arch_Fast_Copy_Backward for Intel Core processors as an
optimization to improve performance.

It turns out that this change also improves performance for Haswell
servers.

Was it the intent of this change to *also* improve performance for
Haswell?
The comments don't indicate this, and I am worried that it might be
an unintentional change in this case. The particular CPU was an
E5-2650 v3 [1].

If we step back and look at the overall sequence of changes and
performance, it looks like this:

The performance regression is between this change:

c3d8dc45c9df199b8334599a6cbd98c9950dba62
- Triggers default: handling + TSX handling.
- Causes a 21% lmbench regression for an E5-2650 v3.

and this change (the one we are discussing):

27d3ce1467990f89126e228559dec8f84b96c60e
- Removes bit_arch_Fast_Copy_Backward.
- Recovers the lost performance.

My worry is that the two are unrelated: that we have only won back
the performance by accident, at the expense of the other change, and
that we could be doing better.

As our Intel expert, what do you think is going on here?

--
Cheers,
Carlos.

[1] https://ark.intel.com/content/www/us/en/ark/products/81705/intel-xeon-processor-e5-2650-v3-25m-cache-2-30-ghz.html