From: "Fang, Changpeng"
To: "H.J. Lu", Jakub Jelinek, "sergos.gnu@gmail.com"
CC: Richard Guenther, "gcc-patches@gcc.gnu.org"
Date: Tue, 14 Jun 2011 23:22:00 -0000
Subject: RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

A similar argument applies to software prefetching, for which we observed
a ~2% benefit on Greyhound (not that much on Bulldozer).
We would also prefer turning on software prefetching at -O3 for
-mtune=generic.

--Changpeng

________________________________________
From: H.J. Lu [hjl.tools@gmail.com]
Sent: Tuesday, June 14, 2011 8:05 AM
To: Jakub Jelinek; sergos.gnu@gmail.com
Cc: Richard Guenther; Fang, Changpeng; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

On Tue, Jun 14, 2011 at 3:16 AM, Jakub Jelinek wrote:
> On Tue, Jun 14, 2011 at 12:13:47PM +0200, Richard Guenther wrote:
>> On Tue, Jun 14, 2011 at 1:59 AM, Fang, Changpeng wrote:
>> > The patch ( http://gcc.gnu.org/ml/gcc-patches/2011-02/txt00059.txt ) which introduces splitting avx256 unaligned loads.
>> > However, we found that it causes significant regressions for cpu2006 ( http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49089 ).
>> >
>> > In this work, we introduce a tune option that sets splitting unaligned loads default only for such CPUs that such splitting
>> > is beneficial.
>> >
>> > The patch passed bootstrapping and regression tests on x86_64-unknown-linux-gnu system.
>> >
>> > Is it OK to commit?
>>
>> It probably should go to the 4.6 branch as well.  Note that I find the
>> X86_TUNE_AVX256_SPLIT_UNALIGNED_LOAD_OPTIMAL odd,
>> why not call it simply X86_TUNE_AVX256_SPLIT_UNALIGNED_LOAD?
>
> I also wonder what we should do for -mtune=generic.  Should we split or not?
> How big improvement is it on Intel chips, how big degradation does it
> cause on AMD chips (I assume no other chip maker currently supports AVX)?
>

Simply turning off the 32-byte unaligned load split isn't an appropriate
solution, because doing so introduces performance regressions on Intel
Sandy Bridge processors.  I am proposing a different approach so that we
can improve -mtune=generic performance on current Intel and AMD
processors.

The current default GCC tuning, -mtune=generic, was implemented in 2005
for Intel Pentium 4, Core 2 and AMD K8 processors.  Many of its
optimization choices no longer apply to current Intel or AMD processors.
We should choose a set of optimization choices for -mtune=generic,
including the 32-byte unaligned load split, for current Intel and AMD
processors, which should improve performance with no performance
regressions.

--
H.J.
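
For readers following the thread, the per-CPU tune option described in the
quoted patch can be pictured as a bitmask lookup: each tuning flag carries
the set of processors for which it is on by default.  The following is a
minimal sketch in C under stated assumptions, not the actual GCC source.
The enum names, the MASK macro, the table and tune_enabled() are all
hypothetical, and the choice of which CPUs appear in the mask is an
illustration only.

```c
/* Hypothetical CPU identifiers; GCC's real processor list is larger. */
enum cpu { CPU_GENERIC, CPU_BDVER1, CPU_INTEL_AVX, CPU_LAST };

/* Hypothetical tuning flags; only one is modelled here. */
enum tune { TUNE_AVX256_SPLIT_UNALIGNED_LOAD, TUNE_LAST };

/* One bit per CPU. */
#define MASK(c) (1u << (c))

/* For each tuning flag, the set of CPUs for which it is on by default.
   Assumption for illustration: the split is enabled only for the
   Intel AVX entry.  Whether CPU_GENERIC belongs in this mask is the
   open question discussed in the thread. */
static const unsigned int initial_tune_features[TUNE_LAST] = {
  MASK(CPU_INTEL_AVX)
};

/* Return 1 if FLAG is enabled by default when tuning for CPU, else 0. */
static int tune_enabled(enum tune flag, enum cpu cpu)
{
  return (initial_tune_features[flag] & MASK(cpu)) != 0;
}
```

With this layout, changing the default for -mtune=generic amounts to adding
or removing one bit in the table entry, which is why the discussion centres
on which processors the generic tuning should represent.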