Message-ID: <4C488AB7.3080505@redhat.com>
Date: Thu, 22 Jul 2010 18:15:00 -0000
From: Richard Henderson
To: Sebastian Pop
CC: "H.J. Lu", Bernd Schmidt, GCC Patches, ubizjak@gmail.com
Subject: Re: x86_64 varargs setup jump table

On 07/22/2010 11:02 AM, Sebastian Pop wrote:
> Here are the results on AMD Phenom(tm) 9950 Quad-Core.
>
> Old: Gcc 4.6.0 revision 162355
> New: Gcc 4.6.0 revision 162355 + this patch.
> Flags: -O3 -funroll-loops -fpeel-loops -ffast-math -march=native
>
> The number is the run time percentage: (old - new) / old * 100
> (positive is better)
>
> [ no positive results ]

Hmm.  At least HJ had some positive results.  I'm surprised that
there are none on the AMD box.

Does movaps have reformatting stalls that perhaps movdqa does
with that particular micro-architecture?  Or are stores exempt
from reformatting stalls now?

Otherwise the only thing I can think is that the computed jump
was in practice very predictable (i.e. lots of calls containing
the same sequence of types), and that performing a few fewer stores
makes that difference.


r~
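For readers following the last hypothesis, here is a minimal C sketch of the "same sequence of types" call pattern. It is an illustration only, not code from the patch or from the benchmarks: the names sum_doubles and repeat_same_call_shape are hypothetical. The one ABI fact it leans on is that, under the x86-64 System V calling convention, the caller of a variadic function passes in %al an upper bound on the number of vector registers used. As I understand the pre-patch prologue, that value is what selected the target of the computed jump, so a call site that always passes the same argument types would always take the same target.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic callee: sums `count` doubles.
   On x86-64 SysV, its prologue must spill the incoming
   argument registers so that va_arg can find them later. */
static double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}

/* Hypothetical driver: every call has the same shape
   (one int, two doubles), so the caller sets up the same
   vector-register count each time.  An indirect jump keyed
   on that count would see the same target on every call. */
static double repeat_same_call_shape(int iterations)
{
    double acc = 0.0;

    for (int i = 0; i < iterations; i++)
        acc += sum_doubles(2, 1.0, 2.0);
    return acc;
}
```

Calling repeat_same_call_shape(1000) makes 1000 calls of identical shape. Whether that actually explains the AMD numbers is exactly the open question above; the sketch only shows what such a workload would look like.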