From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 10904 invoked by alias); 16 Oct 2008 18:36:06 -0000 Received: (qmail 10895 invoked by uid 22791); 16 Oct 2008 18:36:05 -0000 X-Spam-Check-By: sourceware.org Received: from igw3.br.ibm.com (HELO igw3.br.ibm.com) (32.104.18.26) by sourceware.org (qpsmtpd/0.31) with ESMTP; Thu, 16 Oct 2008 18:35:22 +0000 Received: from d24relay01.br.ibm.com (unknown [9.8.31.16]) by igw3.br.ibm.com (Postfix) with ESMTP id 3ACE73901A7 for ; Thu, 16 Oct 2008 15:12:06 -0300 (BRST) Received: from d24av02.br.ibm.com (d24av02.br.ibm.com [9.18.232.47]) by d24relay01.br.ibm.com (8.13.8/8.13.8/NCO v9.1) with ESMTP id m9GIZ3Jw2179256 for ; Thu, 16 Oct 2008 15:35:04 -0300 Received: from d24av02.br.ibm.com (loopback [127.0.0.1]) by d24av02.br.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m9GIZG9V027217 for ; Thu, 16 Oct 2008 16:35:16 -0200 Received: from [9.8.0.218] ([9.8.0.218]) by d24av02.br.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id m9GIZFex025149 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 Oct 2008 16:35:15 -0200 Subject: Re: Patch to fix gcc.c-torture/compile/20010102-1.c on IA64 HP-UX From: Luis Machado Reply-To: luisgpm@linux.vnet.ibm.com To: Andrew Pinski Cc: Jeff Law , sje@cup.hp.com, Richard Henderson , Peter Bergner , gcc-patches@gcc.gnu.org, paolo.carlini@oracle.com In-Reply-To: References: <200809161651.m8GGpEM19079@lucas.cup.hp.com> <1221596846.17787.28.camel@hpsje.cup.hp.com> <48D0248E.5030505@redhat.com> <1221601143.17787.40.camel@hpsje.cup.hp.com> <48D2B834.7040701@redhat.com> <1221847674.4972.27.camel@hpsje.cup.hp.com> <48D953DC.6080200@redhat.com> <1222202669.19545.1.camel@hpsje.cup.hp.com> <1223062006.612.13.camel@gargoyle> <48E6BB7D.8000809@redhat.com> Content-Type: text/plain Date: Thu, 16 Oct 2008 21:46:00 -0000 Message-Id: <1224182072.8846.94.camel@gargoyle> Mime-Version: 1.0 X-Mailer: Evolution 2.12.1 Content-Transfer-Encoding: 7bit X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2008-10/txt/msg00722.txt.bz2 On Fri, 2008-10-03 at 17:46 -0700, Andrew Pinski wrote: > On Fri, Oct 3, 2008 at 5:40 PM, Jeff Law wrote: > > I'd be hard pressed to see how this patch could cause any kind of > > performance regression since it's merely propagating information to > the > > backend that was getting lost in regrename and I don't see that the > ppc uses > > REG_POINTER in its backend. > > PPC does not use it but the way lwx is used is [base+index] which > should really be done that way, otherwise it will have some stalls in > some cases (on Power 6). > > Thanks, > Andrew Pinski After further investigation, there is some light to this problem. There appears to be a bad interaction between REG_POINTER and RTL alias analysis, as we see below. The epilogue for these two functions from vortex (Mem_GetAddr, Mem_GetWord) are the following, for both revisions: Mem_GetAddr (revision 140615): .L3: mr 3,29 mr 4,30 bl Ut_PrintErr lwz 7,36(1) lwz 26,8(1) * lis 8,Test@ha lwz 27,12(1) lwz 28,16(1) mtlr 7 mr 0,3 lwz 29,20(1) lwz 30,24(1) mr 3,0 lwz 31,28(1) * stw 0,Test@l(8) addi 1,1,32 blr Mem_GetAddr (revision 140616): .L3: mr 3,29 mr 4,30 bl Ut_PrintErr * lis 8,Test@ha mr 0,3 * stw 0,Test@l(8) mr 3,0 lwz 7,36(1) lwz 26,8(1) lwz 27,12(1) lwz 28,16(1) mtlr 7 lwz 29,20(1) lwz 30,24(1) lwz 31,28(1) addi 1,1,32 blr Mem_GetWord (revisions 140615 and 140616): .L8: lwz 5,0(31) li 0,1 cmpwi 6,5,0 bne- 6,.L9 lwz 10,36(1) lwz 26,8(1) * lis 11,Test@ha mr 3,0 lwz 27,12(1) lwz 28,16(1) mtlr 10 * stw 0,Test@l(11) lwz 29,20(1) lwz 30,24(1) lwz 31,28(1) addi 1,1,32 blr Clearly, the code generation for Mem_GetWord is unaffected by the changes from revision 140616. The problem happens with Mem_GetAddr/revision 140616. We can see that "lis 8,Test@ha" and "stw 0,Test@l(8)" are very close to each other, leading to possible stalls in the pipeline, which doesn't happen with the code generated by revision 140615. The positioning of "lis" and "stw" we see in revision 140616 is due to a number of dependencies created wrt the following lwz instructions, so the "stw" needs to execute before that, avoiding the possibility to move "stw" further down. So we're left with the question of why these dependencies are being created. Digging further, revision 140616 enabled some registers to store a value meaning its contents are, in fact, a pointer, and we access this data with REG_POINTER(...). Looking at this specific code inside alias.c:find_base_term: if (REG_P (tmp1) && REG_POINTER (tmp1)) return find_base_term (tmp1); if (REG_P (tmp2) && REG_POINTER (tmp2)) return find_base_term (tmp2); Since up until revision 140615 we didn't store the specific information mentioned above, we just flew by these conditions and returned either tmp1 or tmp2, not going into the recursive call. With revision 140616, we actually have the pointer-in-reg information, and thus we go into the recursive call to either "find_base_term (tmp1)" or "find_base_term (tmp2)". This should work as expected, as it really does for Mem_GetWord, but for Mem_GetAddr, the recursive call to "find_base_term (tmp1)" returns NULL. This leads us back to "init_alias_analysis", where we actually fill up the reg_base_value vector that is used, inside find_base_term, to return the base term we want. Keeping it short, the 8th entry in that vector, that should've contained the correct base term, is blank due to the 8th register being defined and set before the "stw" instruction in the epilogue. This specific situation led to all the others, including the performance degradation. This also relates to the -fno-strict-aliasing flag, which sets aliases to 0 and we need to do extra work to sort out the conflicting pieces. Without -fno-strict-aliasing, the code is generated correctly. So, it seems the RTL alias analysis framework in GCC is not designed to handle REG_POINTER" on hard registers. I would like to suggest that we return to Peter Bergner's proposed solution. Regards, Luis