From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 30827 invoked by alias); 12 Jul 2005 17:25:41 -0000 Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-bugs-owner@gcc.gnu.org Received: (qmail 30732 invoked by uid 48); 12 Jul 2005 17:25:30 -0000 Date: Tue, 12 Jul 2005 17:33:00 -0000 From: "amylaar at gcc dot gnu dot org" To: gcc-bugs@gcc.gnu.org Message-ID: <20050712172525.22445.amylaar@gcc.gnu.org> Reply-To: gcc-bugzilla@gcc.gnu.org Subject: [Bug rtl-optimization/22445] New: Optimizations done by cselib depend on pointer values X-Bugzilla-Reason: CC X-SW-Source: 2005-07/txt/msg01482.txt.bz2 List-Id: We found that some of our in-house code was sometimes compiled to different assembler code when the compilation was repeated. There was one base version that we got most of the time, and a slightly smaller variant that was obtained with a lesser frequency, using identical sources, pathnames and compiler. The actual frequency seemed to depend on both host machine and pathname. The variant could not be obtained under debugger control. Here are two excerpts from my analysis: Addresses vary all over the place in the dump files, but the actual code variation first manifests itself in the .postreload dump file. There, only the store [insn 4050] is deleted; the load of r0 with (const_int 68) must be deleted some time later. As far as I can see, of the various subpasses that comprise postreload, only the noop move deletion of reload_cse_regs_1 might have deleted the store. The entire function is a single basic block. r1 gets assigned a new value before we reach the interesting code. Further analysis of the .greg file leads me to believe that we are dealing with a valid optimization here. r11 is set up to be r8 plus 68: (insn 4993 1569 1571 0 STFMessage.h:484 (set (reg/v/f:SI 11 r11 [orig:394 this ] [394]) (reg/v/u/f:SI 8 r8 [orig:158 this ] [158])) 168 {movsi_ie} (nil) (nil)) (insn 1571 4993 1576 0 STFMessage.h:484 (set (reg/v/f:SI 11 r11 [orig:394 this ] [394]) (plus:SI (reg/v/f:SI 11 r11 [orig:394 this ] [394]) (const_int 68 [0x44]))) 41 {*addsi3_compact} (nil) (nil)) and r8 and r11 remain stable beyond insn 4050. The code that gets optimized is: (insn 4028 4026 4039 0 FrontPanelDisplayEU.h:211 (set (mem/s/j:SI (reg/v/f:SI 11 r11 [orig:394 this ] [394] ) [0 ._vptr.STFMessageDispatcher+0 S4 A32]) (reg/f:SI 1 r1 [802])) 168 {movsi_ie} (insn_list 4026 (nil)) (nil)) (insn 4039 4028 4050 0 FrontPanelDisplayEU.h:211 (set (reg:SI 0 r0 [804]) (const_int 68 [0x44])) 168 {movsi_ie} (nil) (expr_list:REG_EQUIV (const_int 68 [0x44]) (nil))) (insn 4050 4039 4062 0 FrontPanelDisplayEU.h:211 (set (mem/s/j:SI (plus:SI (reg/v/u/f:SI 8 r8 [orig:158 this ] [158]) (reg:SI 0 r0 [804])) [0 ._vptr.STFMessageDispatcher+0 S4 A32]) (reg/f:SI 1 r1 [802])) 168 {movsi_ie} (insn_list 4039 (nil)) (nil)) Since r11 is r8 plus 68, insn 4050 stores the same value in the same location as insn 4028, and is thus a no-op move. Of course, if this is an optimization that is doable with the available gcc infrastructure, we would like to get it consistently. ================================================================= reload_cse_regs_1 is able to do the optimiation in principle, except that the hash values don't match. For the first MEM, the address is calculated with: (set (reg:SI 11) (reg:SI 8)) (set (reg:SI 11) (plus:SI (reg:SI 11) (const_int 68))) (reg:SI 11) The value copied from (reg:SI 8) is value number 1, hence its hash value is 1. The hashing of (const_int 68) is done by recursion, and 0 (VOIDmode) is used for its mode, giving a hash value of 8711 for (const_int 68), and a hash value of 8806 for (reg:SI 11). For the second MEM, the address is calculated with: (set (reg:SI 0) (const_int 68)) (plus:SI (reg:SI 8) (reg:SI 0)) Here, the hash of (const_int 68) is calculated using the mode of the destination, Which is SImode, i.e. 6. This gives a hash value of 8717 for (cont_int 68), and of 8812 for (plus:SI (reg:SI 8) (reg:SI 0)) . Thus, the match will generally fail, even though the values are the same. However, hashtab.c uses extensible hashing and a hash collision strategy that looks for free slots at specific intervals in the table. Thus, a hash collision can cause an accidental hash match. rtx_equal_for_cselib will then look what is inside r0 and find (const_int 68) there, and compare that to the identical constant in the other expression. LABEL_REFs and SYMBOL_REFs are hashed by pointer, so address space randomization also randomizes hash collisions. At the critical insn 4050, the table size is 251, and 100 elements are inside. For 8806, hash collision will add 1 + 8806 % 249 == 92, for 8812 it will add 1 + 8812 % 249 == 98. Thus, for n collisions of 8806 and m collisions of 8812, we get an accidental match if (98 * m - 92 * n + 6) % 251 == 0 . A solution would be n == 1, m == 6. Assuming even distribution, that gives a probability of just above 0.1%. [Obviously, for the code and on the machine where we observed the variant compilation, a number of hash collisionsi will happen unconditionally due values that have a consistent hash value independent of pointer addresses.] LABEL_REFs and SYMBOL_REFs are hashed by pointer, so address space randomization also randomizes hash collisions. We should perform the optimization consistently by making sure that CONST_INTs that have the same semantics hash to the same value. While for some contexts (like plus) we could figure out a mode for the CONST_INT, this is not true in general. E.g. the shift amount in a shift has a mode that does not need to bear any relation to the mode of the shift. Using the mode in the hash is not actually necessary. Disambiguation always has to be done in entry_and_rtx_equal_p / rtx_equal_for_cselib, and ignoring the mode of CONST_INTs in the hash is not likely to cause excessive hash collisions. -- Summary: Optimizations done by cselib depend on pointer values Product: gcc Version: 3.4.3 Status: UNCONFIRMED Severity: normal Priority: P2 Component: rtl-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: amylaar at gcc dot gnu dot org CC: gcc-bugs at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22445