* Fix PR 33870 @ 2007-11-08 0:24 Diego Novillo 2007-11-08 0:35 ` Jakub Jelinek 2007-11-08 10:45 ` Richard Guenther 0 siblings, 2 replies; 18+ messages in thread From: Diego Novillo @ 2007-11-08 0:24 UTC (permalink / raw) To: gcc-patches [-- Attachment #1: Type: text/plain, Size: 3397 bytes --] This patch fixes PR 33870. There were a couple of things at play here related to the way we compute sub-variables for structures and arrays. Particularly in the presence of nesting. When alias analysis computes points-to information for a structure for which we have sub-variables, it never adds all the variables to the set. It only adds the first SFT reachable by the pointer. So, if we have a pointer p_1 pointing into some structure and we dereference a field with p_1->fld, we can find out what offset to use by calling get_base_ref_and_extent. However, if that field is inside a nested structure, the offset computed by get_ref_base_and_extent is the offset from the start of the immediately containing structure. However, to find out what other SFTs are affected by this reference, we need to know the offsets starting at the root structure in the nesting hierarchy. For instance, given the following structure: struct X { int a; struct Y { int b; struct Z { int c[3]; } d; } e; } m; and the following address expression: p_1 = &m.e.d; This structure will receive 5 SFTs, namely 2 for fields 'a' and 'b' and 3 for the array 'c' in struct Z. So, the reference p_1->c[2] and m.e.d.c[2] access the exact same memory location (ie, SFT.5). Now, alias analysis computed the points-to set for pointer p_1 as { SFT.3 } because that is the first field that p_1 actually points to. When the expression p_1->c[2] is analyzed, get_ref_base_and_extent will return an offset of 96 because we are accessing the third element of the array. But the SFT we are looking for is actually at offset 160, counting from the top of struct X. Therefore, we need to adjust the offset given by get_base_ref_and_extent by the offset of VAR so that we can get at all the fields starting at VAR. Notice that this adjustment is ONLY necessary if the pointer is pointing inside a nested structure. If p_1 is pointing at the base structure of the nesting hierarchy, then no adjustment is necessary. Since this adjustment was being applied with every field reference, we were failing to add SFTs when some of the SFTs in the structure had been partitioned away into an MPT. In the specific case of PR 33870, there was a single field 'pDirty' inside a large structure that was being referenced with a pointer and with a regular variable. The field had the subvariable SFT.12 associated with it, with an offset of X bytes. Since SFT.12 was the only field that had not been partitioned away, when we were adding SFTs in the operand scanner, we would blindly adjust the offset of SFT.12 by X bytes and then trying to see if there was any SFT at offset X + X. There wasn't any, of course, and so no VOP was added to the memory load expression, which was then taken out of the loop because it was thought to be loop independent. Leading to the runtime failure. The only reason this didn't occur if partitioning was disabled was that the same blind offset adjustment was adding SFT.12 because it had ajdusted the offset of an SFT that was X bytes earlier in the structure. So, it was working by chance. By recognizing that this offset adjustment is only required for nested structures, we do not need to blindly adjust every offset and so we don't end up adding SFTs by accident. Tested on x86_64. Committed. [-- Attachment #2: 20071107-33870.diff.txt --] [-- Type: text/plain, Size: 7676 bytes --] 2007-11-07 Diego Novillo <dnovillo@google.com> PR 33870 * tree.h (struct tree_struct_field_tag): Add field in_nested_struct. (SFT_IN_NESTED_STRUCT): Define. * tree-dfa.c (dump_subvars_for): Show offset of each sub-var. * tree-flow.h (struct fieldoff): Add field in_nested_struct. * tree-ssa-structalias.c (struct variable_info): Likewise. (push_fields_onto_fieldstack): If OFFSET is positive, set in_nested_struct. (create_variable_info_for): Copy setting of in_nested_struct from the field offset object. (set_uids_in_ptset): Set SFT_IN_NESTED_STRUCT from the variable info object. * tree-ssa-operands.c (add_vars_for_offset): If VAR belongs to a nested structure, adjust OFFSET by SFT_OFFSET(VAR). testsuite/ChangeLog * gcc.c-torture/execute/pr33870.x: Remove. Index: tree.h =================================================================== --- tree.h (revision 129956) +++ tree.h (working copy) @@ -2573,15 +2573,21 @@ struct tree_struct_field_tag GTY(()) /* Size of the field. */ unsigned HOST_WIDE_INT size; + /* True if this SFT is for a field in a nested structure. */ + unsigned int in_nested_struct : 1; + /* Alias set for a DECL_NONADDRESSABLE_P field. Otherwise -1. */ alias_set_type alias_set; }; + #define SFT_PARENT_VAR(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.parent_var) #define SFT_OFFSET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.offset) #define SFT_SIZE(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.size) #define SFT_NONADDRESSABLE_P(NODE) \ (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1) #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set) +#define SFT_IN_NESTED_STRUCT(NODE) \ + (STRUCT_FIELD_TAG_CHECK (NODE)->sft.in_nested_struct) /* Memory Partition Tags (MPTs) group memory symbols under one common name for the purposes of placing memory PHI nodes. */ Index: testsuite/gcc.c-torture/execute/pr33870.x =================================================================== --- testsuite/gcc.c-torture/execute/pr33870.x (revision 129956) +++ testsuite/gcc.c-torture/execute/pr33870.x (working copy) @@ -1,9 +0,0 @@ -# The test breaks because of wrong alias info for -O2 and -Os - -set torture_eval_before_compile { - if {[string match {*-O[2s]*} "$option"]} { - set torture_execute_xfail "*-*-*" - } -} - -return 0 Index: tree-dfa.c =================================================================== --- tree-dfa.c (revision 129956) +++ tree-dfa.c (working copy) @@ -287,7 +287,7 @@ dump_subvars_for (FILE *file, tree var) for (i = 0; VEC_iterate (tree, sv, i, subvar); ++i) { print_generic_expr (file, subvar, dump_flags); - fprintf (file, " "); + fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED " ", SFT_OFFSET (subvar)); } fprintf (file, "}"); Index: tree-flow.h =================================================================== --- tree-flow.h (revision 129956) +++ tree-flow.h (working copy) @@ -1159,6 +1159,10 @@ struct fieldoff /* Field. */ tree decl; + /* True if this field is inside a structure nested inside the base + containing object. */ + unsigned int in_nested_struct : 1; + /* Offset from the base of the base containing object to this field. */ HOST_WIDE_INT offset; Index: tree-ssa-structalias.c =================================================================== --- tree-ssa-structalias.c (revision 129956) +++ tree-ssa-structalias.c (working copy) @@ -253,6 +253,15 @@ struct variable_info variable. This is used for C++ placement new. */ unsigned int no_tbaa_pruning : 1; + /* True if this variable is inside a structure nested in the + structure for the base variable. For instance, in + struct X { int a; struct Y { int b; int c; } }, the variables for + fields 'b' and 'c' are inside a nested structure. We are not + interested in tracking how many levels of nesting, just whether + there is nesting at all. This is later used to adjust offsets + for pointers pointing into sub-structures. */ + unsigned int in_nested_struct : 1; + /* Points-to set for this variable. */ bitmap solution; @@ -4133,6 +4142,12 @@ push_fields_onto_fieldstack (tree type, pair->alias_set = get_alias_set (addressable_type); else pair->alias_set = -1; + + /* If the base offset is positive, this field belongs to + a structure nested inside the base structure. */ + if (offset > 0) + pair->in_nested_struct = true; + count++; } else @@ -4181,6 +4196,12 @@ push_fields_onto_fieldstack (tree type, pair->alias_set = get_alias_set (addressable_type); else pair->alias_set = -1; + + /* If the base offset is positive, this field belongs to + a structure nested inside the base structure. */ + if (offset > 0) + pair->in_nested_struct = true; + count++; } else @@ -4491,6 +4512,7 @@ create_variable_info_for (tree decl, con newvi->offset = fo->offset; newvi->size = TREE_INT_CST_LOW (fo->size); newvi->fullsize = vi->fullsize; + newvi->in_nested_struct = fo->in_nested_struct; insert_into_field_list (vi, newvi); VEC_safe_push (varinfo_t, heap, varmap, newvi); if (is_global && (!flag_whole_program || !in_ipa_mode)) @@ -4743,6 +4765,7 @@ set_uids_in_ptset (tree ptr, bitmap into || (!is_derefed && !vi->directly_dereferenced) || alias_sets_conflict_p (ptr_alias_set, var_alias_set)) bitmap_set_bit (into, DECL_UID (sft)); + SFT_IN_NESTED_STRUCT (sft) = vi->in_nested_struct; } } else @@ -4946,7 +4969,6 @@ find_what_p_points_to (tree p) } /* Share the final set of variables when possible. */ - finished_solution = BITMAP_GGC_ALLOC (); stats.points_to_sets_created++; Index: tree-ssa-operands.c =================================================================== --- tree-ssa-operands.c (revision 129956) +++ tree-ssa-operands.c (working copy) @@ -1397,8 +1397,48 @@ add_vars_for_offset (tree var, unsigned subvar_t sv; unsigned int i; - /* Adjust offset by the pointed-to location. */ - offset += SFT_OFFSET (var); + if (SFT_IN_NESTED_STRUCT (var)) + { + /* Since VAR is an SFT inside a nested structure, the OFFSET + computed by get_ref_base_and_extent is the offset from the + start of the immediately containing structure. However, to + find out what other SFTs are affected by this reference, we + need to know the offsets starting at the root structure in + the nesting hierarchy. + + For instance, given the following structure: + + struct X { + int a; + struct Y { + int b; + struct Z { + int c[3]; + } d; + } e; + } m; + + and the following address expression: + + p_1 = &m.e.d; + + This structure will receive 5 SFTs, namely 2 for fields 'a' + and 'b' and 3 for the array 'c' in struct Z. So, the + reference p_1->c[2] and m.e.d.c[2] access the exact same + memory location (ie, SFT.5). + + Now, alias analysis computed the points-to set for pointer + p_1 as { SFT.3 } because that is the first field that p_1 + actually points to. When the expression p_1->c[2] is + analyzed, get_ref_base_and_extent will return an offset of 96 + because we are accessing the third element of the array. But + the SFT we are looking for is actually at offset 160, + counting from the top of struct X. + + Therefore, we adjust OFFSET by the offset of VAR so that we + can get at all the fields starting at VAR. */ + offset += SFT_OFFSET (var); + } /* Add all subvars of var that overlap with the access. Binary search for the first relevant SFT. */ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-08 0:24 Fix PR 33870 Diego Novillo @ 2007-11-08 0:35 ` Jakub Jelinek 2007-11-08 2:20 ` Diego Novillo 2007-11-08 10:45 ` Richard Guenther 1 sibling, 1 reply; 18+ messages in thread From: Jakub Jelinek @ 2007-11-08 0:35 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On Wed, Nov 07, 2007 at 07:24:32PM -0500, Diego Novillo wrote: > --- tree.h (revision 129956) > +++ tree.h (working copy) > @@ -2573,15 +2573,21 @@ struct tree_struct_field_tag GTY(()) > /* Size of the field. */ > unsigned HOST_WIDE_INT size; > > + /* True if this SFT is for a field in a nested structure. */ > + unsigned int in_nested_struct : 1; > + > /* Alias set for a DECL_NONADDRESSABLE_P field. Otherwise -1. */ > alias_set_type alias_set; > }; Doesn't this grow struct tree_struct_field_tag by in most cases 8 bytes? Aren't there enough bitfields that could be used for this bit instead? struct tree_memory_tag GTY(()) { struct tree_decl_minimal common; bitmap GTY ((skip)) aliases; unsigned int is_global:1; }; 31 or even 63 bits in tree_memory_tag (and then tree_base has a whole bunch of spare bits, so perhaps if is_global moved to one of those too (of course guarded with TREE_MEMORY_TAG_CHECK resp. STRUCT_FIELD_TAG_CHECK), we would save even 8 more bytes here. I know [tuples] does far more here, but even 4.3 will be used in the wild and every byte in often used structs counts. Jakub ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-08 0:35 ` Jakub Jelinek @ 2007-11-08 2:20 ` Diego Novillo 0 siblings, 0 replies; 18+ messages in thread From: Diego Novillo @ 2007-11-08 2:20 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches On Nov 7, 2007 7:35 PM, Jakub Jelinek <jakub@redhat.com> wrote: > Doesn't this grow struct tree_struct_field_tag by in most cases 8 bytes? Oops, silly me. Thanks for noticing. Fixed with: * tree.h (struct tree_struct_field_tag): Move field in_nested_struct ... (struct tree_memory_tag): ... here. Index: tree.h =================================================================== --- tree.h (revision 129976) +++ tree.h (working copy) @@ -2554,7 +2554,11 @@ struct tree_memory_tag GTY(()) bitmap GTY ((skip)) aliases; + /* True if this tag has global scope. */ unsigned int is_global:1; + + /* True if this SFT is for a field in a nested structure. */ + unsigned int in_nested_struct : 1; }; #define MTAG_GLOBAL(NODE) (TREE_MEMORY_TAG_CHECK (NODE)->mtag.is_global) @@ -2573,9 +2577,6 @@ struct tree_struct_field_tag GTY(()) /* Size of the field. */ unsigned HOST_WIDE_INT size; - /* True if this SFT is for a field in a nested structure. */ - unsigned int in_nested_struct : 1; - /* Alias set for a DECL_NONADDRESSABLE_P field. Otherwise -1. */ alias_set_type alias_set; }; @@ -2587,7 +2588,7 @@ struct tree_struct_field_tag GTY(()) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1) #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set) #define SFT_IN_NESTED_STRUCT(NODE) \ - (STRUCT_FIELD_TAG_CHECK (NODE)->sft.in_nested_struct) + (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.in_nested_struct) /* Memory Partition Tags (MPTs) group memory symbols under one common name for the purposes of placing memory PHI nodes. */ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-08 0:24 Fix PR 33870 Diego Novillo 2007-11-08 0:35 ` Jakub Jelinek @ 2007-11-08 10:45 ` Richard Guenther 2007-11-13 16:30 ` Diego Novillo 1 sibling, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-08 10:45 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On 11/8/07, Diego Novillo <dnovillo@google.com> wrote: > This patch fixes PR 33870. There were a couple of things at play here > related to the way we compute sub-variables for structures and arrays. > Particularly in the presence of nesting. > > When alias analysis computes points-to information for a structure for > which we have sub-variables, it never adds all the variables to the > set. It only adds the first SFT reachable by the pointer. > > So, if we have a pointer p_1 pointing into some structure and we > dereference a field with p_1->fld, we can find out what offset to use > by calling get_base_ref_and_extent. However, if that field is inside > a nested structure, the offset computed by get_ref_base_and_extent is > the offset from the start of the immediately containing structure. > However, to find out what other SFTs are affected by this reference, > we need to know the offsets starting at the root structure in the > nesting hierarchy. > > For instance, given the following structure: > > struct X { > int a; > struct Y { > int b; > struct Z { > int c[3]; > } d; > } e; > } m; > > and the following address expression: > > p_1 = &m.e.d; > > This structure will receive 5 SFTs, namely 2 for fields 'a' and 'b' > and 3 for the array 'c' in struct Z. So, the reference p_1->c[2] and > m.e.d.c[2] access the exact same memory location (ie, SFT.5). > > Now, alias analysis computed the points-to set for pointer p_1 as { > SFT.3 } because that is the first field that p_1 actually points to. > When the expression p_1->c[2] is analyzed, get_ref_base_and_extent > will return an offset of 96 because we are accessing the third element > of the array. But the SFT we are looking for is actually at offset > 160, counting from the top of struct X. > > Therefore, we need to adjust the offset given by > get_base_ref_and_extent by the offset of VAR so that we can get at all > the fields starting at VAR. > > Notice that this adjustment is ONLY necessary if the pointer is > pointing inside a nested structure. If p_1 is pointing at the base > structure of the nesting hierarchy, then no adjustment is necessary. > > Since this adjustment was being applied with every field reference, we > were failing to add SFTs when some of the SFTs in the structure had > been partitioned away into an MPT. > > In the specific case of PR 33870, there was a single field 'pDirty' > inside a large structure that was being referenced with a pointer and > with a regular variable. The field had the subvariable SFT.12 > associated with it, with an offset of X bytes. > > Since SFT.12 was the only field that had not been partitioned away, > when we were adding SFTs in the operand scanner, we would blindly > adjust the offset of SFT.12 by X bytes and then trying to see if there > was any SFT at offset X + X. > > There wasn't any, of course, and so no VOP was added to the memory > load expression, which was then taken out of the loop because it was > thought to be loop independent. Leading to the runtime failure. > > The only reason this didn't occur if partitioning was disabled was > that the same blind offset adjustment was adding SFT.12 because it had > ajdusted the offset of an SFT that was X bytes earlier in the > structure. So, it was working by chance. > > By recognizing that this offset adjustment is only required for nested > structures, we do not need to blindly adjust every offset and so we > don't end up adding SFTs by accident. > > Tested on x86_64. Committed. You are only working around the problem and do not address the fundamental issue. The following still fails: extern void abort (void); typedef struct PgHdr PgHdr; typedef unsigned char u8; struct PgHdr { int y; struct { unsigned int pgno; PgHdr *pNextHash, *pPrevHash; PgHdr *pNextFree, *pPrevFree; PgHdr *pNextAll; u8 inJournal; short int nRef; PgHdr *pDirty, *pPrevDirty; unsigned int notUsed; } x; }; volatile PgHdr **xx; static inline PgHdr *merge_pagelist(PgHdr *pA, PgHdr *pB) { PgHdr result; PgHdr *pTail; xx = &result.x.pDirty; pTail = &result; while( pA && pB ){ if( pA->x.pgno<pB->x.pgno ){ pTail->x.pDirty = pA; pTail = pA; pA = pA->x.pDirty; }else{ pTail->x.pDirty = pB; pTail = pB; pB = pB->x.pDirty; } } if( pA ){ pTail->x.pDirty = pA; }else if( pB ){ pTail->x.pDirty = pB; }else{ pTail->x.pDirty = 0; } return result.x.pDirty; } PgHdr * __attribute__((noinline)) sort_pagelist(PgHdr *pIn) { PgHdr *a[25], *p; int i; __builtin_memset (a, 0, sizeof (a)); while( pIn ){ p = pIn; pIn = p->x.pDirty; p->x.pDirty = 0; for(i=0; i<25 -1; i++){ if( a[i]==0 ){ a[i] = p; break; }else{ p = merge_pagelist(a[i], p); a[i] = 0; a[i] = 0; } } if( i==25 -1 ){ a[i] = merge_pagelist(a[i], p); } } p = a[0]; for(i=1; i<25; i++){ p = merge_pagelist (p, a[i]); } return p; } int main() { PgHdr a[5]; PgHdr *p; a[0].x.pgno = 5; a[0].x.pDirty = &a[1]; a[1].x.pgno = 4; a[1].x.pDirty = &a[2]; a[2].x.pgno = 1; a[2].x.pDirty = &a[3]; a[3].x.pgno = 3; a[3].x.pDirty = 0; p = sort_pagelist (&a[0]); if (p->x.pDirty == p) abort (); return 0; } taking the address of result.x.pDirty causes the SFT to be marked as in substructure and the problem re-surfaces. Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-08 10:45 ` Richard Guenther @ 2007-11-13 16:30 ` Diego Novillo 2007-11-13 16:47 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Diego Novillo @ 2007-11-13 16:30 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 4038 bytes --] Richard Guenther wrote: > taking the address of result.x.pDirty causes the SFT to be marked as > in substructure and the problem re-surfaces. Yes. Thanks for the test case. The original patch did not take into account that the nesting level of the memory reference is also important. There are two problems here: 1- Given an arbitrary memory expression M dereferencing a pointer P into a nested structure, the offset computed for M is the offset starting at P. However, the offsets computed for the original SFTs are offsets from the top of the pointed-to memory object O. So, when we recognize this situation, we must adjust the offsets. This was the situation recognized and fixed by the original patch. The bug was in the detection of whether P is "pointing to the middle of an object". This is where the nesting level comes in to play. Suppose that M was the expression p_3->b[3] and the original pointed-to object was v.x.y: struct { ... struct { ... struct { ... int b[4]; } y; } x; } v; p_3 = &v.x.y; If alias analysis created SFTs for this structure, the array 'b' will have 4 SFTs associated with it. Say SFT.12, SFT.13, SFT.14 and SFT.15. However, only SFT.12 will be added to the alias set of p_3's name tag. So, when we see the expression 'p_3->b[3]', we compute an offset of 96, but we need SFT.14 which will probably have an offset much higher than that. So, we adjust 96 by the offset of SFT.12 and get to SFT.14. In the original patch, we recognized this situation by asking whether the SFT.12 was inside a nested structure. However, it may happen that the pointer was actually pointing to the base of the original object 'v'. In which case, we would have the expression 'p_3->x.y.b[3]', which also takes us to SFT.12. But in this case, the offset we computed from 'p_3->x.y.b[3]' is *already* the right offset, so adding the offset of SFT.12 to it gives us the wrong offset. So, what this patch does is determine the nesting level of each SFT and the nesting level of the memory expression. We only need to adjust the offset of an SFT if the nesting level of the memory expression is lower than the nesting level of the SFT that we are analyzing. 2- The second problem we have with this scheme is that we rely on the presence of key SFTs to adjust offsets when we detect references to nested aggregates. This is something we cannot easily undo because inside the operand scanner we cannot really start doing use-def chain chasing to determine the original pointed-to objects. This work has already been done by alias analysis, so duplicating this in the operand scanner would be wasteful. This means that this scheme cannot tolerate these key SFTs to be partitioned away. If they are partitioned, then inside the operand scanner is essentially impossible to determine where in the structure is an arbitrary memory expression referring to. This is shown in the new test gcc.dg/tree-ssa/alias-16.c. One way to deal with this would be to use the pointed-to set for the base pointer instead of the alias set of the name tag. This has the problem that we'd miss the compile-time benefits of partitioning, as we would be traversing the original sets instead of the partitioned (smaller) alias sets. Also, we may want to get rid of the duplicate pointed-to/alias sets in the future (not sure if they're both worth keeping). The other way of dealing with this is to recognize which SFTs are important to not partition and make sure they are never partitioned. During alias analysis, if a nested SFT is added to the alias set of a pointer, then it is marked as unpartitionable. The partitioner, in turn, gives these SFTs a very high score and makes sure that they are never added to a partition. The effects of this are/should be negligible as it only involves a single SFT and nested structures. The new test alias-16.c tests for these edge cases. Tested on x86_64 and ppc64. Committed to trunk. Diego. [-- Attachment #2: 20071113-33870-nesting-level.diff.txt --] [-- Type: text/plain, Size: 21492 bytes --] 2007-11-12 Diego Novillo <dnovillo@google.com> PR 33870 * tree.h (strcut tree_memory_tag): Add field unpartitionable. Remove field in_nested_struct. (struct tree_struct_field_tag): Add field nesting_level. (SFT_IN_NESTED_STRUCT): Remove. (SFT_NESTING_LEVEL): Define. (SFT_UNPARTITIONABLE_P): Define. * tree-ssa-alias.c (mem_sym_score): If MP->VAR is not partitionable, return LONG_MAX. (compute_memory_partitions): Do not partition SFTs marked unpartitionable. (create_sft): Add argument NESTING_LEVEL. Set SFT_NESTING_LEVEL with it. Update all users. (create_overlap_variables_for): Show nesting level. * tree-dfa.c (dump_subvars_for): Likewise. (dump_variable): Likewise. Show whether the SFT is partitionable or not. * tree-flow.h (struct fieldoff): Remove field in_nested_struct. Add field nesting_level. * tree-ssa-structalias.c (struct variable_info): Remove field in_nested_struct. (push_fields_onto_fieldstack): Add argument NESTING_LEVEL. Update all users. Update documentation. Update PAIR->NESTING_LEVEL with NESTING_LEVEL. Make recursive calls with NESTING_LEVEL + 1. (set_uids_in_ptset): If an SFT is added to the points-to set, mark it as unpartitionable. * tree-ssa-operands.c (ref_nesting_level): New. (add_vars_for_offset): Call it. Add argument FULL_REF. Update callers. If VAR is inside a nested structure and the nesting level of FULL_REF is lower than the nesting level of VAR, adjust OFFSET by the offset of VAR. testsuite/ChangeLog PR 33870 * gcc.c-torture/execute/pr33870-1.c: New test. * gcc.dg/tree-ssa/alias-16.c: New test. Index: tree.h =================================================================== --- tree.h (revision 130139) +++ tree.h (working copy) @@ -2555,10 +2555,10 @@ struct tree_memory_tag GTY(()) bitmap GTY ((skip)) aliases; /* True if this tag has global scope. */ - unsigned int is_global:1; + unsigned int is_global : 1; - /* True if this SFT is for a field in a nested structure. */ - unsigned int in_nested_struct : 1; + /* True if this tag should not be grouped into a memory partition. */ + unsigned int unpartitionable : 1; }; #define MTAG_GLOBAL(NODE) (TREE_MEMORY_TAG_CHECK (NODE)->mtag.is_global) @@ -2579,6 +2579,11 @@ struct tree_struct_field_tag GTY(()) /* Alias set for a DECL_NONADDRESSABLE_P field. Otherwise -1. */ alias_set_type alias_set; + + /* Nesting level for this subvariable. This indicates how many + structures are wrapping this field. Fields at the top level have + a nesting level of 0. */ + unsigned int nesting_level; }; #define SFT_PARENT_VAR(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.parent_var) @@ -2587,8 +2592,10 @@ struct tree_struct_field_tag GTY(()) #define SFT_NONADDRESSABLE_P(NODE) \ (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1) #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set) -#define SFT_IN_NESTED_STRUCT(NODE) \ - (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.in_nested_struct) +#define SFT_NESTING_LEVEL(NODE) \ + (STRUCT_FIELD_TAG_CHECK (NODE)->sft.nesting_level) +#define SFT_UNPARTITIONABLE_P(NODE) \ + (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.unpartitionable) /* Memory Partition Tags (MPTs) group memory symbols under one common name for the purposes of placing memory PHI nodes. */ Index: testsuite/gcc.c-torture/execute/pr33870-1.c =================================================================== --- testsuite/gcc.c-torture/execute/pr33870-1.c (revision 0) +++ testsuite/gcc.c-torture/execute/pr33870-1.c (revision 0) @@ -0,0 +1,94 @@ +extern void abort (void); + +typedef struct PgHdr PgHdr; +typedef unsigned char u8; +struct PgHdr { +int y; +struct { + unsigned int pgno; + PgHdr *pNextHash, *pPrevHash; + PgHdr *pNextFree, *pPrevFree; + PgHdr *pNextAll; + u8 inJournal; + short int nRef; + PgHdr *pDirty, *pPrevDirty; + unsigned int notUsed; +} x; +}; +PgHdr **xx; +volatile int vx; +static inline PgHdr *merge_pagelist(PgHdr *pA, PgHdr *pB) +{ + PgHdr result; + PgHdr *pTail; + xx = &result.x.pDirty; + pTail = &result; + while( pA && pB ){ + if( pA->x.pgno<pB->x.pgno ){ + pTail->x.pDirty = pA; + pTail = pA; + pA = pA->x.pDirty; + }else{ + pTail->x.pDirty = pB; + pTail = pB; + pB = pB->x.pDirty; + } + vx = (*xx)->y; + } + if( pA ){ + pTail->x.pDirty = pA; + }else if( pB ){ + pTail->x.pDirty = pB; + }else{ + pTail->x.pDirty = 0; + } + return result.x.pDirty; +} + +PgHdr * __attribute__((noinline)) sort_pagelist(PgHdr *pIn) +{ + PgHdr *a[25], *p; + int i; + __builtin_memset (a, 0, sizeof (a)); + while( pIn ){ + p = pIn; + pIn = p->x.pDirty; + p->x.pDirty = 0; + for(i=0; i<25 -1; i++){ + if( a[i]==0 ){ + a[i] = p; + break; + }else{ + p = merge_pagelist(a[i], p); + a[i] = 0; + a[i] = 0; + } + } + if( i==25 -1 ){ + a[i] = merge_pagelist(a[i], p); + } + } + p = a[0]; + for(i=1; i<25; i++){ + p = merge_pagelist (p, a[i]); + } + return p; +} + +int main() +{ + PgHdr a[5]; + PgHdr *p; + a[0].x.pgno = 5; + a[0].x.pDirty = &a[1]; + a[1].x.pgno = 4; + a[1].x.pDirty = &a[2]; + a[2].x.pgno = 1; + a[2].x.pDirty = &a[3]; + a[3].x.pgno = 3; + a[3].x.pDirty = 0; + p = sort_pagelist (&a[0]); + if (p->x.pDirty == p) + abort (); + return 0; +} Index: testsuite/gcc.dg/tree-ssa/alias-16.c =================================================================== --- testsuite/gcc.dg/tree-ssa/alias-16.c (revision 0) +++ testsuite/gcc.dg/tree-ssa/alias-16.c (revision 0) @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-options "-O --param max-aliased-vops=1" } */ + +/* Compile with -O --param max-aliased-vops=1. This partitions all + the initial SFTs for 'm' which was causing the operand scanner to + miss adding the right SFTs to p->b[2]. */ +extern void abort (void); + +struct X { + int a; + struct Y { + int b[4]; + } b; + struct Y c; +} m; + +struct X n; + +foo (int i) +{ + struct Y *p = (i > 10) ? &m.b : &n.c; + p->b[2] = 10; + m.b.b[3] = 6; + n.c.b[2] = 3; + return p->b[2] + n.c.b[2] + m.b.b[3]; +} + +main() +{ + if (foo (3) != 12) + abort (); + return 0; +} Index: tree-ssa-alias.c =================================================================== --- tree-ssa-alias.c (revision 130139) +++ tree-ssa-alias.c (working copy) @@ -828,6 +828,13 @@ count_mem_refs (long *num_vuses_p, long static inline long mem_sym_score (mem_sym_stats_t mp) { + /* Unpartitionable SFTs are automatically thrown to the bottom of + the list. They are not stored in partitions, but they are used + for computing overall statistics. */ + if (TREE_CODE (mp->var) == STRUCT_FIELD_TAG + && SFT_UNPARTITIONABLE_P (mp->var)) + return LONG_MAX; + return mp->frequency_writes * 64 + mp->frequency_reads * 32 + mp->num_direct_writes * 16 + mp->num_direct_reads * 8 + mp->num_indirect_writes * 4 + mp->num_indirect_reads * 2 @@ -1392,8 +1399,8 @@ update_reference_counts (struct mem_ref_ static void build_mp_info (struct mem_ref_stats_d *mem_ref_stats, - VEC(mem_sym_stats_t,heap) **mp_info_p, - VEC(tree,heap) **tags_p) + VEC(mem_sym_stats_t,heap) **mp_info_p, + VEC(tree,heap) **tags_p) { tree var; referenced_var_iterator rvi; @@ -1591,6 +1598,15 @@ compute_memory_partitions (void) if (!need_to_partition_p (mem_ref_stats)) break; + /* SFTs that are marked unpartitionable should not be added to + partitions. These SFTs are special because they mark the + first SFT into a structure where a pointer is pointing to. + This is needed by the operand scanner to find adjacent + fields. See add_vars_for_offset for details. */ + if (TREE_CODE (mp_p->var) == STRUCT_FIELD_TAG + && SFT_UNPARTITIONABLE_P (mp_p->var)) + continue; + mpt = find_partition_for (mp_p); estimate_vop_reduction (mem_ref_stats, mp_p, mpt); } @@ -3774,7 +3790,8 @@ get_or_create_used_part_for (size_t uid) static tree create_sft (tree var, tree field, unsigned HOST_WIDE_INT offset, - unsigned HOST_WIDE_INT size, alias_set_type alias_set) + unsigned HOST_WIDE_INT size, alias_set_type alias_set, + unsigned nesting_level) { tree subvar = create_tag_raw (STRUCT_FIELD_TAG, field, "SFT"); @@ -3794,6 +3811,8 @@ create_sft (tree var, tree field, unsign SFT_OFFSET (subvar) = offset; SFT_SIZE (subvar) = size; SFT_ALIAS_SET (subvar) = alias_set; + SFT_NESTING_LEVEL (subvar) = nesting_level; + return subvar; } @@ -3814,7 +3833,7 @@ create_overlap_variables_for (tree var) return; push_fields_onto_fieldstack (TREE_TYPE (var), &fieldstack, 0, NULL, - TREE_TYPE (var)); + TREE_TYPE (var), 0); if (VEC_length (fieldoff_s, fieldstack) != 0) { subvar_t *subvars; @@ -3897,7 +3916,6 @@ create_overlap_variables_for (tree var) field, skip it. Note that we always need the field at offset 0 so we can properly handle pointers to the structure. */ - if ((fo->offset != 0 && ((fo->offset <= up->minused && fo->offset + fosize <= up->minused) @@ -3906,8 +3924,9 @@ create_overlap_variables_for (tree var) && fosize == lastfosize && currfotype == lastfotype)) continue; - subvar = create_sft (var, fo->type, fo->offset, - fosize, fo->alias_set); + + subvar = create_sft (var, fo->type, fo->offset, fosize, + fo->alias_set, fo->nesting_level); VEC_quick_push (tree, *subvars, subvar); if (dump_file) @@ -3918,7 +3937,8 @@ create_overlap_variables_for (tree var) SFT_OFFSET (subvar)); fprintf (dump_file, " size " HOST_WIDE_INT_PRINT_DEC, SFT_SIZE (subvar)); - fprintf (dump_file, "\n"); + fprintf (dump_file, " nesting level %d\n", + SFT_NESTING_LEVEL (subvar)); } lastfotype = currfotype; Index: tree-dfa.c =================================================================== --- tree-dfa.c (revision 130139) +++ tree-dfa.c (working copy) @@ -287,7 +287,9 @@ dump_subvars_for (FILE *file, tree var) for (i = 0; VEC_iterate (tree, sv, i, subvar); ++i) { print_generic_expr (file, subvar, dump_flags); - fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED " ", SFT_OFFSET (subvar)); + fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED, SFT_OFFSET (subvar)); + fprintf (file, "[%u]", SFT_NESTING_LEVEL (subvar)); + fprintf (file, " "); } fprintf (file, "}"); @@ -417,6 +419,15 @@ dump_variable (FILE *file, tree var) fprintf (file, ", partition symbols: "); dump_decl_set (file, MPT_SYMBOLS (var)); } + + if (TREE_CODE (var) == STRUCT_FIELD_TAG) + { + fprintf (file, ", offset: " HOST_WIDE_INT_PRINT_UNSIGNED, + SFT_OFFSET (var)); + fprintf (file, ", nesting: %u", SFT_NESTING_LEVEL (var)); + fprintf (file, ", partitionable: %s", + SFT_UNPARTITIONABLE_P (var) ? "NO" : "YES"); + } } fprintf (file, "\n"); Index: tree-flow.h =================================================================== --- tree-flow.h (revision 130139) +++ tree-flow.h (working copy) @@ -1159,9 +1159,9 @@ struct fieldoff /* Field. */ tree decl; - /* True if this field is inside a structure nested inside the base - containing object. */ - unsigned int in_nested_struct : 1; + /* Nesting level. This number represents how many structures are + wrapping this field. */ + unsigned nesting_level; /* Offset from the base of the base containing object to this field. */ HOST_WIDE_INT offset; @@ -1173,8 +1173,8 @@ typedef struct fieldoff fieldoff_s; DEF_VEC_O(fieldoff_s); DEF_VEC_ALLOC_O(fieldoff_s,heap); -int push_fields_onto_fieldstack (tree, VEC(fieldoff_s,heap) **, - HOST_WIDE_INT, bool *, tree); +int push_fields_onto_fieldstack (tree, VEC(fieldoff_s,heap) **, HOST_WIDE_INT, + bool *, tree, unsigned); void sort_fieldstack (VEC(fieldoff_s,heap) *); void init_alias_heapvars (void); Index: tree-ssa-structalias.c =================================================================== --- tree-ssa-structalias.c (revision 130139) +++ tree-ssa-structalias.c (working copy) @@ -253,15 +253,6 @@ struct variable_info variable. This is used for C++ placement new. */ unsigned int no_tbaa_pruning : 1; - /* True if this variable is inside a structure nested in the - structure for the base variable. For instance, in - struct X { int a; struct Y { int b; int c; } }, the variables for - fields 'b' and 'c' are inside a nested structure. We are not - interested in tracking how many levels of nesting, just whether - there is nesting at all. This is later used to adjust offsets - for pointers pointing into sub-structures. */ - unsigned int in_nested_struct : 1; - /* Points-to set for this variable. */ bitmap solution; @@ -4050,19 +4041,28 @@ sort_fieldstack (VEC(fieldoff_s,heap) *f fieldoff_compare); } -/* Given a TYPE, and a vector of field offsets FIELDSTACK, push all the fields - of TYPE onto fieldstack, recording their offsets along the way. - OFFSET is used to keep track of the offset in this entire structure, rather - than just the immediately containing structure. Returns the number - of fields pushed. +/* Given a TYPE, and a vector of field offsets FIELDSTACK, push all + the fields of TYPE onto fieldstack, recording their offsets along + the way. + + OFFSET is used to keep track of the offset in this entire + structure, rather than just the immediately containing structure. + Returns the number of fields pushed. + HAS_UNION is set to true if we find a union type as a field of - TYPE. ADDRESSABLE_TYPE is the type of the outermost object that could have - its address taken. */ + TYPE. + + ADDRESSABLE_TYPE is the type of the outermost object that could + have its address taken. + + NESTING_LEVEL indicates whether TYPE is a structure nested inside + another, it starts at 0 and it is incremented by one on every + structure recursed into. */ int push_fields_onto_fieldstack (tree type, VEC(fieldoff_s,heap) **fieldstack, HOST_WIDE_INT offset, bool *has_union, - tree addressable_type) + tree addressable_type, unsigned nesting_level) { tree field; int count = 0; @@ -4119,11 +4119,14 @@ push_fields_onto_fieldstack (tree type, if (!AGGREGATE_TYPE_P (TREE_TYPE (type))) /* var_can_have_subvars */ push = true; else if (!(pushed = push_fields_onto_fieldstack - (TREE_TYPE (type), fieldstack, - offset + i * TREE_INT_CST_LOW (elsz), has_union, + (TREE_TYPE (type), + fieldstack, + offset + i * TREE_INT_CST_LOW (elsz), + has_union, (TYPE_NONALIASED_COMPONENT (type) ? addressable_type - : TREE_TYPE (type))))) + : TREE_TYPE (type)), + nesting_level + 1))) /* Empty structures may have actual size, like in C++. So see if we didn't push any subfields and the size is nonzero, push the field onto the stack */ @@ -4142,12 +4145,7 @@ push_fields_onto_fieldstack (tree type, pair->alias_set = get_alias_set (addressable_type); else pair->alias_set = -1; - - /* If the base offset is positive, this field belongs to - a structure nested inside the base structure. */ - if (offset > 0) - pair->in_nested_struct = true; - + pair->nesting_level = nesting_level; count++; } else @@ -4171,11 +4169,14 @@ push_fields_onto_fieldstack (tree type, if (!var_can_have_subvars (field)) push = true; else if (!(pushed = push_fields_onto_fieldstack - (TREE_TYPE (field), fieldstack, - offset + bitpos_of_field (field), has_union, + (TREE_TYPE (field), + fieldstack, + offset + bitpos_of_field (field), + has_union, (DECL_NONADDRESSABLE_P (field) ? addressable_type - : TREE_TYPE (field)))) + : TREE_TYPE (field)), + nesting_level + 1)) && DECL_SIZE (field) && !integer_zerop (DECL_SIZE (field))) /* Empty structures may have actual size, like in C++. So @@ -4196,12 +4197,7 @@ push_fields_onto_fieldstack (tree type, pair->alias_set = get_alias_set (addressable_type); else pair->alias_set = -1; - - /* If the base offset is positive, this field belongs to - a structure nested inside the base structure. */ - if (offset > 0) - pair->in_nested_struct = true; - + pair->nesting_level = nesting_level; count++; } else @@ -4401,7 +4397,7 @@ create_variable_info_for (tree decl, con if (var_can_have_subvars (decl) && use_field_sensitive && !hasunion) { push_fields_onto_fieldstack (decltype, &fieldstack, 0, &hasunion, - decltype); + decltype, 0); if (hasunion) { VEC_free (fieldoff_s, heap, fieldstack); @@ -4512,7 +4508,6 @@ create_variable_info_for (tree decl, con newvi->offset = fo->offset; newvi->size = TREE_INT_CST_LOW (fo->size); newvi->fullsize = vi->fullsize; - newvi->in_nested_struct = fo->in_nested_struct; insert_into_field_list (vi, newvi); VEC_safe_push (varinfo_t, heap, varmap, newvi); if (is_global && (!flag_whole_program || !in_ipa_mode)) @@ -4764,8 +4759,20 @@ set_uids_in_ptset (tree ptr, bitmap into if (no_tbaa_pruning || (!is_derefed && !vi->directly_dereferenced) || alias_sets_conflict_p (ptr_alias_set, var_alias_set)) - bitmap_set_bit (into, DECL_UID (sft)); - SFT_IN_NESTED_STRUCT (sft) = vi->in_nested_struct; + { + bitmap_set_bit (into, DECL_UID (sft)); + + /* If SFT is inside a nested structure, it will + be needed by the operand scanner to adjust + offsets when adding operands to memory + expressions that dereference PTR. This means + that memory partitioning may not partition + this SFT because the operand scanner will not + be able to find the other SFTs next to this + one. */ + if (SFT_NESTING_LEVEL (sft) > 0) + SFT_UNPARTITIONABLE_P (sft) = true; + } } } else Index: tree-ssa-operands.c =================================================================== --- tree-ssa-operands.c (revision 130139) +++ tree-ssa-operands.c (working copy) @@ -1367,8 +1367,33 @@ access_can_touch_variable (tree ref, tre return true; } + +/* Given an aggregate expression FULL_REF, return the number of + aggregates that are containing FULL_REF. So, given a structure + reference a.b.c.d, the nesting level for this expression is 2 (the + number of '.' in the expression minus 1). */ + +static unsigned +ref_nesting_level (tree full_ref) +{ + unsigned nesting_level = 0; + + if (!handled_component_p (full_ref)) + return 0; + + full_ref = TREE_OPERAND (full_ref, 0); + while (handled_component_p (full_ref)) + { + nesting_level++; + full_ref = TREE_OPERAND (full_ref, 0); + } + + return nesting_level; +} + + /* Add the actual variables FULL_REF can access, given a member of - full_ref's points-to set VAR, where FULL_REF is an access of SIZE at + FULL_REF's points-to set VAR, where FULL_REF is an access of SIZE at OFFSET from var. IS_CALL_SITE is true if this is a call, and IS_DEF is true if this is supposed to be a vdef, and false if this should be a VUSE. @@ -1386,10 +1411,12 @@ access_can_touch_variable (tree ref, tre This is necessary because foop only actually points to foo's first member, so that is all the points-to set contains. However, an access to foop->a may be touching some single SFT if we have created some - SFT's for a structure. */ + SFT's for a structure. + + FULL_REF is the original memory expression being analyzed. */ static bool -add_vars_for_offset (tree var, unsigned HOST_WIDE_INT offset, +add_vars_for_offset (tree full_ref, tree var, unsigned HOST_WIDE_INT offset, unsigned HOST_WIDE_INT size, bool is_def) { bool added = false; @@ -1397,14 +1424,21 @@ add_vars_for_offset (tree var, unsigned subvar_t sv; unsigned int i; - if (SFT_IN_NESTED_STRUCT (var)) + if (full_ref + && SFT_NESTING_LEVEL (var) > 0 + && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var)) { /* Since VAR is an SFT inside a nested structure, the OFFSET computed by get_ref_base_and_extent is the offset from the - start of the immediately containing structure. However, to - find out what other SFTs are affected by this reference, we - need to know the offsets starting at the root structure in - the nesting hierarchy. + start of the immediately containing structure. If VAR is an + SFT inside a nested structure, then FULL_REF may be a + reference to the structure immediately enclosing SFT, and so + OFFSET will be the offset from the start of the immediately + enclosing structure. + + However, to find out what other SFTs are affected by this + reference, we need to know the offsets starting at the root + structure in the nesting hierarchy. For instance, given the following structure: @@ -1541,7 +1575,7 @@ add_virtual_operand (tree var, stmt_ann_ if it is a potential points-to location. */ if (TREE_CODE (al) == STRUCT_FIELD_TAG && TREE_CODE (var) == NAME_MEMORY_TAG) - none_added &= !add_vars_for_offset (al, offset, size, + none_added &= !add_vars_for_offset (full_ref, al, offset, size, flags & opf_def); else { ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-13 16:30 ` Diego Novillo @ 2007-11-13 16:47 ` Richard Guenther 2007-11-14 12:36 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-13 16:47 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On Nov 13, 2007 4:44 PM, Diego Novillo <dnovillo@google.com> wrote: > Richard Guenther wrote: > > > taking the address of result.x.pDirty causes the SFT to be marked as > > in substructure and the problem re-surfaces. > > Yes. Thanks for the test case. The original patch did not take into > account that the nesting level of the memory reference is also important. > > There are two problems here: > > 1- Given an arbitrary memory expression M dereferencing a pointer P into > a nested structure, the offset computed for M is the offset starting at > P. However, the offsets computed for the original SFTs are offsets from > the top of the pointed-to memory object O. So, when we recognize this > situation, we must adjust the offsets. This was the situation > recognized and fixed by the original patch. > > The bug was in the detection of whether P is "pointing to the middle of > an object". This is where the nesting level comes in to play. Suppose > that M was the expression p_3->b[3] and the original pointed-to object > was v.x.y: > > struct { > ... > struct { > ... > struct { > ... > int b[4]; > } y; > } x; > } v; > > p_3 = &v.x.y; > > If alias analysis created SFTs for this structure, the array 'b' will > have 4 SFTs associated with it. Say SFT.12, SFT.13, SFT.14 and SFT.15. > However, only SFT.12 will be added to the alias set of p_3's name tag. > > So, when we see the expression 'p_3->b[3]', we compute an offset of 96, > but we need SFT.14 which will probably have an offset much higher than > that. So, we adjust 96 by the offset of SFT.12 and get to SFT.14. > > In the original patch, we recognized this situation by asking whether > the SFT.12 was inside a nested structure. However, it may happen that > the pointer was actually pointing to the base of the original object > 'v'. In which case, we would have the expression 'p_3->x.y.b[3]', which > also takes us to SFT.12. > > But in this case, the offset we computed from 'p_3->x.y.b[3]' is > *already* the right offset, so adding the offset of SFT.12 to it gives > us the wrong offset. So, what this patch does is determine the nesting > level of each SFT and the nesting level of the memory expression. We > only need to adjust the offset of an SFT if the nesting level of the > memory expression is lower than the nesting level of the SFT that we are > analyzing. > > > 2- The second problem we have with this scheme is that we rely on the > presence of key SFTs to adjust offsets when we detect references to > nested aggregates. This is something we cannot easily undo because > inside the operand scanner we cannot really start doing use-def chain > chasing to determine the original pointed-to objects. This work has > already been done by alias analysis, so duplicating this in the operand > scanner would be wasteful. > > This means that this scheme cannot tolerate these key SFTs to be > partitioned away. If they are partitioned, then inside the operand > scanner is essentially impossible to determine where in the structure is > an arbitrary memory expression referring to. This is shown in the new > test gcc.dg/tree-ssa/alias-16.c. > > One way to deal with this would be to use the pointed-to set for the > base pointer instead of the alias set of the name tag. This has the > problem that we'd miss the compile-time benefits of partitioning, as we > would be traversing the original sets instead of the partitioned > (smaller) alias sets. Also, we may want to get rid of the duplicate > pointed-to/alias sets in the future (not sure if they're both worth > keeping). > > The other way of dealing with this is to recognize which SFTs are > important to not partition and make sure they are never partitioned. > During alias analysis, if a nested SFT is added to the alias set of a > pointer, then it is marked as unpartitionable. The partitioner, in > turn, gives these SFTs a very high score and makes sure that they are > never added to a partition. The effects of this are/should be > negligible as it only involves a single SFT and nested structures. > > The new test alias-16.c tests for these edge cases. > > Tested on x86_64 and ppc64. Committed to trunk. Thanks. Now, you should be able to do all this without introducing all the nesting-level stuff if you make all pointed-to SFTs unpartitionable. In fact, > - if (SFT_IN_NESTED_STRUCT (var)) > + if (full_ref > + && SFT_NESTING_LEVEL (var) > 0 > + && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var)) > { > /* Since VAR is an SFT inside a nested structure, the OFFSET > computed by get_ref_base_and_extent is the offset from the > - start of the immediately containing structure. However, to > - find out what other SFTs are affected by this reference, we > - need to know the offsets starting at the root structure in > - the nesting hierarchy. > + start of the immediately containing structure. If VAR is an > + SFT inside a nested structure, then FULL_REF may be a > + reference to the structure immediately enclosing SFT, and so > + OFFSET will be the offset from the start of the immediately > + enclosing structure. > + > + However, to find out what other SFTs are affected by this > + reference, we need to know the offsets starting at the root > + structure in the nesting hierarchy. > > For instance, given the following structure: This looks like a pure workaround for that you do not retain the offset zero SFT as unpartitionable. Otherwise you'd enter this function with it (if it is pointed-to), and you can always add the pointed-to SFT offset. I believe your patch makes it more twisted than it needs to be (it doesn't even look like an optimization, as you are scanning for pointed-to offset zero multiple times, for all pointed-to SFTs with a nesting level > than the ref). And don't you still get struct X { int i; int a[4]; } x; p = &x->a[0]; return p[3]; wrong? You don't keep the SFT for a[0] from being partitioned away(?). [it get's twisted to force GCC to generate IL with an ARRAY_REF here, but with fortran it's possible I believe, so this is a theory example] Thanks, Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-13 16:47 ` Richard Guenther @ 2007-11-14 12:36 ` Richard Guenther 2007-11-14 12:41 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-14 12:36 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On Nov 13, 2007 5:08 PM, Richard Guenther <richard.guenther@gmail.com> wrote: > > On Nov 13, 2007 4:44 PM, Diego Novillo <dnovillo@google.com> wrote: > > Richard Guenther wrote: > > > > > taking the address of result.x.pDirty causes the SFT to be marked as > > > in substructure and the problem re-surfaces. > > > > Yes. Thanks for the test case. The original patch did not take into > > account that the nesting level of the memory reference is also important. > > > > There are two problems here: > > > > 1- Given an arbitrary memory expression M dereferencing a pointer P into > > a nested structure, the offset computed for M is the offset starting at > > P. However, the offsets computed for the original SFTs are offsets from > > the top of the pointed-to memory object O. So, when we recognize this > > situation, we must adjust the offsets. This was the situation > > recognized and fixed by the original patch. > > > > The bug was in the detection of whether P is "pointing to the middle of > > an object". This is where the nesting level comes in to play. Suppose > > that M was the expression p_3->b[3] and the original pointed-to object > > was v.x.y: > > > > struct { > > ... > > struct { > > ... > > struct { > > ... > > int b[4]; > > } y; > > } x; > > } v; > > > > p_3 = &v.x.y; > > > > If alias analysis created SFTs for this structure, the array 'b' will > > have 4 SFTs associated with it. Say SFT.12, SFT.13, SFT.14 and SFT.15. > > However, only SFT.12 will be added to the alias set of p_3's name tag. > > > > So, when we see the expression 'p_3->b[3]', we compute an offset of 96, > > but we need SFT.14 which will probably have an offset much higher than > > that. So, we adjust 96 by the offset of SFT.12 and get to SFT.14. > > > > In the original patch, we recognized this situation by asking whether > > the SFT.12 was inside a nested structure. However, it may happen that > > the pointer was actually pointing to the base of the original object > > 'v'. In which case, we would have the expression 'p_3->x.y.b[3]', which > > also takes us to SFT.12. > > > > But in this case, the offset we computed from 'p_3->x.y.b[3]' is > > *already* the right offset, so adding the offset of SFT.12 to it gives > > us the wrong offset. So, what this patch does is determine the nesting > > level of each SFT and the nesting level of the memory expression. We > > only need to adjust the offset of an SFT if the nesting level of the > > memory expression is lower than the nesting level of the SFT that we are > > analyzing. > > > > > > 2- The second problem we have with this scheme is that we rely on the > > presence of key SFTs to adjust offsets when we detect references to > > nested aggregates. This is something we cannot easily undo because > > inside the operand scanner we cannot really start doing use-def chain > > chasing to determine the original pointed-to objects. This work has > > already been done by alias analysis, so duplicating this in the operand > > scanner would be wasteful. > > > > This means that this scheme cannot tolerate these key SFTs to be > > partitioned away. If they are partitioned, then inside the operand > > scanner is essentially impossible to determine where in the structure is > > an arbitrary memory expression referring to. This is shown in the new > > test gcc.dg/tree-ssa/alias-16.c. > > > > One way to deal with this would be to use the pointed-to set for the > > base pointer instead of the alias set of the name tag. This has the > > problem that we'd miss the compile-time benefits of partitioning, as we > > would be traversing the original sets instead of the partitioned > > (smaller) alias sets. Also, we may want to get rid of the duplicate > > pointed-to/alias sets in the future (not sure if they're both worth > > keeping). > > > > The other way of dealing with this is to recognize which SFTs are > > important to not partition and make sure they are never partitioned. > > During alias analysis, if a nested SFT is added to the alias set of a > > pointer, then it is marked as unpartitionable. The partitioner, in > > turn, gives these SFTs a very high score and makes sure that they are > > never added to a partition. The effects of this are/should be > > negligible as it only involves a single SFT and nested structures. > > > > The new test alias-16.c tests for these edge cases. > > > > Tested on x86_64 and ppc64. Committed to trunk. > > Thanks. Now, you should be able to do all this without introducing > all the nesting-level stuff if you make all pointed-to SFTs unpartitionable. > > In fact, > > > - if (SFT_IN_NESTED_STRUCT (var)) > > + if (full_ref > > + && SFT_NESTING_LEVEL (var) > 0 > > + && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var)) > > { > > /* Since VAR is an SFT inside a nested structure, the OFFSET > > computed by get_ref_base_and_extent is the offset from the > > - start of the immediately containing structure. However, to > > - find out what other SFTs are affected by this reference, we > > - need to know the offsets starting at the root structure in > > - the nesting hierarchy. > > + start of the immediately containing structure. If VAR is an > > + SFT inside a nested structure, then FULL_REF may be a > > + reference to the structure immediately enclosing SFT, and so > > + OFFSET will be the offset from the start of the immediately > > + enclosing structure. > > + > > + However, to find out what other SFTs are affected by this > > + reference, we need to know the offsets starting at the root > > + structure in the nesting hierarchy. > > > > For instance, given the following structure: > > This looks like a pure workaround for that you do not retain the offset > zero SFT as unpartitionable. Otherwise you'd enter this function with > it (if it is pointed-to), and you can always add the pointed-to SFT offset. > > I believe your patch makes it more twisted than it needs to be (it doesn't > even look like an optimization, as you are scanning for pointed-to offset > zero multiple times, for all pointed-to SFTs with a nesting level > > than the ref). > > And don't you still get > > struct X { > int i; > int a[4]; > } x; > > p = &x->a[0]; > return p[3]; > > wrong? You don't keep the SFT for a[0] from being partitioned away(?). > [it get's twisted to force GCC to generate IL with an ARRAY_REF here, but > with fortran it's possible I believe, so this is a theory example] Ok, I see that even for struct X { int i; int a[4]; }; the SFTs for i and a have a nesting level of 1 (that is, if pointed-to, they are disregarded for partitioning). So the case that is not fixed yet is only the case of a toplevel array: struct X { int i; int a[4]; } m; int a[4]; static inline void *fwd(void *p) { return p; } int foo(int b) { int (*p)[4] = b ? &a : &m.a; a[3] = 0; (*p)[3] = 1; return (*p)[3] + (*p)[2] + (*p)[1] + a[0] + a[3]; } extern void abort (void); int main() { int i; for (i = 0; i < 4; ++i) a[i] = 0; if (foo(1) != 2) abort (); return 0; } fails with at least -O --param max-aliased-vops=1 Because we partition away the SFT for &a[0]. I'm going to fix this by removing all the nesting level logic again and just mark all pointed-to SFTs as unpartitionable. That means, the following fixes this: Index: tree-ssa-structalias.c =================================================================== --- tree-ssa-structalias.c (revision 130174) +++ tree-ssa-structalias.c (working copy) @@ -4770,7 +4770,7 @@ set_uids_in_ptset (tree ptr, bitmap into this SFT because the operand scanner will not be able to find the other SFTs next to this one. */ - if (SFT_NESTING_LEVEL (sft) > 0) + if (1 || SFT_NESTING_LEVEL (sft) > 0) SFT_UNPARTITIONABLE_P (sft) = true; } } Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 12:36 ` Richard Guenther @ 2007-11-14 12:41 ` Richard Guenther 2007-11-14 13:01 ` Diego Novillo 0 siblings, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-14 12:41 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On Nov 14, 2007 11:55 AM, Richard Guenther <richard.guenther@gmail.com> wrote: > > On Nov 13, 2007 5:08 PM, Richard Guenther <richard.guenther@gmail.com> wrote: > > > > On Nov 13, 2007 4:44 PM, Diego Novillo <dnovillo@google.com> wrote: > > > Richard Guenther wrote: > > > > > > > taking the address of result.x.pDirty causes the SFT to be marked as > > > > in substructure and the problem re-surfaces. > > > > > > Yes. Thanks for the test case. The original patch did not take into > > > account that the nesting level of the memory reference is also important. > > > > > > There are two problems here: > > > > > > 1- Given an arbitrary memory expression M dereferencing a pointer P into > > > a nested structure, the offset computed for M is the offset starting at > > > P. However, the offsets computed for the original SFTs are offsets from > > > the top of the pointed-to memory object O. So, when we recognize this > > > situation, we must adjust the offsets. This was the situation > > > recognized and fixed by the original patch. > > > > > > The bug was in the detection of whether P is "pointing to the middle of > > > an object". This is where the nesting level comes in to play. Suppose > > > that M was the expression p_3->b[3] and the original pointed-to object > > > was v.x.y: > > > > > > struct { > > > ... > > > struct { > > > ... > > > struct { > > > ... > > > int b[4]; > > > } y; > > > } x; > > > } v; > > > > > > p_3 = &v.x.y; > > > > > > If alias analysis created SFTs for this structure, the array 'b' will > > > have 4 SFTs associated with it. Say SFT.12, SFT.13, SFT.14 and SFT.15. > > > However, only SFT.12 will be added to the alias set of p_3's name tag. > > > > > > So, when we see the expression 'p_3->b[3]', we compute an offset of 96, > > > but we need SFT.14 which will probably have an offset much higher than > > > that. So, we adjust 96 by the offset of SFT.12 and get to SFT.14. > > > > > > In the original patch, we recognized this situation by asking whether > > > the SFT.12 was inside a nested structure. However, it may happen that > > > the pointer was actually pointing to the base of the original object > > > 'v'. In which case, we would have the expression 'p_3->x.y.b[3]', which > > > also takes us to SFT.12. > > > > > > But in this case, the offset we computed from 'p_3->x.y.b[3]' is > > > *already* the right offset, so adding the offset of SFT.12 to it gives > > > us the wrong offset. So, what this patch does is determine the nesting > > > level of each SFT and the nesting level of the memory expression. We > > > only need to adjust the offset of an SFT if the nesting level of the > > > memory expression is lower than the nesting level of the SFT that we are > > > analyzing. > > > > > > > > > 2- The second problem we have with this scheme is that we rely on the > > > presence of key SFTs to adjust offsets when we detect references to > > > nested aggregates. This is something we cannot easily undo because > > > inside the operand scanner we cannot really start doing use-def chain > > > chasing to determine the original pointed-to objects. This work has > > > already been done by alias analysis, so duplicating this in the operand > > > scanner would be wasteful. > > > > > > This means that this scheme cannot tolerate these key SFTs to be > > > partitioned away. If they are partitioned, then inside the operand > > > scanner is essentially impossible to determine where in the structure is > > > an arbitrary memory expression referring to. This is shown in the new > > > test gcc.dg/tree-ssa/alias-16.c. > > > > > > One way to deal with this would be to use the pointed-to set for the > > > base pointer instead of the alias set of the name tag. This has the > > > problem that we'd miss the compile-time benefits of partitioning, as we > > > would be traversing the original sets instead of the partitioned > > > (smaller) alias sets. Also, we may want to get rid of the duplicate > > > pointed-to/alias sets in the future (not sure if they're both worth > > > keeping). > > > > > > The other way of dealing with this is to recognize which SFTs are > > > important to not partition and make sure they are never partitioned. > > > During alias analysis, if a nested SFT is added to the alias set of a > > > pointer, then it is marked as unpartitionable. The partitioner, in > > > turn, gives these SFTs a very high score and makes sure that they are > > > never added to a partition. The effects of this are/should be > > > negligible as it only involves a single SFT and nested structures. > > > > > > The new test alias-16.c tests for these edge cases. > > > > > > Tested on x86_64 and ppc64. Committed to trunk. > > > > Thanks. Now, you should be able to do all this without introducing > > all the nesting-level stuff if you make all pointed-to SFTs unpartitionable. > > > > In fact, > > > > > - if (SFT_IN_NESTED_STRUCT (var)) > > > + if (full_ref > > > + && SFT_NESTING_LEVEL (var) > 0 > > > + && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var)) > > > { > > > /* Since VAR is an SFT inside a nested structure, the OFFSET > > > computed by get_ref_base_and_extent is the offset from the > > > - start of the immediately containing structure. However, to > > > - find out what other SFTs are affected by this reference, we > > > - need to know the offsets starting at the root structure in > > > - the nesting hierarchy. > > > + start of the immediately containing structure. If VAR is an > > > + SFT inside a nested structure, then FULL_REF may be a > > > + reference to the structure immediately enclosing SFT, and so > > > + OFFSET will be the offset from the start of the immediately > > > + enclosing structure. > > > + > > > + However, to find out what other SFTs are affected by this > > > + reference, we need to know the offsets starting at the root > > > + structure in the nesting hierarchy. > > > > > > For instance, given the following structure: > > > > This looks like a pure workaround for that you do not retain the offset > > zero SFT as unpartitionable. Otherwise you'd enter this function with > > it (if it is pointed-to), and you can always add the pointed-to SFT offset. > > > > I believe your patch makes it more twisted than it needs to be (it doesn't > > even look like an optimization, as you are scanning for pointed-to offset > > zero multiple times, for all pointed-to SFTs with a nesting level > > > than the ref). Oh, and as you basically go back towards my first fix that was reverted this patch causes compile-time regressions and we hit internal compiler error: in ssa_operand_alloc, at tree-ssa-operands.c:484 more often as well. (I believe the latter is because the max-aliased-vops limit is a per-function one while a per-stmt one would be more useful, though I see that it is difficult to estimate per-stmt VOP reduction during partitioning. But as we do the limit per-function we are not efficient in preventing this ICE) Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 12:41 ` Richard Guenther @ 2007-11-14 13:01 ` Diego Novillo 2007-11-14 13:18 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Diego Novillo @ 2007-11-14 13:01 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches Richard Guenther wrote: > Oh, and as you basically go back towards my first fix that was reverted this > patch causes compile-time regressions and we hit > internal compiler error: in ssa_operand_alloc, at tree-ssa-operands.c:484 > more often as well. You'll need to be more specific. We are simply limiting the first SFT of a nested struct out of the partitioner, instead of recursing into MPT elements during operand scanning. Where is the extra compile time coming from? Are we generating that many more virtual operands? It would help quite a bit if you'd open a PR with a testcase for me to reproduce this behaviour. Thanks. Diego. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 13:01 ` Diego Novillo @ 2007-11-14 13:18 ` Richard Guenther 2007-11-14 13:59 ` Diego Novillo 0 siblings, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-14 13:18 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On Nov 14, 2007 1:04 PM, Diego Novillo <dnovillo@google.com> wrote: > Richard Guenther wrote: > > > Oh, and as you basically go back towards my first fix that was reverted this > > patch causes compile-time regressions and we hit > > internal compiler error: in ssa_operand_alloc, at tree-ssa-operands.c:484 > > more often as well. > > You'll need to be more specific. We are simply limiting the first SFT > of a nested struct out of the partitioner, instead of recursing into MPT > elements during operand scanning. No, we are limiting all pointed-to SFTs of all structs (maybe this was not intended, but it is necessary) out of the partitioner. Which will effectively make the NMTs contain all pointed-to SFTs. My initial patch fixed the problem by recursing into MPTs that are contained in NMTs to look for SFTs in them. With my initial patch those MPTs could contain unrelated symbols, but clearly you are growing the numer of symbols in NMTs (the regression is not as bad as for my initial patch, but is the least regression we can have there if we want to fix the miscompilation this way). > Where is the extra compile time coming from? Are we generating that > many more virtual operands? It would help quite a bit if you'd open a > PR with a testcase for me to reproduce this behaviour. We enlarge the NMT aliased symbols. (And obviously we can now create aritifical testcases with any number of aliased SFTs remaining on a stmt, I'll come up with one of those for you) Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 13:18 ` Richard Guenther @ 2007-11-14 13:59 ` Diego Novillo 2007-11-14 14:07 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Diego Novillo @ 2007-11-14 13:59 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches Richard Guenther wrote: > No, we are limiting all pointed-to SFTs of all structs (maybe this was not > intended, but it is necessary) out of the partitioner. *All* of them? That's a bug. Only the first one inside a nested structure should be limited. > but clearly you are growing the numer of symbols in NMTs > (the regression is not as bad as for my initial patch, but is the least > regression we can have there if we want to fix the miscompilation this way). Yes, but we should only be adding as many SFTs as nested structures we have. So, to give you a ridiculous example, if we had a structure with 300 fields and 2 nested structures, we should only have 2 unpartitionable SFTs (the first one of each nested structure). > We enlarge the NMT aliased symbols. (And obviously we can now create > aritifical testcases with any number of aliased SFTs remaining on a stmt, > I'll come up with one of those for you) Thanks. The secret here is going to be one where we are marking way too many SFTs as unpartitionable. My contention is that we shouldn't be doing this too often. If we are, that's the bug. Thanks. Diego. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 13:59 ` Diego Novillo @ 2007-11-14 14:07 ` Richard Guenther 2007-11-14 14:09 ` Diego Novillo 0 siblings, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-14 14:07 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On Nov 14, 2007 1:41 PM, Diego Novillo <dnovillo@google.com> wrote: > Richard Guenther wrote: > > > No, we are limiting all pointed-to SFTs of all structs (maybe this was not > > intended, but it is necessary) out of the partitioner. > > *All* of them? That's a bug. Only the first one inside a nested > structure should be limited. Yes, for struct X { int i; } m; the SFT for &m.i; has nesting level 1. And this is required - see the example with the toplevel array that is still failing. > > but clearly you are growing the numer of symbols in NMTs > > (the regression is not as bad as for my initial patch, but is the least > > regression we can have there if we want to fix the miscompilation this way). > > Yes, but we should only be adding as many SFTs as nested structures we > have. So, to give you a ridiculous example, if we had a structure with > 300 fields and 2 nested structures, we should only have 2 > unpartitionable SFTs (the first one of each nested structure). Huh? No, all pointed-to SFTs are unpartitionable. So, struct X { struct Y { int i1; int i2; ... &i1; &i2; ... makes all SFTs that belong to i1, i2, etc. unpartitionable. At least that's how your code works ;) I guess in add_uids_for_ptset instead of ! if (SFT_NESTING_LEVEL (sft) > 0) ! SFT_UNPARTITIONABLE_P (sft) = true; we can get away with only doing that if the type of the SFT itself could have subvariables (which is probably again what you intended?). > > We enlarge the NMT aliased symbols. (And obviously we can now create > > aritifical testcases with any number of aliased SFTs remaining on a stmt, > > I'll come up with one of those for you) > > Thanks. The secret here is going to be one where we are marking way too > many SFTs as unpartitionable. My contention is that we shouldn't be > doing this too often. If we are, that's the bug. Probably the latter then. Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 14:07 ` Richard Guenther @ 2007-11-14 14:09 ` Diego Novillo 2007-11-14 14:22 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Diego Novillo @ 2007-11-14 14:09 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches Richard Guenther wrote: > I guess in add_uids_for_ptset instead of > > ! if (SFT_NESTING_LEVEL (sft) > 0) > ! SFT_UNPARTITIONABLE_P (sft) = true; > > we can get away with only doing that if the type of the SFT itself could > have subvariables (which is probably again what you intended?). Hmm, well the SFTs themselves are usually of scalar type. The first SFT of an int[] array is of type int. Yes, the bug is around here. Perhaps the easiest way to fix this is to mark the variable earlier when we are pushing fields in the fieldstack (in the code that computes the nesting level). And then transport that marker into the SFT here. Diego. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 14:09 ` Diego Novillo @ 2007-11-14 14:22 ` Richard Guenther 2007-11-14 15:29 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-14 14:22 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches, Richard Guenther On Nov 14, 2007 1:56 PM, Diego Novillo <dnovillo@google.com> wrote: > Richard Guenther wrote: > > > I guess in add_uids_for_ptset instead of > > > > ! if (SFT_NESTING_LEVEL (sft) > 0) > > ! SFT_UNPARTITIONABLE_P (sft) = true; > > > > we can get away with only doing that if the type of the SFT itself could > > have subvariables (which is probably again what you intended?). > > Hmm, well the SFTs themselves are usually of scalar type. The first SFT > of an int[] array is of type int. Yes, the bug is around here. Perhaps > the easiest way to fix this is to mark the variable earlier when we are > pushing fields in the fieldstack (in the code that computes the nesting > level). And then transport that marker into the SFT here. Yes, I'm doing that at the moment. Stay tuned for a patch. Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 14:22 ` Richard Guenther @ 2007-11-14 15:29 ` Richard Guenther 2007-11-14 15:30 ` Diego Novillo 0 siblings, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-14 15:29 UTC (permalink / raw) To: gcc-patches; +Cc: Diego Novillo On Wed, 14 Nov 2007, Richard Guenther wrote: > On Nov 14, 2007 1:56 PM, Diego Novillo <dnovillo@google.com> wrote: > > Richard Guenther wrote: > > > > > I guess in add_uids_for_ptset instead of > > > > > > ! if (SFT_NESTING_LEVEL (sft) > 0) > > > ! SFT_UNPARTITIONABLE_P (sft) = true; > > > > > > we can get away with only doing that if the type of the SFT itself could > > > have subvariables (which is probably again what you intended?). > > > > Hmm, well the SFTs themselves are usually of scalar type. The first SFT > > of an int[] array is of type int. Yes, the bug is around here. Perhaps > > the easiest way to fix this is to mark the variable earlier when we are > > pushing fields in the fieldstack (in the code that computes the nesting > > level). And then transport that marker into the SFT here. > > Yes, I'm doing that at the moment. Stay tuned for a patch. Like this one. It does a few things though: 1) Remove the nesting level stuff 2) Mark SFTs that can be used as a base for further sub-references (ARRAY_REFs, REAL/IMAGPART, COMPONENT_REF, etc.) 3) Mark all pointed-to SFTs that can be used as a base for sub-references as not partitionable 4) Fix a bug in TBAA pruning (without that we TBAA prune the pointed-to SFT (of type unsigned int) from the NMT as unsigned int does not alias with struct PgHdr *. We were wrongly using the pointer alias set instead of the pointed-to alias set for the pruning. This should reduce the compile-time regression we are seeing at the moment. Bootstrap and regtest on x86_64-unknown-linux-gnu in progress, I intend to commit this if that (as expected) finishes successfully. Richard. 2007-11-14 Richard Guenther <rguenther@suse.de> * tree.h (struct tree_memory_tag): Add base_for_components flag. (struct tree_struct_field_tag): Remove nesting_level field. (SFT_NESTING_LEVEL): Remove. (SFT_BASE_FOR_COMPONENTS_P): Add. * tree-flow.h (struct fieldoff): Remove nesting_level field. Add base_for_components flag. (push_fields_onto_fieldstack): Remove nesting_level parameter. * tree-ssa-alias.c (create_sft): Likewise. Add base_for_components parameter. (create_overlap_variables_for): Deal with it. * tree-dfa.c (dump_subvars_for): Likewise. (dump_variable): Likewise. * tree-ssa-structalias.c (push_fields_onto_fieldstack): Likewise. Set base_for_components for first elements of sub-structures. (create_variable_info_for): Handle base_for_components. (set_uids_in_ptset): Always set SFT_UNPARTITIONABLE_P for pointed-to SFTs if SFT_BASE_FOR_COMPONENTS_P is set. Use the correct alias set to do TBAA pruning. (find_what_p_points_to): Only call set_uids_in_ptset for pointer variables. * tree-ssa-operands.c (ref_nesting_level): Remove. (add_vars_for_offset): Remove full_ref parameter, always add the offset of the pointed-to SFT. (add_virtual_operand): Adjust for changed signature of add_vars_for_offset. * gcc.dg/torture/pr33870.c: New testcase. Index: tree.h =================================================================== *** tree.h (revision 130174) --- tree.h (working copy) *************** struct tree_memory_tag GTY(()) *** 2557,2562 **** --- 2557,2565 ---- /* True if this tag has global scope. */ unsigned int is_global : 1; + /* True if this tag can be a base for component accesses. */ + unsigned int base_for_components : 1; + /* True if this tag should not be grouped into a memory partition. */ unsigned int unpartitionable : 1; }; *************** struct tree_struct_field_tag GTY(()) *** 2579,2601 **** /* Alias set for a DECL_NONADDRESSABLE_P field. Otherwise -1. */ alias_set_type alias_set; - - /* Nesting level for this subvariable. This indicates how many - structures are wrapping this field. Fields at the top level have - a nesting level of 0. */ - unsigned int nesting_level; }; - #define SFT_PARENT_VAR(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.parent_var) #define SFT_OFFSET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.offset) #define SFT_SIZE(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.size) #define SFT_NONADDRESSABLE_P(NODE) \ (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1) #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set) - #define SFT_NESTING_LEVEL(NODE) \ - (STRUCT_FIELD_TAG_CHECK (NODE)->sft.nesting_level) #define SFT_UNPARTITIONABLE_P(NODE) \ (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.unpartitionable) /* Memory Partition Tags (MPTs) group memory symbols under one common name for the purposes of placing memory PHI nodes. */ --- 2582,2598 ---- /* Alias set for a DECL_NONADDRESSABLE_P field. Otherwise -1. */ alias_set_type alias_set; }; #define SFT_PARENT_VAR(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.parent_var) #define SFT_OFFSET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.offset) #define SFT_SIZE(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.size) #define SFT_NONADDRESSABLE_P(NODE) \ (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set != -1) #define SFT_ALIAS_SET(NODE) (STRUCT_FIELD_TAG_CHECK (NODE)->sft.alias_set) #define SFT_UNPARTITIONABLE_P(NODE) \ (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.unpartitionable) + #define SFT_BASE_FOR_COMPONENTS_P(NODE) \ + (STRUCT_FIELD_TAG_CHECK (NODE)->sft.common.base_for_components) /* Memory Partition Tags (MPTs) group memory symbols under one common name for the purposes of placing memory PHI nodes. */ Index: testsuite/gcc.dg/torture/pr33870.c =================================================================== *** testsuite/gcc.dg/torture/pr33870.c (revision 0) --- testsuite/gcc.dg/torture/pr33870.c (revision 0) *************** *** 0 **** --- 1,30 ---- + /* { dg-do run } */ + /* { dg-options "--param max-aliased-vops=1" } */ + + struct X { + int i; + int a[4]; + } m; + + int a[4]; + + int __attribute__((noinline)) foo(int b) + { + int (*p)[4] = b ? &a : &m.a; + a[3] = 0; + (*p)[3] = 1; + return (*p)[3] + (*p)[2] + (*p)[1] + a[0] + a[3]; + } + + extern void abort (void); + + int main() + { + int i; + for (i = 0; i < 4; ++i) + a[i] = 0; + if (foo(1) != 2) + abort (); + return 0; + } + Index: tree-ssa-alias.c =================================================================== *** tree-ssa-alias.c (revision 130174) --- tree-ssa-alias.c (working copy) *************** get_or_create_used_part_for (size_t uid) *** 3791,3797 **** static tree create_sft (tree var, tree field, unsigned HOST_WIDE_INT offset, unsigned HOST_WIDE_INT size, alias_set_type alias_set, ! unsigned nesting_level) { tree subvar = create_tag_raw (STRUCT_FIELD_TAG, field, "SFT"); --- 3791,3797 ---- static tree create_sft (tree var, tree field, unsigned HOST_WIDE_INT offset, unsigned HOST_WIDE_INT size, alias_set_type alias_set, ! bool base_for_components) { tree subvar = create_tag_raw (STRUCT_FIELD_TAG, field, "SFT"); *************** create_sft (tree var, tree field, unsign *** 3811,3817 **** SFT_OFFSET (subvar) = offset; SFT_SIZE (subvar) = size; SFT_ALIAS_SET (subvar) = alias_set; ! SFT_NESTING_LEVEL (subvar) = nesting_level; return subvar; } --- 3811,3818 ---- SFT_OFFSET (subvar) = offset; SFT_SIZE (subvar) = size; SFT_ALIAS_SET (subvar) = alias_set; ! SFT_BASE_FOR_COMPONENTS_P (subvar) = base_for_components; ! SFT_UNPARTITIONABLE_P (subvar) = false; return subvar; } *************** create_overlap_variables_for (tree var) *** 3833,3839 **** return; push_fields_onto_fieldstack (TREE_TYPE (var), &fieldstack, 0, NULL, ! TREE_TYPE (var), 0); if (VEC_length (fieldoff_s, fieldstack) != 0) { subvar_t *subvars; --- 3834,3840 ---- return; push_fields_onto_fieldstack (TREE_TYPE (var), &fieldstack, 0, NULL, ! TREE_TYPE (var)); if (VEC_length (fieldoff_s, fieldstack) != 0) { subvar_t *subvars; *************** create_overlap_variables_for (tree var) *** 3916,3921 **** --- 3917,3923 ---- field, skip it. Note that we always need the field at offset 0 so we can properly handle pointers to the structure. */ + if ((fo->offset != 0 && ((fo->offset <= up->minused && fo->offset + fosize <= up->minused) *************** create_overlap_variables_for (tree var) *** 3924,3932 **** && fosize == lastfosize && currfotype == lastfotype)) continue; ! ! subvar = create_sft (var, fo->type, fo->offset, fosize, ! fo->alias_set, fo->nesting_level); VEC_quick_push (tree, *subvars, subvar); if (dump_file) --- 3926,3933 ---- && fosize == lastfosize && currfotype == lastfotype)) continue; ! subvar = create_sft (var, fo->type, fo->offset, ! fosize, fo->alias_set, fo->base_for_components); VEC_quick_push (tree, *subvars, subvar); if (dump_file) *************** create_overlap_variables_for (tree var) *** 3937,3944 **** SFT_OFFSET (subvar)); fprintf (dump_file, " size " HOST_WIDE_INT_PRINT_DEC, SFT_SIZE (subvar)); ! fprintf (dump_file, " nesting level %d\n", ! SFT_NESTING_LEVEL (subvar)); } lastfotype = currfotype; --- 3938,3944 ---- SFT_OFFSET (subvar)); fprintf (dump_file, " size " HOST_WIDE_INT_PRINT_DEC, SFT_SIZE (subvar)); ! fprintf (dump_file, "\n"); } lastfotype = currfotype; Index: tree-dfa.c =================================================================== *** tree-dfa.c (revision 130174) --- tree-dfa.c (working copy) *************** dump_subvars_for (FILE *file, tree var) *** 287,294 **** for (i = 0; VEC_iterate (tree, sv, i, subvar); ++i) { print_generic_expr (file, subvar, dump_flags); fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED, SFT_OFFSET (subvar)); - fprintf (file, "[%u]", SFT_NESTING_LEVEL (subvar)); fprintf (file, " "); } --- 287,295 ---- for (i = 0; VEC_iterate (tree, sv, i, subvar); ++i) { print_generic_expr (file, subvar, dump_flags); + if (SFT_BASE_FOR_COMPONENTS_P (subvar)) + fprintf (file, "@"); fprintf (file, "@" HOST_WIDE_INT_PRINT_UNSIGNED, SFT_OFFSET (subvar)); fprintf (file, " "); } *************** dump_variable (FILE *file, tree var) *** 424,430 **** { fprintf (file, ", offset: " HOST_WIDE_INT_PRINT_UNSIGNED, SFT_OFFSET (var)); ! fprintf (file, ", nesting: %u", SFT_NESTING_LEVEL (var)); fprintf (file, ", partitionable: %s", SFT_UNPARTITIONABLE_P (var) ? "NO" : "YES"); } --- 425,432 ---- { fprintf (file, ", offset: " HOST_WIDE_INT_PRINT_UNSIGNED, SFT_OFFSET (var)); ! fprintf (file, ", base for components: %s", ! SFT_BASE_FOR_COMPONENTS_P (var) ? "NO" : "YES"); fprintf (file, ", partitionable: %s", SFT_UNPARTITIONABLE_P (var) ? "NO" : "YES"); } Index: tree-flow.h =================================================================== *** tree-flow.h (revision 130174) --- tree-flow.h (working copy) *************** struct fieldoff *** 1159,1180 **** /* Field. */ tree decl; - /* Nesting level. This number represents how many structures are - wrapping this field. */ - unsigned nesting_level; - /* Offset from the base of the base containing object to this field. */ HOST_WIDE_INT offset; /* Alias set for the field. */ alias_set_type alias_set; }; typedef struct fieldoff fieldoff_s; DEF_VEC_O(fieldoff_s); DEF_VEC_ALLOC_O(fieldoff_s,heap); ! int push_fields_onto_fieldstack (tree, VEC(fieldoff_s,heap) **, HOST_WIDE_INT, ! bool *, tree, unsigned); void sort_fieldstack (VEC(fieldoff_s,heap) *); void init_alias_heapvars (void); --- 1159,1179 ---- /* Field. */ tree decl; /* Offset from the base of the base containing object to this field. */ HOST_WIDE_INT offset; /* Alias set for the field. */ alias_set_type alias_set; + + /* True, if this offset can be a base for further component accesses. */ + unsigned base_for_components : 1; }; typedef struct fieldoff fieldoff_s; DEF_VEC_O(fieldoff_s); DEF_VEC_ALLOC_O(fieldoff_s,heap); ! int push_fields_onto_fieldstack (tree, VEC(fieldoff_s,heap) **, ! HOST_WIDE_INT, bool *, tree); void sort_fieldstack (VEC(fieldoff_s,heap) *); void init_alias_heapvars (void); Index: tree-ssa-structalias.c =================================================================== *** tree-ssa-structalias.c (revision 130174) --- tree-ssa-structalias.c (working copy) *************** sort_fieldstack (VEC(fieldoff_s,heap) *f *** 4053,4071 **** TYPE. ADDRESSABLE_TYPE is the type of the outermost object that could ! have its address taken. ! ! NESTING_LEVEL indicates whether TYPE is a structure nested inside ! another, it starts at 0 and it is incremented by one on every ! structure recursed into. */ int push_fields_onto_fieldstack (tree type, VEC(fieldoff_s,heap) **fieldstack, HOST_WIDE_INT offset, bool *has_union, ! tree addressable_type, unsigned nesting_level) { tree field; int count = 0; if (TREE_CODE (type) == COMPLEX_TYPE) { --- 4053,4068 ---- TYPE. ADDRESSABLE_TYPE is the type of the outermost object that could ! have its address taken. */ int push_fields_onto_fieldstack (tree type, VEC(fieldoff_s,heap) **fieldstack, HOST_WIDE_INT offset, bool *has_union, ! tree addressable_type) { tree field; int count = 0; + int first_element = VEC_length (fieldoff_s, *fieldstack); if (TREE_CODE (type) == COMPLEX_TYPE) { *************** push_fields_onto_fieldstack (tree type, *** 4076,4081 **** --- 4073,4079 ---- real_part->offset = offset; real_part->decl = NULL_TREE; real_part->alias_set = -1; + real_part->base_for_components = false; img_part = VEC_safe_push (fieldoff_s, heap, *fieldstack, NULL); img_part->type = TREE_TYPE (type); *************** push_fields_onto_fieldstack (tree type, *** 4083,4093 **** img_part->offset = offset + TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (type))); img_part->decl = NULL_TREE; img_part->alias_set = -1; ! return 2; } ! if (TREE_CODE (type) == ARRAY_TYPE) { tree sz = TYPE_SIZE (type); tree elsz = TYPE_SIZE (TREE_TYPE (type)); --- 4081,4092 ---- img_part->offset = offset + TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (type))); img_part->decl = NULL_TREE; img_part->alias_set = -1; + img_part->base_for_components = false; ! count = 2; } ! else if (TREE_CODE (type) == ARRAY_TYPE) { tree sz = TYPE_SIZE (type); tree elsz = TYPE_SIZE (TREE_TYPE (type)); *************** push_fields_onto_fieldstack (tree type, *** 4125,4132 **** has_union, (TYPE_NONALIASED_COMPONENT (type) ? addressable_type ! : TREE_TYPE (type)), ! nesting_level + 1))) /* Empty structures may have actual size, like in C++. So see if we didn't push any subfields and the size is nonzero, push the field onto the stack */ --- 4124,4130 ---- has_union, (TYPE_NONALIASED_COMPONENT (type) ? addressable_type ! : TREE_TYPE (type))))) /* Empty structures may have actual size, like in C++. So see if we didn't push any subfields and the size is nonzero, push the field onto the stack */ *************** push_fields_onto_fieldstack (tree type, *** 4145,4160 **** pair->alias_set = get_alias_set (addressable_type); else pair->alias_set = -1; ! pair->nesting_level = nesting_level; count++; } else count += pushed; } - - return count; } for (field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field)) if (TREE_CODE (field) == FIELD_DECL) { --- 4143,4158 ---- pair->alias_set = get_alias_set (addressable_type); else pair->alias_set = -1; ! pair->base_for_components = false; count++; } else count += pushed; } } + else + { for (field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field)) if (TREE_CODE (field) == FIELD_DECL) { *************** push_fields_onto_fieldstack (tree type, *** 4175,4182 **** has_union, (DECL_NONADDRESSABLE_P (field) ? addressable_type ! : TREE_TYPE (field)), ! nesting_level + 1)) && DECL_SIZE (field) && !integer_zerop (DECL_SIZE (field))) /* Empty structures may have actual size, like in C++. So --- 4173,4179 ---- has_union, (DECL_NONADDRESSABLE_P (field) ? addressable_type ! : TREE_TYPE (field)))) && DECL_SIZE (field) && !integer_zerop (DECL_SIZE (field))) /* Empty structures may have actual size, like in C++. So *************** push_fields_onto_fieldstack (tree type, *** 4197,4208 **** pair->alias_set = get_alias_set (addressable_type); else pair->alias_set = -1; ! pair->nesting_level = nesting_level; count++; } else count += pushed; } return count; } --- 4194,4211 ---- pair->alias_set = get_alias_set (addressable_type); else pair->alias_set = -1; ! pair->base_for_components = false; count++; } else count += pushed; } + } + + /* Make sure the first pushed field is marked as eligible for + being a base for component references. */ + if (count > 0) + VEC_index (fieldoff_s, *fieldstack, first_element)->base_for_components = true; return count; } *************** create_variable_info_for (tree decl, con *** 4397,4403 **** if (var_can_have_subvars (decl) && use_field_sensitive && !hasunion) { push_fields_onto_fieldstack (decltype, &fieldstack, 0, &hasunion, ! decltype, 0); if (hasunion) { VEC_free (fieldoff_s, heap, fieldstack); --- 4400,4406 ---- if (var_can_have_subvars (decl) && use_field_sensitive && !hasunion) { push_fields_onto_fieldstack (decltype, &fieldstack, 0, &hasunion, ! decltype); if (hasunion) { VEC_free (fieldoff_s, heap, fieldstack); *************** set_uids_in_ptset (tree ptr, bitmap into *** 4719,4725 **** { unsigned int i; bitmap_iterator bi; ! alias_set_type ptr_alias_set = get_alias_set (TREE_TYPE (ptr)); EXECUTE_IF_SET_IN_BITMAP (from, 0, i, bi) { --- 4722,4728 ---- { unsigned int i; bitmap_iterator bi; ! alias_set_type ptr_alias_set = get_alias_set (TREE_TYPE (TREE_TYPE (ptr))); EXECUTE_IF_SET_IN_BITMAP (from, 0, i, bi) { *************** set_uids_in_ptset (tree ptr, bitmap into *** 4762,4776 **** { bitmap_set_bit (into, DECL_UID (sft)); ! /* If SFT is inside a nested structure, it will ! be needed by the operand scanner to adjust ! offsets when adding operands to memory expressions that dereference PTR. This means that memory partitioning may not partition this SFT because the operand scanner will not be able to find the other SFTs next to this ! one. */ ! if (SFT_NESTING_LEVEL (sft) > 0) SFT_UNPARTITIONABLE_P (sft) = true; } } --- 4765,4779 ---- { bitmap_set_bit (into, DECL_UID (sft)); ! /* Pointed-to SFTs are needed by the operand scanner ! to adjust offsets when adding operands to memory expressions that dereference PTR. This means that memory partitioning may not partition this SFT because the operand scanner will not be able to find the other SFTs next to this ! one. But we only need to do this if the pointed ! to type is aggregate. */ ! if (SFT_BASE_FOR_COMPONENTS_P (sft)) SFT_UNPARTITIONABLE_P (sft) = true; } } *************** find_what_p_points_to (tree p) *** 4996,5004 **** pi->pt_global_mem = 1; } ! set_uids_in_ptset (vi->decl, finished_solution, vi->solution, ! vi->directly_dereferenced, ! vi->no_tbaa_pruning); result = shared_bitmap_lookup (finished_solution); if (!result) --- 4999,5008 ---- pi->pt_global_mem = 1; } ! if (POINTER_TYPE_P (TREE_TYPE (vi->decl))) ! set_uids_in_ptset (vi->decl, finished_solution, vi->solution, ! vi->directly_dereferenced, ! vi->no_tbaa_pruning); result = shared_bitmap_lookup (finished_solution); if (!result) Index: tree-ssa-operands.c =================================================================== *** tree-ssa-operands.c (revision 130174) --- tree-ssa-operands.c (working copy) *************** access_can_touch_variable (tree ref, tre *** 1367,1399 **** return true; } - - /* Given an aggregate expression FULL_REF, return the number of - aggregates that are containing FULL_REF. So, given a structure - reference a.b.c.d, the nesting level for this expression is 2 (the - number of '.' in the expression minus 1). */ - - static unsigned - ref_nesting_level (tree full_ref) - { - unsigned nesting_level = 0; - - if (!handled_component_p (full_ref)) - return 0; - - full_ref = TREE_OPERAND (full_ref, 0); - while (handled_component_p (full_ref)) - { - nesting_level++; - full_ref = TREE_OPERAND (full_ref, 0); - } - - return nesting_level; - } - - /* Add the actual variables FULL_REF can access, given a member of ! FULL_REF's points-to set VAR, where FULL_REF is an access of SIZE at OFFSET from var. IS_CALL_SITE is true if this is a call, and IS_DEF is true if this is supposed to be a vdef, and false if this should be a VUSE. --- 1367,1374 ---- return true; } /* Add the actual variables FULL_REF can access, given a member of ! full_ref's points-to set VAR, where FULL_REF is an access of SIZE at OFFSET from var. IS_CALL_SITE is true if this is a call, and IS_DEF is true if this is supposed to be a vdef, and false if this should be a VUSE. *************** ref_nesting_level (tree full_ref) *** 1411,1422 **** This is necessary because foop only actually points to foo's first member, so that is all the points-to set contains. However, an access to foop->a may be touching some single SFT if we have created some ! SFT's for a structure. ! ! FULL_REF is the original memory expression being analyzed. */ static bool ! add_vars_for_offset (tree full_ref, tree var, unsigned HOST_WIDE_INT offset, unsigned HOST_WIDE_INT size, bool is_def) { bool added = false; --- 1386,1395 ---- This is necessary because foop only actually points to foo's first member, so that is all the points-to set contains. However, an access to foop->a may be touching some single SFT if we have created some ! SFT's for a structure. */ static bool ! add_vars_for_offset (tree var, unsigned HOST_WIDE_INT offset, unsigned HOST_WIDE_INT size, bool is_def) { bool added = false; *************** add_vars_for_offset (tree full_ref, tree *** 1424,1478 **** subvar_t sv; unsigned int i; ! if (full_ref ! && SFT_NESTING_LEVEL (var) > 0 ! && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var)) ! { ! /* Since VAR is an SFT inside a nested structure, the OFFSET ! computed by get_ref_base_and_extent is the offset from the ! start of the immediately containing structure. If VAR is an ! SFT inside a nested structure, then FULL_REF may be a ! reference to the structure immediately enclosing SFT, and so ! OFFSET will be the offset from the start of the immediately ! enclosing structure. ! ! However, to find out what other SFTs are affected by this ! reference, we need to know the offsets starting at the root ! structure in the nesting hierarchy. ! ! For instance, given the following structure: ! ! struct X { ! int a; ! struct Y { ! int b; ! struct Z { ! int c[3]; ! } d; ! } e; ! } m; ! ! and the following address expression: ! ! p_1 = &m.e.d; ! ! This structure will receive 5 SFTs, namely 2 for fields 'a' ! and 'b' and 3 for the array 'c' in struct Z. So, the ! reference p_1->c[2] and m.e.d.c[2] access the exact same ! memory location (ie, SFT.5). ! ! Now, alias analysis computed the points-to set for pointer ! p_1 as { SFT.3 } because that is the first field that p_1 ! actually points to. When the expression p_1->c[2] is ! analyzed, get_ref_base_and_extent will return an offset of 96 ! because we are accessing the third element of the array. But ! the SFT we are looking for is actually at offset 160, ! counting from the top of struct X. ! ! Therefore, we adjust OFFSET by the offset of VAR so that we ! can get at all the fields starting at VAR. */ ! offset += SFT_OFFSET (var); ! } /* Add all subvars of var that overlap with the access. Binary search for the first relevant SFT. */ --- 1397,1404 ---- subvar_t sv; unsigned int i; ! /* Adjust offset by the pointed-to location. */ ! offset += SFT_OFFSET (var); /* Add all subvars of var that overlap with the access. Binary search for the first relevant SFT. */ *************** add_virtual_operand (tree var, stmt_ann_ *** 1575,1581 **** if it is a potential points-to location. */ if (TREE_CODE (al) == STRUCT_FIELD_TAG && TREE_CODE (var) == NAME_MEMORY_TAG) ! none_added &= !add_vars_for_offset (full_ref, al, offset, size, flags & opf_def); else { --- 1501,1507 ---- if it is a potential points-to location. */ if (TREE_CODE (al) == STRUCT_FIELD_TAG && TREE_CODE (var) == NAME_MEMORY_TAG) ! none_added &= !add_vars_for_offset (al, offset, size, flags & opf_def); else { ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 15:29 ` Richard Guenther @ 2007-11-14 15:30 ` Diego Novillo 2007-11-15 13:55 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Diego Novillo @ 2007-11-14 15:30 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches Richard Guenther wrote: > Bootstrap and regtest on x86_64-unknown-linux-gnu in progress, I intend > to commit this if that (as expected) finishes successfully. No, let me study this first. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-14 15:30 ` Diego Novillo @ 2007-11-15 13:55 ` Richard Guenther 2007-11-15 15:47 ` Richard Guenther 0 siblings, 1 reply; 18+ messages in thread From: Richard Guenther @ 2007-11-15 13:55 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On Wed, 14 Nov 2007, Diego Novillo wrote: > Richard Guenther wrote: > > > Bootstrap and regtest on x86_64-unknown-linux-gnu in progress, I intend > > to commit this if that (as expected) finishes successfully. > > No, let me study this first. We can also add one additional optimization and sanity check. In add_virtual_operand instead of dispatching to add_vars_for_offset for every SFT in the NMT just do so for those that have SFT_BASE_FOR_COMPONENTS_P set. All others can (should) be only directly dereferenced which means offset has to be zero. If not, we should visit them during one of the add_vars_for_offset calls. This avoids redundant calls to append_vuse/vdef (so it's merely a compile-time improvement): /* For SFTs we have to consider all subvariables of the parent var if it is a potential points-to location. */ if (TREE_CODE (al) == STRUCT_FIELD_TAG && TREE_CODE (var) == NAME_MEMORY_TAG) { if (SFT_BASE_FOR_COMPONENTS_P (al)) none_added &= !add_vars_for_offset (al, offset, size, flags & opf_def); else if (offset == 0) { /* If this is a SFT that cannot be used as a base for component references, we only need to consider them if offset is zero. Otherwise if it is aliased via a component ref we will visit it during processing of a SFT_BASE_FOR_COMPONENTS_P pointed-to SFT. In this case just add this SFT as a possible direct dereference target. */ if (flags & opf_def) append_vdef (al); else append_vuse (al); none_added = false; } I guess we don't want to do this as part of the initial patch, but I'll check the compile-time effects if the above works out. Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Fix PR 33870 2007-11-15 13:55 ` Richard Guenther @ 2007-11-15 15:47 ` Richard Guenther 0 siblings, 0 replies; 18+ messages in thread From: Richard Guenther @ 2007-11-15 15:47 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc-patches On Thu, 15 Nov 2007, Richard Guenther wrote: > On Wed, 14 Nov 2007, Diego Novillo wrote: > > > Richard Guenther wrote: > > > > > Bootstrap and regtest on x86_64-unknown-linux-gnu in progress, I intend > > > to commit this if that (as expected) finishes successfully. > > > > No, let me study this first. > > We can also add one additional optimization and sanity check. In > add_virtual_operand instead of dispatching to add_vars_for_offset for > every SFT in the NMT just do so for those that have > SFT_BASE_FOR_COMPONENTS_P set. All others can (should) be only directly > dereferenced which means offset has to be zero. If not, we should > visit them during one of the add_vars_for_offset calls. This avoids > redundant calls to append_vuse/vdef (so it's merely a compile-time > improvement): > > /* For SFTs we have to consider all subvariables of the parent > var > if it is a potential points-to location. */ > if (TREE_CODE (al) == STRUCT_FIELD_TAG > && TREE_CODE (var) == NAME_MEMORY_TAG) > { > if (SFT_BASE_FOR_COMPONENTS_P (al)) > none_added &= !add_vars_for_offset (al, offset, size, > flags & opf_def); > else if (offset == 0) > { > /* If this is a SFT that cannot be used as a base for > component references, we only need to consider them > if offset is zero. Otherwise if it is aliased via > a component ref we will visit it during processing > of a SFT_BASE_FOR_COMPONENTS_P pointed-to SFT. > In this case just add this SFT as a possible direct > dereference target. */ > if (flags & opf_def) > append_vdef (al); > else > append_vuse (al); > none_added = false; > } > > I guess we don't want to do this as part of the initial patch, but I'll > check the compile-time effects if the above works out. Ok, that doesn't work. As we have cases where we can have sub-references on an SFT (notably in the case where we glob fields into a single SFT as we do for arrays with more than 4 elements), the offset of the access may be bigger than zero. Even if we only can reach the SFT itself we cannot avoid the append_vdef/vuse in any case, so the possible speed difference should be a wash. That is we can do if (TREE_CODE (al) == STRUCT_FIELD_TAG && TREE_CODE (var) == NAME_MEMORY_TAG) { if (SFT_BASE_FOR_COMPONENTS_P (al)) none_added &= !add_vars_for_offset (al, offset, size, flags & opf_def); else { /* If this is a SFT that cannot be used as a base for component references, we only need to consider them if offset is zero. Otherwise if it is aliased via a component ref we will visit it during processing of a SFT_BASE_FOR_COMPONENTS_P pointed-to SFT. In this case just add this SFT as a possible direct dereference target. */ if (flags & opf_def) append_vdef (al); else append_vuse (al); none_added = false; } Richard. ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2007-11-15 13:37 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-11-08 0:24 Fix PR 33870 Diego Novillo 2007-11-08 0:35 ` Jakub Jelinek 2007-11-08 2:20 ` Diego Novillo 2007-11-08 10:45 ` Richard Guenther 2007-11-13 16:30 ` Diego Novillo 2007-11-13 16:47 ` Richard Guenther 2007-11-14 12:36 ` Richard Guenther 2007-11-14 12:41 ` Richard Guenther 2007-11-14 13:01 ` Diego Novillo 2007-11-14 13:18 ` Richard Guenther 2007-11-14 13:59 ` Diego Novillo 2007-11-14 14:07 ` Richard Guenther 2007-11-14 14:09 ` Diego Novillo 2007-11-14 14:22 ` Richard Guenther 2007-11-14 15:29 ` Richard Guenther 2007-11-14 15:30 ` Diego Novillo 2007-11-15 13:55 ` Richard Guenther 2007-11-15 15:47 ` Richard Guenther
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).