* [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible @ 2016-03-16 10:00 Bin Cheng 2016-03-16 12:20 ` Richard Biener 0 siblings, 1 reply; 10+ messages in thread From: Bin Cheng @ 2016-03-16 10:00 UTC (permalink / raw) To: gcc-patches; +Cc: nd [-- Attachment #1: Type: text/plain, Size: 2683 bytes --] Hi, One issue revealed in tree ifcvt is the pass stores/tracks DRs against its memory references in IR. This causes difficulty in identifying same memory references appearing in different forms. Given below example: void foo (int a[], int b[]) { int i; for (i = 0; i < 100; i++) { if (a[i] ==0) a[i] = b[i]*4; else a[i] = b[i]*3; } } The gimple dump before tree ifcvt is as: <bb 2>: <bb 3>: # i_24 = PHI <i_19(7), 0(2)> # ivtmp_28 = PHI <ivtmp_23(7), 100(2)> _5 = (long unsigned int) i_24; _6 = _5 * 4; _8 = a_7(D) + _6; _9 = *_8; if (_9 == 0) goto <bb 4>; else goto <bb 5>; <bb 4>: _11 = b_10(D) + _6; _12 = *_11; _13 = _12 * 4; goto <bb 6>; <bb 5>: _15 = b_10(D) + _6; _16 = *_15; _17 = _16 * 3; <bb 6>: # cstore_1 = PHI <_13(4), _17(5)> *_8 = cstore_1; i_19 = i_24 + 1; ivtmp_23 = ivtmp_28 - 1; if (ivtmp_23 != 0) goto <bb 7>; else goto <bb 8>; <bb 7>: goto <bb 3>; <bb 8>: return; The two memory references "*_11" and "*_15" are actually the same thing, but ifcvt failed to discover this because they are recorded in different forms. This patch fixes the issue by recording/tracking memory reference against its innermost_loop_behavior if: the memory reference has innermost_loop_behavior and it is not a compound reference. BTW, PR56625 reported that this case couldn't be vectorized even tree if-conv can handle it. Interesting thing is at current time, it's tree if-conv that could not handle the case. Once it's if-converted with this patch, vectorizer are able to handle it too. Bootstrap and test on x86_64 and AArch64. Is it OK, not sure if it's GCC 7? Thanks, bin 2016-03-11 Bin Cheng <bin.cheng@arm.com> PR tree-optimization/56625 PR tree-optimization/69489 * tree-data-ref.h (innermost_loop_behavior_hash): New class for hashing struct innermost_loop_behavior. (DR_INNERMOST): New macro. * tree-if-conv.c (innermost_DR_map): New map. (ref_DR_map, baseref_DR_map): Revise the comment. (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR to innermost_DR_map if it has innermost loop behavior and is not a compound reference. (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map if it has innermost loop behavior and is not a compound reference. (if_convertible_loop_p_1): Initialize innermost_DR_map. (if_convertible_loop_p): Release innermost_DR_map. gcc/testsuite/ChangeLog 2016-03-11 Bin Cheng <bin.cheng@arm.com> PR tree-optimization/56625 PR tree-optimization/69489 * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test. [-- Attachment #2: pr69489-part1-20160315.txt --] [-- Type: text/plain, Size: 7233 bytes --] diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c new file mode 100644 index 0000000..05db62b --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-S -O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* } } */ + +void foo (int a[], int b[]) +{ + int i; + for (i = 0; i < 100; i++) + { + if (a[i] ==0) + a[i] = b[i]*4; + else + a[i] = b[i]*3; + } +} + +/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */ diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h index eb24d16..310514a 100644 --- a/gcc/tree-data-ref.h +++ b/gcc/tree-data-ref.h @@ -57,6 +57,61 @@ struct innermost_loop_behavior tree aligned_to; }; +/* Hash for struct innermost_loop_behavior. It depends on the user to + free the memory. */ + +struct innermost_loop_behavior_hash : nofree_ptr_hash <innermost_loop_behavior> +{ + static inline hashval_t hash (const value_type &); + static inline bool equal (const value_type &, + const compare_type &); +}; + +inline hashval_t +innermost_loop_behavior_hash::hash (const value_type &e) +{ + hashval_t hash; + + hash = iterative_hash_expr (e->base_address, 0); + hash = iterative_hash_expr (e->offset, hash); + hash = iterative_hash_expr (e->init, hash); + hash = iterative_hash_expr (e->step, hash); + return iterative_hash_expr (e->aligned_to, hash); +} + +inline bool +innermost_loop_behavior_hash::equal (const value_type &e1, + const compare_type &e2) +{ + bool equal_p; + + if ((e1->base_address && !e2->base_address) + || (!e1->base_address && e2->base_address) + || (!e1->offset && e2->offset) + || (e1->offset && !e2->offset) + || (!e1->init && e2->init) + || (e1->init && !e2->init) + || (!e1->step && e2->step) + || (e1->step && !e2->step) + || (!e1->aligned_to && e2->aligned_to) + || (e1->aligned_to && !e2->aligned_to)) + return false; + + equal_p = true; + if (e1->base_address && e2->base_address) + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); + if (e1->offset && e2->offset) + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); + if (e1->init && e2->init) + equal_p &= operand_equal_p (e1->init, e2->init, 0); + if (e1->step && e2->step) + equal_p &= operand_equal_p (e1->step, e2->step, 0); + if (e1->aligned_to && e2->aligned_to) + equal_p &= operand_equal_p (e1->aligned_to, e2->aligned_to, 0); + + return equal_p; +} + /* Describes the evolutions of indices of the memory reference. The indices are indices of the ARRAY_REFs, indexes in artificial dimensions added for member selection of records and the operands of MEM_REFs. @@ -144,6 +199,7 @@ struct data_reference #define DR_STEP(DR) (DR)->innermost.step #define DR_PTR_INFO(DR) (DR)->alias.ptr_info #define DR_ALIGNED_TO(DR) (DR)->innermost.aligned_to +#define DR_INNERMOST(DR) (DR)->innermost typedef struct data_reference *data_reference_p; diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c index 9e305c7..1315370 100644 --- a/gcc/tree-if-conv.c +++ b/gcc/tree-if-conv.c @@ -120,10 +120,14 @@ static basic_block *ifc_bbs; /* Apply more aggressive (extended) if-conversion if true. */ static bool aggressive_if_conv; -/* Hash table to store references, DR pairs. */ +/* Hash table to store <references, DR> pairs. */ static hash_map<tree_operand_hash, data_reference_p> *ref_DR_map; -/* Hash table to store base reference, DR pairs. */ +/* Hash table to store <DR's innermost loop behavior, DR> pairs. */ +static hash_map<innermost_loop_behavior_hash, + data_reference_p> *innermost_DR_map; + +/* Hash table to store <base reference, DR> pairs. */ static hash_map<tree_operand_hash, data_reference_p> *baseref_DR_map; /* Structure used to predicate basic blocks. This is attached to the @@ -615,15 +619,29 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a) data_reference_p *master_dr, *base_master_dr; tree ref = DR_REF (a); tree base_ref = DR_BASE_OBJECT (a); + innermost_loop_behavior *innermost = &DR_INNERMOST (a); tree ca = bb_predicate (gimple_bb (DR_STMT (a))); bool exist1, exist2; - while (TREE_CODE (ref) == COMPONENT_REF - || TREE_CODE (ref) == IMAGPART_EXPR - || TREE_CODE (ref) == REALPART_EXPR) - ref = TREE_OPERAND (ref, 0); + /* If reference in DR has innermost loop behavior and it is not + a compound memory reference, we store it to innermost_DR_map, + otherwise to ref_DR_map. */ + if (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) + { + while (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR) + ref = TREE_OPERAND (ref, 0); + + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); + } + else + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); - master_dr = &ref_DR_map->get_or_insert (ref, &exist1); if (!exist1) *master_dr = a; @@ -687,15 +705,29 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs) tree ref_base_a = DR_REF (a); tree base = DR_BASE_OBJECT (a); + innermost_loop_behavior *innermost = &DR_INNERMOST (a); gcc_assert (DR_STMT (a) == stmt); - while (TREE_CODE (ref_base_a) == COMPONENT_REF - || TREE_CODE (ref_base_a) == IMAGPART_EXPR - || TREE_CODE (ref_base_a) == REALPART_EXPR) - ref_base_a = TREE_OPERAND (ref_base_a, 0); + /* If reference in DR has innermost loop behavior and it is not + a compound memory reference, we get it from innermost_DR_map, + otherwise from ref_DR_map. */ + if (TREE_CODE (ref_base_a) == COMPONENT_REF + || TREE_CODE (ref_base_a) == IMAGPART_EXPR + || TREE_CODE (ref_base_a) == REALPART_EXPR + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) + { + while (TREE_CODE (ref_base_a) == COMPONENT_REF + || TREE_CODE (ref_base_a) == IMAGPART_EXPR + || TREE_CODE (ref_base_a) == REALPART_EXPR) + ref_base_a = TREE_OPERAND (ref_base_a, 0); + + master_dr = ref_DR_map->get (ref_base_a); + } + else + master_dr = innermost_DR_map->get (innermost); - master_dr = ref_DR_map->get (ref_base_a); base_master_dr = baseref_DR_map->get (base); gcc_assert (master_dr != NULL); @@ -1229,6 +1261,8 @@ if_convertible_loop_p_1 (struct loop *loop, data_reference_p dr; ref_DR_map = new hash_map<tree_operand_hash, data_reference_p>; + innermost_DR_map + = new hash_map<innermost_loop_behavior_hash, data_reference_p>; baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>; predicate_bbs (loop); @@ -1341,6 +1375,9 @@ if_convertible_loop_p (struct loop *loop, bool *any_mask_load_store) delete ref_DR_map; ref_DR_map = NULL; + delete innermost_DR_map; + innermost_DR_map = NULL; + delete baseref_DR_map; baseref_DR_map = NULL; ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible 2016-03-16 10:00 [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible Bin Cheng @ 2016-03-16 12:20 ` Richard Biener 2016-03-16 16:17 ` Bin.Cheng 0 siblings, 1 reply; 10+ messages in thread From: Richard Biener @ 2016-03-16 12:20 UTC (permalink / raw) To: Bin Cheng; +Cc: gcc-patches, nd On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <Bin.Cheng@arm.com> wrote: > Hi, > One issue revealed in tree ifcvt is the pass stores/tracks DRs against its memory references in IR. This causes difficulty in identifying same memory references appearing in different forms. Given below example: > > void foo (int a[], int b[]) > { > int i; > for (i = 0; i < 100; i++) > { > if (a[i] ==0) > a[i] = b[i]*4; > else > a[i] = b[i]*3; > } > } > > The gimple dump before tree ifcvt is as: > > <bb 2>: > > <bb 3>: > # i_24 = PHI <i_19(7), 0(2)> > # ivtmp_28 = PHI <ivtmp_23(7), 100(2)> > _5 = (long unsigned int) i_24; > _6 = _5 * 4; > _8 = a_7(D) + _6; > _9 = *_8; > if (_9 == 0) > goto <bb 4>; > else > goto <bb 5>; > > <bb 4>: > _11 = b_10(D) + _6; > _12 = *_11; > _13 = _12 * 4; > goto <bb 6>; > > <bb 5>: > _15 = b_10(D) + _6; > _16 = *_15; > _17 = _16 * 3; > > <bb 6>: > # cstore_1 = PHI <_13(4), _17(5)> > *_8 = cstore_1; > i_19 = i_24 + 1; > ivtmp_23 = ivtmp_28 - 1; > if (ivtmp_23 != 0) > goto <bb 7>; > else > goto <bb 8>; > > <bb 7>: > goto <bb 3>; > > <bb 8>: > return; > > The two memory references "*_11" and "*_15" are actually the same thing, but ifcvt failed to discover this because they are recorded in different forms. This patch fixes the issue by recording/tracking memory reference against its innermost_loop_behavior if: the memory reference has innermost_loop_behavior and it is not a compound reference. > BTW, PR56625 reported that this case couldn't be vectorized even tree if-conv can handle it. Interesting thing is at current time, it's tree if-conv that could not handle the case. Once it's if-converted with this patch, vectorizer are able to handle it too. > > Bootstrap and test on x86_64 and AArch64. Is it OK, not sure if it's GCC 7? Hmm. + equal_p = true; + if (e1->base_address && e2->base_address) + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); + if (e1->offset && e2->offset) + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); surely better to return false early. I think we don't want this in tree-data-refs.h also because of ... @@ -615,15 +619,29 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a) data_reference_p *master_dr, *base_master_dr; tree ref = DR_REF (a); tree base_ref = DR_BASE_OBJECT (a); + innermost_loop_behavior *innermost = &DR_INNERMOST (a); tree ca = bb_predicate (gimple_bb (DR_STMT (a))); bool exist1, exist2; - while (TREE_CODE (ref) == COMPONENT_REF - || TREE_CODE (ref) == IMAGPART_EXPR - || TREE_CODE (ref) == REALPART_EXPR) - ref = TREE_OPERAND (ref, 0); + /* If reference in DR has innermost loop behavior and it is not + a compound memory reference, we store it to innermost_DR_map, + otherwise to ref_DR_map. */ + if (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) + { + while (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR) + ref = TREE_OPERAND (ref, 0); + + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); + } + else + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); we don't want an extra hashmap but replace ref_DR_map entirely. So we'd need to strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART and REALPART) before creating the DR (or adjust the equality function and hashing to disregard them which means subtracting their offset from DR_INIT. To adjust the references we collect you'd maybe could use a callback to get_references_in_stmt to adjust them. OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as Index: tree-if-conv.c =================================================================== --- tree-if-conv.c (revision 234215) +++ tree-if-conv.c (working copy) @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo for (i = 0; refs->iterate (i, &dr); i++) { + tree *refp = &DR_REF (dr); + while ((TREE_CODE (*refp) == COMPONENT_REF + && TREE_OPERAND (*refp, 2) == NULL_TREE) + || TREE_CODE (*refp) == IMAGPART_EXPR + || TREE_CODE (*refp) == REALPART_EXPR) + refp = &TREE_OPERAND (*refp, 0); + if (refp != &DR_REF (dr)) + { + tree saved_base = *refp; + *refp = integer_zero_node; + + if (DR_INIT (dr)) + { + tree poffset; + int punsignedp, preversep, pvolatilep; + machine_mode pmode; + HOST_WIDE_INT pbitsize, pbitpos; + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset, + &pmode, &punsignedp, &preversep, &pvolatilep, + false); + gcc_assert (poffset == NULL_TREE); + + DR_INIT (dr) + = wide_int_to_tree (ssizetype, + wi::sub (DR_INIT (dr), + pbitpos / BITS_PER_UNIT)); + } + + *refp = saved_base; + DR_REF (dr) = *refp; + } + dr->aux = XNEW (struct ifc_dr); DR_BASE_W_UNCONDITIONALLY (dr) = false; DR_RW_UNCONDITIONALLY (dr) = false; Can you add a testcase for the vectorization? Richard. > Thanks, > bin > > > 2016-03-11 Bin Cheng <bin.cheng@arm.com> > > PR tree-optimization/56625 > PR tree-optimization/69489 > * tree-data-ref.h (innermost_loop_behavior_hash): New class for > hashing struct innermost_loop_behavior. > (DR_INNERMOST): New macro. > * tree-if-conv.c (innermost_DR_map): New map. > (ref_DR_map, baseref_DR_map): Revise the comment. > (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR > to innermost_DR_map if it has innermost loop behavior and is not > a compound reference. > (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map if it has > innermost loop behavior and is not a compound reference. > (if_convertible_loop_p_1): Initialize innermost_DR_map. > (if_convertible_loop_p): Release innermost_DR_map. > > gcc/testsuite/ChangeLog > 2016-03-11 Bin Cheng <bin.cheng@arm.com> > > PR tree-optimization/56625 > PR tree-optimization/69489 > * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test. > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible 2016-03-16 12:20 ` Richard Biener @ 2016-03-16 16:17 ` Bin.Cheng 2016-03-17 9:53 ` Richard Biener 0 siblings, 1 reply; 10+ messages in thread From: Bin.Cheng @ 2016-03-16 16:17 UTC (permalink / raw) To: Richard Biener; +Cc: Bin Cheng, gcc-patches On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener <richard.guenther@gmail.com> wrote: > > On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <Bin.Cheng@arm.com> wrote: > > Hi, > > ...... > > Bootstrap and test on x86_64 and AArch64. Is it OK, not sure if it's GCC 7? > > Hmm. Hi, Thanks for reviewing. > > + equal_p = true; > + if (e1->base_address && e2->base_address) > + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); > + if (e1->offset && e2->offset) > + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); > > surely better to return false early. > > I think we don't want this in tree-data-refs.h also because of ... > > @@ -615,15 +619,29 @@ > hash_memrefs_baserefs_and_store_DRs_read_written_info > (data_reference_p a) > data_reference_p *master_dr, *base_master_dr;and REALPART) before creating the DR (or adjust the equality function and hashing > tree ref = DR_REF (a); > tree base_ref = DR_BASE_OBJECT (a); > + innermost_loop_behavior *innermost = &DR_INNERMOST (a); > tree ca = bb_predicate (gimple_bb (DR_STMT (a))); > bool exist1, exist2; > > - while (TREE_CODE (ref) == COMPONENT_REF > - || TREE_CODE (ref) == IMAGPART_EXPR > - || TREE_CODE (ref) == REALPART_EXPR) > - ref = TREE_OPERAND (ref, 0); > + /* If reference in DR has innermost loop behavior and it is not > + a compound memory reference, we store it to innermost_DR_map, > + otherwise to ref_DR_map. */ > + if (TREE_CODE (ref) == COMPONENT_REF > + || TREE_CODE (ref) == IMAGPART_EXPR > + || TREE_CODE (ref) == REALPART_EXPR > + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) > + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) > + { > + while (TREE_CODE (ref) == COMPONENT_REF > + || TREE_CODE (ref) == IMAGPART_EXPR > + || TREE_CODE (ref) == REALPART_EXPR) > + ref = TREE_OPERAND (ref, 0); > + > + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); > + } > + else > + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); > > we don't want an extra hashmap but replace ref_DR_map entirely. So we'd need to > strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART > and REALPART) before creating the DR (or adjust the equality function > and hashing > to disregard them which means subtracting their offset from DR_INIT. I am not sure if I understand correctly. But for component reference, it is the base object that we want to record/track. For example, for (i = 0; i < N; i++) { m = *data++; m1 = p1->x - m; m2 = p2->x + m; p3->y = (m1 >= m2) ? p1->y : p2->y; p1++; p2++; p3++; } We want to infer that reads of p1/p2 in condition statement won't trap because there are unconditional reads of the structures, though the unconditional reads are actual of other sub-objects. Here it is the invariant part of address that we want to track. Also illustrated by this example, we can't rely on data-ref analyzer here. Because in gathering/scattering cases, the address could be not affine at all. > > To adjust the references we collect you'd maybe could use a callback > to get_references_in_stmt > to adjust them. > > OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as Is this a part of the method you suggested above, or is it an alternative one? If it's the latter, then I have below questions embedded. > > Index: tree-if-conv.c > =================================================================== > --- tree-if-conv.c (revision 234215) > +++ tree-if-conv.c (working copy) > @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo > > for (i = 0; refs->iterate (i, &dr); i++) > { > + tree *refp = &DR_REF (dr); > + while ((TREE_CODE (*refp) == COMPONENT_REF > + && TREE_OPERAND (*refp, 2) == NULL_TREE) > + || TREE_CODE (*refp) == IMAGPART_EXPR > + || TREE_CODE (*refp) == REALPART_EXPR) > + refp = &TREE_OPERAND (*refp, 0); > + if (refp != &DR_REF (dr)) > + { > + tree saved_base = *refp; > + *refp = integer_zero_node; > + > + if (DR_INIT (dr)) > + { > + tree poffset; > + int punsignedp, preversep, pvolatilep; > + machine_mode pmode; > + HOST_WIDE_INT pbitsize, pbitpos; > + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset, > + &pmode, &punsignedp, &preversep, &pvolatilep, > + false); > + gcc_assert (poffset == NULL_TREE); > + > + DR_INIT (dr) > + = wide_int_to_tree (ssizetype, > + wi::sub (DR_INIT (dr), > + pbitpos / BITS_PER_UNIT)); > + } > + > + *refp = saved_base; > + DR_REF (dr) = *refp; > + } Looks to me the code is trying to resolve difference between two (or more) component references, which is DR_INIT in the code. But DR_INIT is not the only thing needs to be handled. For a structure containing two sub-arrays, DR_OFFSET may be different too. Actually I did try to replace ref_DR_map. For memory reference doesn't have innermost affine behavior, we need to modify DR fields by populating it with artificial data. This results in some kind of over-designed code. IMHO, modifying some DR structures outside of data-ref analyzer isn't that good. After comparing the two methods, I think may be this one is better, of course with the cost of two different maps. I might be misunderstanding your idea, so please correct if I was wrong. Thanks, bin > + > dr->aux = XNEW (struct ifc_dr); > DR_BASE_W_UNCONDITIONALLY (dr) = false; > DR_RW_UNCONDITIONALLY (dr) = false; > > > Can you add a testcase for the vectorization? > > Richard. > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible 2016-03-16 16:17 ` Bin.Cheng @ 2016-03-17 9:53 ` Richard Biener [not found] ` <CAHFci29CHpOBxah4QxrL_JOL6p_NC=r3-e3eoNjHp_4z749PnQ@mail.gmail.com> 0 siblings, 1 reply; 10+ messages in thread From: Richard Biener @ 2016-03-17 9:53 UTC (permalink / raw) To: Bin.Cheng; +Cc: Bin Cheng, gcc-patches On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: > On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener > <richard.guenther@gmail.com> wrote: >> >> On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <Bin.Cheng@arm.com> wrote: >> > Hi, >> > ...... >> > Bootstrap and test on x86_64 and AArch64. Is it OK, not sure if it's GCC 7? >> >> Hmm. > Hi, > Thanks for reviewing. >> >> + equal_p = true; >> + if (e1->base_address && e2->base_address) >> + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); >> + if (e1->offset && e2->offset) >> + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); >> >> surely better to return false early. >> >> I think we don't want this in tree-data-refs.h also because of ... >> >> @@ -615,15 +619,29 @@ >> hash_memrefs_baserefs_and_store_DRs_read_written_info >> (data_reference_p a) >> data_reference_p *master_dr, *base_master_dr;and REALPART) before creating the DR (or adjust the equality function > and hashing >> tree ref = DR_REF (a); >> tree base_ref = DR_BASE_OBJECT (a); >> + innermost_loop_behavior *innermost = &DR_INNERMOST (a); >> tree ca = bb_predicate (gimple_bb (DR_STMT (a))); >> bool exist1, exist2; >> >> - while (TREE_CODE (ref) == COMPONENT_REF >> - || TREE_CODE (ref) == IMAGPART_EXPR >> - || TREE_CODE (ref) == REALPART_EXPR) >> - ref = TREE_OPERAND (ref, 0); >> + /* If reference in DR has innermost loop behavior and it is not >> + a compound memory reference, we store it to innermost_DR_map, >> + otherwise to ref_DR_map. */ >> + if (TREE_CODE (ref) == COMPONENT_REF >> + || TREE_CODE (ref) == IMAGPART_EXPR >> + || TREE_CODE (ref) == REALPART_EXPR >> + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) >> + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) >> + { >> + while (TREE_CODE (ref) == COMPONENT_REF >> + || TREE_CODE (ref) == IMAGPART_EXPR >> + || TREE_CODE (ref) == REALPART_EXPR) >> + ref = TREE_OPERAND (ref, 0); >> + >> + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); >> + } >> + else >> + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); >> >> we don't want an extra hashmap but replace ref_DR_map entirely. So we'd need to >> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART >> and REALPART) before creating the DR (or adjust the equality function >> and hashing >> to disregard them which means subtracting their offset from DR_INIT. > I am not sure if I understand correctly. But for component reference, > it is the base object that we want to record/track. For example, > > for (i = 0; i < N; i++) { > m = *data++; > > m1 = p1->x - m; > m2 = p2->x + m; > > p3->y = (m1 >= m2) ? p1->y : p2->y; > > p1++; > p2++; > p3++; > } > We want to infer that reads of p1/p2 in condition statement won't trap > because there are unconditional reads of the structures, though the > unconditional reads are actual of other sub-objects. Here it is the > invariant part of address that we want to track. Well, the variant parts - we want to strip invariant parts as far as we can (offsetof (x) and offsetof (y)) > Also illustrated by this example, we can't rely on data-ref analyzer > here. Because in gathering/scattering cases, the address could be not > affine at all. Sure, but that's a different issue. >> >> To adjust the references we collect you'd maybe could use a callback >> to get_references_in_stmt >> to adjust them. >> >> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as > Is this a part of the method you suggested above, or is it an > alternative one? If it's the latter, then I have below questions > embedded. It is an alternative to adding a hook to get_references_in_stmt and probably "easier". >> >> Index: tree-if-conv.c >> =================================================================== >> --- tree-if-conv.c (revision 234215) >> +++ tree-if-conv.c (working copy) >> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo >> >> for (i = 0; refs->iterate (i, &dr); i++) >> { >> + tree *refp = &DR_REF (dr); >> + while ((TREE_CODE (*refp) == COMPONENT_REF >> + && TREE_OPERAND (*refp, 2) == NULL_TREE) >> + || TREE_CODE (*refp) == IMAGPART_EXPR >> + || TREE_CODE (*refp) == REALPART_EXPR) >> + refp = &TREE_OPERAND (*refp, 0); >> + if (refp != &DR_REF (dr)) >> + { >> + tree saved_base = *refp; >> + *refp = integer_zero_node; >> + >> + if (DR_INIT (dr)) >> + { >> + tree poffset; >> + int punsignedp, preversep, pvolatilep; >> + machine_mode pmode; >> + HOST_WIDE_INT pbitsize, pbitpos; >> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset, >> + &pmode, &punsignedp, &preversep, &pvolatilep, >> + false); >> + gcc_assert (poffset == NULL_TREE); >> + >> + DR_INIT (dr) >> + = wide_int_to_tree (ssizetype, >> + wi::sub (DR_INIT (dr), >> + pbitpos / BITS_PER_UNIT)); >> + } >> + >> + *refp = saved_base; >> + DR_REF (dr) = *refp; >> + } > Looks to me the code is trying to resolve difference between two (or > more) component references, which is DR_INIT in the code. But DR_INIT > is not the only thing needs to be handled. For a structure containing > two sub-arrays, DR_OFFSET may be different too. Yes, but we can't say that if a->a[i] doesn't trap that then a->b[i] doesn't trap either. We can only "strip" outermost non-variable-offset components. But maybe I'm missing what example you are thinking of. > Actually I did try to replace ref_DR_map. For memory reference > doesn't have innermost affine behavior, we need to modify DR fields by > populating it with artificial data. This results in some kind of > over-designed code. IMHO, modifying some DR structures outside of > data-ref analyzer isn't that good. After comparing the two methods, I > think may be this one is better, of course with the cost of two > different maps. IMHO the code is already somewhat awful in using data-ref analysis at all (for what it does it certainly doesn't need it). Note that in the case data-ref analysis cannot analyze innermost behavior we're good as well with my proposed code - we simply leave DR_INIT NULL (but still strip outermost components). Richard. > I might be misunderstanding your idea, so please correct if I was wrong. > > Thanks, > bin >> + >> dr->aux = XNEW (struct ifc_dr); >> DR_BASE_W_UNCONDITIONALLY (dr) = false; >> DR_RW_UNCONDITIONALLY (dr) = false; >> >> >> Can you add a testcase for the vectorization? >> >> Richard. >> ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <CAHFci29CHpOBxah4QxrL_JOL6p_NC=r3-e3eoNjHp_4z749PnQ@mail.gmail.com>]
* Fwd: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible [not found] ` <CAHFci29CHpOBxah4QxrL_JOL6p_NC=r3-e3eoNjHp_4z749PnQ@mail.gmail.com> @ 2016-03-28 22:04 ` Bin.Cheng 2016-03-29 8:44 ` Richard Biener 0 siblings, 1 reply; 10+ messages in thread From: Bin.Cheng @ 2016-03-28 22:04 UTC (permalink / raw) To: gcc-patches Sorry, Should have replied to gcc-patches list. Thanks, bin ---------- Forwarded message ---------- From: "Bin.Cheng" <amker.cheng@gmail.com> Date: Tue, 29 Mar 2016 03:55:04 +0800 Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible To: Richard Biener <richard.guenther@gmail.com> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote: > On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener >> <richard.guenther@gmail.com> wrote: >>> >>> Hmm. >> Hi, >> Thanks for reviewing. >>> >>> + equal_p = true; >>> + if (e1->base_address && e2->base_address) >>> + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); >>> + if (e1->offset && e2->offset) >>> + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); >>> >>> surely better to return false early. >>> >>> I think we don't want this in tree-data-refs.h also because of ... >>> >>> @@ -615,15 +619,29 @@ >>> hash_memrefs_baserefs_and_store_DRs_read_written_info >>> (data_reference_p a) >>> data_reference_p *master_dr, *base_master_dr;and REALPART) before >>> creating the DR (or adjust the equality function >> and hashing >>> tree ref = DR_REF (a); >>> tree base_ref = DR_BASE_OBJECT (a); >>> + innermost_loop_behavior *innermost = &DR_INNERMOST (a); >>> tree ca = bb_predicate (gimple_bb (DR_STMT (a))); >>> bool exist1, exist2; >>> >>> - while (TREE_CODE (ref) == COMPONENT_REF >>> - || TREE_CODE (ref) == IMAGPART_EXPR >>> - || TREE_CODE (ref) == REALPART_EXPR) >>> - ref = TREE_OPERAND (ref, 0); >>> + /* If reference in DR has innermost loop behavior and it is not >>> + a compound memory reference, we store it to innermost_DR_map, >>> + otherwise to ref_DR_map. */ >>> + if (TREE_CODE (ref) == COMPONENT_REF >>> + || TREE_CODE (ref) == IMAGPART_EXPR >>> + || TREE_CODE (ref) == REALPART_EXPR >>> + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) >>> + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) >>> + { >>> + while (TREE_CODE (ref) == COMPONENT_REF >>> + || TREE_CODE (ref) == IMAGPART_EXPR >>> + || TREE_CODE (ref) == REALPART_EXPR) >>> + ref = TREE_OPERAND (ref, 0); >>> + >>> + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); >>> + } >>> + else >>> + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); >>> >>> we don't want an extra hashmap but replace ref_DR_map entirely. So we'd >>> need to >>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART >>> and REALPART) before creating the DR (or adjust the equality function >>> and hashing >>> to disregard them which means subtracting their offset from DR_INIT. >> I am not sure if I understand correctly. But for component reference, >> it is the base object that we want to record/track. For example, >> >> for (i = 0; i < N; i++) { >> m = *data++; >> >> m1 = p1->x - m; >> m2 = p2->x + m; >> >> p3->y = (m1 >= m2) ? p1->y : p2->y; >> >> p1++; >> p2++; >> p3++; >> } >> We want to infer that reads of p1/p2 in condition statement won't trap >> because there are unconditional reads of the structures, though the >> unconditional reads are actual of other sub-objects. Here it is the >> invariant part of address that we want to track. > > Well, the variant parts - we want to strip invariant parts as far as we can > (offsetof (x) and offsetof (y)) > >> Also illustrated by this example, we can't rely on data-ref analyzer >> here. Because in gathering/scattering cases, the address could be not >> affine at all. > > Sure, but that's a different issue. > >>> >>> To adjust the references we collect you'd maybe could use a callback >>> to get_references_in_stmt >>> to adjust them. >>> >>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple >>> as >> Is this a part of the method you suggested above, or is it an >> alternative one? If it's the latter, then I have below questions >> embedded. > > It is an alternative to adding a hook to get_references_in_stmt and > probably "easier". > >>> >>> Index: tree-if-conv.c >>> =================================================================== >>> --- tree-if-conv.c (revision 234215) >>> +++ tree-if-conv.c (working copy) >>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo >>> >>> for (i = 0; refs->iterate (i, &dr); i++) >>> { >>> + tree *refp = &DR_REF (dr); >>> + while ((TREE_CODE (*refp) == COMPONENT_REF >>> + && TREE_OPERAND (*refp, 2) == NULL_TREE) >>> + || TREE_CODE (*refp) == IMAGPART_EXPR >>> + || TREE_CODE (*refp) == REALPART_EXPR) >>> + refp = &TREE_OPERAND (*refp, 0); >>> + if (refp != &DR_REF (dr)) >>> + { >>> + tree saved_base = *refp; >>> + *refp = integer_zero_node; >>> + >>> + if (DR_INIT (dr)) >>> + { >>> + tree poffset; >>> + int punsignedp, preversep, pvolatilep; >>> + machine_mode pmode; >>> + HOST_WIDE_INT pbitsize, pbitpos; >>> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, >>> &poffset, >>> + &pmode, &punsignedp, &preversep, >>> &pvolatilep, >>> + false); >>> + gcc_assert (poffset == NULL_TREE); >>> + >>> + DR_INIT (dr) >>> + = wide_int_to_tree (ssizetype, >>> + wi::sub (DR_INIT (dr), >>> + pbitpos / BITS_PER_UNIT)); >>> + } >>> + >>> + *refp = saved_base; >>> + DR_REF (dr) = *refp; >>> + } >> Looks to me the code is trying to resolve difference between two (or >> more) component references, which is DR_INIT in the code. But DR_INIT >> is not the only thing needs to be handled. For a structure containing >> two sub-arrays, DR_OFFSET may be different too. > > Yes, but we can't say that if > > a->a[i] > > doesn't trap that then > > a->b[i] > > doesn't trap either. We can only "strip" outermost > non-variable-offset components. > > But maybe I'm missing what example you are thinking of. Hmm, this was the case I meant. What I don't understand is current code logic does infer trap information for a.b[i] from a.a[i]. Given below example: struct str { int a[10]; int b[20]; char c; }; void bar (struct str *); int foo (int x, int n) { int i; struct str s; bar (&s); for (i = 0; i < n; i++) { s.a[i] = s.b[i]; if (x > i) s.b[i] = 0; } bar (&s); return 0; } The loop is convertible because of below code in function ifcvt_memrefs_wont_trap: /* If a is unconditionally accessed then ... */ if (DR_RW_UNCONDITIONALLY (*master_dr)) { /* an unconditional read won't trap. */ if (DR_IS_READ (a)) return true; /* an unconditionaly write won't trap if the base is written to unconditionally. */ if (base_master_dr && DR_BASE_W_UNCONDITIONALLY (*base_master_dr)) return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); else { /* or the base is know to be not readonly. */ tree base_tree = get_base_address (DR_REF (a)); if (DECL_P (base_tree) && decl_binds_to_current_def_p (base_tree) && ! TREE_READONLY (base_tree)) return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); } } It is the main object '&s' that is recorded in base_master_dr, so s.b[i] is considered not trap. Even it's not, I suppose get_base_address will give same result? Thanks, bin -- Best Regards. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible 2016-03-28 22:04 ` Fwd: " Bin.Cheng @ 2016-03-29 8:44 ` Richard Biener 2016-03-31 17:08 ` Bin.Cheng 0 siblings, 1 reply; 10+ messages in thread From: Richard Biener @ 2016-03-29 8:44 UTC (permalink / raw) To: Bin.Cheng; +Cc: GCC Patches On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: > Sorry, Should have replied to gcc-patches list. > > Thanks, > bin > > ---------- Forwarded message ---------- > From: "Bin.Cheng" <amker.cheng@gmail.com> > Date: Tue, 29 Mar 2016 03:55:04 +0800 > Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking > DR against its innermost loop bahavior if possible > To: Richard Biener <richard.guenther@gmail.com> > > On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote: >> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener >>> <richard.guenther@gmail.com> wrote: >>>> >>>> Hmm. >>> Hi, >>> Thanks for reviewing. >>>> >>>> + equal_p = true; >>>> + if (e1->base_address && e2->base_address) >>>> + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); >>>> + if (e1->offset && e2->offset) >>>> + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); >>>> >>>> surely better to return false early. >>>> >>>> I think we don't want this in tree-data-refs.h also because of ... >>>> >>>> @@ -615,15 +619,29 @@ >>>> hash_memrefs_baserefs_and_store_DRs_read_written_info >>>> (data_reference_p a) >>>> data_reference_p *master_dr, *base_master_dr;and REALPART) before >>>> creating the DR (or adjust the equality function >>> and hashing >>>> tree ref = DR_REF (a); >>>> tree base_ref = DR_BASE_OBJECT (a); >>>> + innermost_loop_behavior *innermost = &DR_INNERMOST (a); >>>> tree ca = bb_predicate (gimple_bb (DR_STMT (a))); >>>> bool exist1, exist2; >>>> >>>> - while (TREE_CODE (ref) == COMPONENT_REF >>>> - || TREE_CODE (ref) == IMAGPART_EXPR >>>> - || TREE_CODE (ref) == REALPART_EXPR) >>>> - ref = TREE_OPERAND (ref, 0); >>>> + /* If reference in DR has innermost loop behavior and it is not >>>> + a compound memory reference, we store it to innermost_DR_map, >>>> + otherwise to ref_DR_map. */ >>>> + if (TREE_CODE (ref) == COMPONENT_REF >>>> + || TREE_CODE (ref) == IMAGPART_EXPR >>>> + || TREE_CODE (ref) == REALPART_EXPR >>>> + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) >>>> + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) >>>> + { >>>> + while (TREE_CODE (ref) == COMPONENT_REF >>>> + || TREE_CODE (ref) == IMAGPART_EXPR >>>> + || TREE_CODE (ref) == REALPART_EXPR) >>>> + ref = TREE_OPERAND (ref, 0); >>>> + >>>> + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); >>>> + } >>>> + else >>>> + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); >>>> >>>> we don't want an extra hashmap but replace ref_DR_map entirely. So we'd >>>> need to >>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART >>>> and REALPART) before creating the DR (or adjust the equality function >>>> and hashing >>>> to disregard them which means subtracting their offset from DR_INIT. >>> I am not sure if I understand correctly. But for component reference, >>> it is the base object that we want to record/track. For example, >>> >>> for (i = 0; i < N; i++) { >>> m = *data++; >>> >>> m1 = p1->x - m; >>> m2 = p2->x + m; >>> >>> p3->y = (m1 >= m2) ? p1->y : p2->y; >>> >>> p1++; >>> p2++; >>> p3++; >>> } >>> We want to infer that reads of p1/p2 in condition statement won't trap >>> because there are unconditional reads of the structures, though the >>> unconditional reads are actual of other sub-objects. Here it is the >>> invariant part of address that we want to track. >> >> Well, the variant parts - we want to strip invariant parts as far as we can >> (offsetof (x) and offsetof (y)) >> >>> Also illustrated by this example, we can't rely on data-ref analyzer >>> here. Because in gathering/scattering cases, the address could be not >>> affine at all. >> >> Sure, but that's a different issue. >> >>>> >>>> To adjust the references we collect you'd maybe could use a callback >>>> to get_references_in_stmt >>>> to adjust them. >>>> >>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple >>>> as >>> Is this a part of the method you suggested above, or is it an >>> alternative one? If it's the latter, then I have below questions >>> embedded. >> >> It is an alternative to adding a hook to get_references_in_stmt and >> probably "easier". >> >>>> >>>> Index: tree-if-conv.c >>>> =================================================================== >>>> --- tree-if-conv.c (revision 234215) >>>> +++ tree-if-conv.c (working copy) >>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo >>>> >>>> for (i = 0; refs->iterate (i, &dr); i++) >>>> { >>>> + tree *refp = &DR_REF (dr); >>>> + while ((TREE_CODE (*refp) == COMPONENT_REF >>>> + && TREE_OPERAND (*refp, 2) == NULL_TREE) >>>> + || TREE_CODE (*refp) == IMAGPART_EXPR >>>> + || TREE_CODE (*refp) == REALPART_EXPR) >>>> + refp = &TREE_OPERAND (*refp, 0); >>>> + if (refp != &DR_REF (dr)) >>>> + { >>>> + tree saved_base = *refp; >>>> + *refp = integer_zero_node; >>>> + >>>> + if (DR_INIT (dr)) >>>> + { >>>> + tree poffset; >>>> + int punsignedp, preversep, pvolatilep; >>>> + machine_mode pmode; >>>> + HOST_WIDE_INT pbitsize, pbitpos; >>>> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, >>>> &poffset, >>>> + &pmode, &punsignedp, &preversep, >>>> &pvolatilep, >>>> + false); >>>> + gcc_assert (poffset == NULL_TREE); >>>> + >>>> + DR_INIT (dr) >>>> + = wide_int_to_tree (ssizetype, >>>> + wi::sub (DR_INIT (dr), >>>> + pbitpos / BITS_PER_UNIT)); >>>> + } >>>> + >>>> + *refp = saved_base; >>>> + DR_REF (dr) = *refp; >>>> + } >>> Looks to me the code is trying to resolve difference between two (or >>> more) component references, which is DR_INIT in the code. But DR_INIT >>> is not the only thing needs to be handled. For a structure containing >>> two sub-arrays, DR_OFFSET may be different too. >> >> Yes, but we can't say that if >> >> a->a[i] >> >> doesn't trap that then >> >> a->b[i] >> >> doesn't trap either. We can only "strip" outermost >> non-variable-offset components. >> >> But maybe I'm missing what example you are thinking of. > Hmm, this was the case I meant. What I don't understand is current > code logic does infer trap information for a.b[i] from a.a[i]. Given > below example: > struct str > { > int a[10]; > int b[20]; > char c; > }; > > void bar (struct str *); > int foo (int x, int n) > { > int i; > struct str s; > bar (&s); > for (i = 0; i < n; i++) > { > s.a[i] = s.b[i]; > if (x > i) > s.b[i] = 0; > } > bar (&s); > return 0; > } > The loop is convertible because of below code in function > ifcvt_memrefs_wont_trap: > > /* If a is unconditionally accessed then ... */ > if (DR_RW_UNCONDITIONALLY (*master_dr)) > { > /* an unconditional read won't trap. */ > if (DR_IS_READ (a)) > return true; > > /* an unconditionaly write won't trap if the base is written > to unconditionally. */ > if (base_master_dr > && DR_BASE_W_UNCONDITIONALLY (*base_master_dr)) > return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); > else > { > /* or the base is know to be not readonly. */ > tree base_tree = get_base_address (DR_REF (a)); > if (DECL_P (base_tree) > && decl_binds_to_current_def_p (base_tree) > && ! TREE_READONLY (base_tree)) > return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); > } > } > It is the main object '&s' that is recorded in base_master_dr, so > s.b[i] is considered not trap. Even it's not, I suppose > get_base_address will give same result? Well, for this case it sees that s.b[i] is read from so it can't be an out-of-bound access. And s.a[i] is written to unconditionally so 's' cannot be a readonly object. With both pieces of information we can conclude that s.b[i] = 0 doesn't trap. Richard. > > Thanks, > bin > > > > -- > Best Regards. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible 2016-03-29 8:44 ` Richard Biener @ 2016-03-31 17:08 ` Bin.Cheng 2016-04-04 13:07 ` Richard Biener 0 siblings, 1 reply; 10+ messages in thread From: Bin.Cheng @ 2016-03-31 17:08 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches [-- Attachment #1: Type: text/plain, Size: 10834 bytes --] On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener <richard.guenther@gmail.com> wrote: > On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >> Sorry, Should have replied to gcc-patches list. >> >> Thanks, >> bin >> >> ---------- Forwarded message ---------- >> From: "Bin.Cheng" <amker.cheng@gmail.com> >> Date: Tue, 29 Mar 2016 03:55:04 +0800 >> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking >> DR against its innermost loop bahavior if possible >> To: Richard Biener <richard.guenther@gmail.com> >> >> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote: >>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener >>>> <richard.guenther@gmail.com> wrote: >>>>> >>>>> Hmm. >>>> Hi, >>>> Thanks for reviewing. >>>>> >>>>> + equal_p = true; >>>>> + if (e1->base_address && e2->base_address) >>>>> + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); >>>>> + if (e1->offset && e2->offset) >>>>> + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); >>>>> >>>>> surely better to return false early. >>>>> >>>>> I think we don't want this in tree-data-refs.h also because of ... >>>>> >>>>> @@ -615,15 +619,29 @@ >>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info >>>>> (data_reference_p a) >>>>> data_reference_p *master_dr, *base_master_dr;and REALPART) before >>>>> creating the DR (or adjust the equality function >>>> and hashing >>>>> tree ref = DR_REF (a); >>>>> tree base_ref = DR_BASE_OBJECT (a); >>>>> + innermost_loop_behavior *innermost = &DR_INNERMOST (a); >>>>> tree ca = bb_predicate (gimple_bb (DR_STMT (a))); >>>>> bool exist1, exist2; >>>>> >>>>> - while (TREE_CODE (ref) == COMPONENT_REF >>>>> - || TREE_CODE (ref) == IMAGPART_EXPR >>>>> - || TREE_CODE (ref) == REALPART_EXPR) >>>>> - ref = TREE_OPERAND (ref, 0); >>>>> + /* If reference in DR has innermost loop behavior and it is not >>>>> + a compound memory reference, we store it to innermost_DR_map, >>>>> + otherwise to ref_DR_map. */ >>>>> + if (TREE_CODE (ref) == COMPONENT_REF >>>>> + || TREE_CODE (ref) == IMAGPART_EXPR >>>>> + || TREE_CODE (ref) == REALPART_EXPR >>>>> + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) >>>>> + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) >>>>> + { >>>>> + while (TREE_CODE (ref) == COMPONENT_REF >>>>> + || TREE_CODE (ref) == IMAGPART_EXPR >>>>> + || TREE_CODE (ref) == REALPART_EXPR) >>>>> + ref = TREE_OPERAND (ref, 0); >>>>> + >>>>> + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); >>>>> + } >>>>> + else >>>>> + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); >>>>> >>>>> we don't want an extra hashmap but replace ref_DR_map entirely. So we'd >>>>> need to >>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART >>>>> and REALPART) before creating the DR (or adjust the equality function >>>>> and hashing >>>>> to disregard them which means subtracting their offset from DR_INIT. >>>> I am not sure if I understand correctly. But for component reference, >>>> it is the base object that we want to record/track. For example, >>>> >>>> for (i = 0; i < N; i++) { >>>> m = *data++; >>>> >>>> m1 = p1->x - m; >>>> m2 = p2->x + m; >>>> >>>> p3->y = (m1 >= m2) ? p1->y : p2->y; >>>> >>>> p1++; >>>> p2++; >>>> p3++; >>>> } >>>> We want to infer that reads of p1/p2 in condition statement won't trap >>>> because there are unconditional reads of the structures, though the >>>> unconditional reads are actual of other sub-objects. Here it is the >>>> invariant part of address that we want to track. >>> >>> Well, the variant parts - we want to strip invariant parts as far as we can >>> (offsetof (x) and offsetof (y)) >>> >>>> Also illustrated by this example, we can't rely on data-ref analyzer >>>> here. Because in gathering/scattering cases, the address could be not >>>> affine at all. >>> >>> Sure, but that's a different issue. >>> >>>>> >>>>> To adjust the references we collect you'd maybe could use a callback >>>>> to get_references_in_stmt >>>>> to adjust them. >>>>> >>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple >>>>> as >>>> Is this a part of the method you suggested above, or is it an >>>> alternative one? If it's the latter, then I have below questions >>>> embedded. >>> >>> It is an alternative to adding a hook to get_references_in_stmt and >>> probably "easier". >>> >>>>> >>>>> Index: tree-if-conv.c >>>>> =================================================================== >>>>> --- tree-if-conv.c (revision 234215) >>>>> +++ tree-if-conv.c (working copy) >>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo >>>>> >>>>> for (i = 0; refs->iterate (i, &dr); i++) >>>>> { >>>>> + tree *refp = &DR_REF (dr); >>>>> + while ((TREE_CODE (*refp) == COMPONENT_REF >>>>> + && TREE_OPERAND (*refp, 2) == NULL_TREE) >>>>> + || TREE_CODE (*refp) == IMAGPART_EXPR >>>>> + || TREE_CODE (*refp) == REALPART_EXPR) >>>>> + refp = &TREE_OPERAND (*refp, 0); >>>>> + if (refp != &DR_REF (dr)) >>>>> + { >>>>> + tree saved_base = *refp; >>>>> + *refp = integer_zero_node; >>>>> + >>>>> + if (DR_INIT (dr)) >>>>> + { >>>>> + tree poffset; >>>>> + int punsignedp, preversep, pvolatilep; >>>>> + machine_mode pmode; >>>>> + HOST_WIDE_INT pbitsize, pbitpos; >>>>> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, >>>>> &poffset, >>>>> + &pmode, &punsignedp, &preversep, >>>>> &pvolatilep, >>>>> + false); >>>>> + gcc_assert (poffset == NULL_TREE); >>>>> + >>>>> + DR_INIT (dr) >>>>> + = wide_int_to_tree (ssizetype, >>>>> + wi::sub (DR_INIT (dr), >>>>> + pbitpos / BITS_PER_UNIT)); >>>>> + } >>>>> + >>>>> + *refp = saved_base; >>>>> + DR_REF (dr) = *refp; >>>>> + } >>>> Looks to me the code is trying to resolve difference between two (or >>>> more) component references, which is DR_INIT in the code. But DR_INIT >>>> is not the only thing needs to be handled. For a structure containing >>>> two sub-arrays, DR_OFFSET may be different too. >>> >>> Yes, but we can't say that if >>> >>> a->a[i] >>> >>> doesn't trap that then >>> >>> a->b[i] >>> >>> doesn't trap either. We can only "strip" outermost >>> non-variable-offset components. >>> >>> But maybe I'm missing what example you are thinking of. >> Hmm, this was the case I meant. What I don't understand is current >> code logic does infer trap information for a.b[i] from a.a[i]. Given >> below example: >> struct str >> { >> int a[10]; >> int b[20]; >> char c; >> }; >> >> void bar (struct str *); >> int foo (int x, int n) >> { >> int i; >> struct str s; >> bar (&s); >> for (i = 0; i < n; i++) >> { >> s.a[i] = s.b[i]; >> if (x > i) >> s.b[i] = 0; >> } >> bar (&s); >> return 0; >> } >> The loop is convertible because of below code in function >> ifcvt_memrefs_wont_trap: >> >> /* If a is unconditionally accessed then ... */ >> if (DR_RW_UNCONDITIONALLY (*master_dr)) >> { >> /* an unconditional read won't trap. */ >> if (DR_IS_READ (a)) >> return true; >> >> /* an unconditionaly write won't trap if the base is written >> to unconditionally. */ >> if (base_master_dr >> && DR_BASE_W_UNCONDITIONALLY (*base_master_dr)) >> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); >> else >> { >> /* or the base is know to be not readonly. */ >> tree base_tree = get_base_address (DR_REF (a)); >> if (DECL_P (base_tree) >> && decl_binds_to_current_def_p (base_tree) >> && ! TREE_READONLY (base_tree)) >> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); >> } >> } >> It is the main object '&s' that is recorded in base_master_dr, so >> s.b[i] is considered not trap. Even it's not, I suppose >> get_base_address will give same result? > > Well, for this case it sees that s.b[i] is read from so it can't be an > out-of-bound > access. And s.a[i] is written to unconditionally so 's' cannot be a > readonly object. > With both pieces of information we can conclude that s.b[i] = 0 doesn't trap. Hi Richard, Attachment is the updated patch. I made below changes wrto your review comments: 1) Moved hash class for innermost loop behavior from tree-data-ref.h to tree-if-conv.c. I also removed check on innermost_loop_behavior.aligned_to which seems redundant to me because it's computed from offset and we have already checked equality for offset. 2) Replaced ref_DR_map with new map innermost_DR_map. 3) Post-processed DR in if_convertible_loop_p_1 for compound reference or references don't have innermost behavior. This cleans up following code using DR information. 4) Added a vectorization test for PR56625. I didn't incorporate your proposed code because I think it handles COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y. Looks to me it is another ifcvt opportunity different from PR69489. Anyway, fix is easy, I can send another patch in GCC7. Bootstrap and test on x86_64/AArch64, so any comments on this version? Thanks, bin 2016-03-30 Bin Cheng <bin.cheng@arm.com> PR tree-optimization/56625 PR tree-optimization/69489 * tree-data-ref.h (DR_INNERMOST): New macro. * tree-if-conv.c (innermost_loop_behavior_hash): New class for hashing struct innermost_loop_behavior. (ref_DR_map): Remove. (innermost_DR_map): New map. (baseref_DR_map): Revise comment. (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR to innermost_DR_map accroding to its innermost loop behavior. (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map according to its innermost loop behavior. (if_convertible_loop_p_1): Remove intialization for ref_DR_map. Add initialization for innermost_DR_map. Record memory reference in DR_BASE_ADDRESS if the reference is compound one or it doesn't have innermost loop behavior. (if_convertible_loop_p): Remove release for ref_DR_map. Release innermost_DR_map. gcc/testsuite/ChangeLog 2016-03-30 Bin Cheng <bin.cheng@arm.com> PR tree-optimization/56625 PR tree-optimization/69489 * gcc.dg/vect/pr56625.c: New test. * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test. [-- Attachment #2: pr69489-part1-20160330.txt --] [-- Type: text/plain, Size: 7637 bytes --] diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h index eb24d16..856dd58 100644 --- a/gcc/tree-data-ref.h +++ b/gcc/tree-data-ref.h @@ -144,6 +144,7 @@ struct data_reference #define DR_STEP(DR) (DR)->innermost.step #define DR_PTR_INFO(DR) (DR)->alias.ptr_info #define DR_ALIGNED_TO(DR) (DR)->innermost.aligned_to +#define DR_INNERMOST(DR) (DR)->innermost typedef struct data_reference *data_reference_p; diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c index 9e305c7..884006c 100644 --- a/gcc/tree-if-conv.c +++ b/gcc/tree-if-conv.c @@ -114,16 +114,68 @@ along with GCC; see the file COPYING3. If not see #include "builtins.h" #include "params.h" +/* Hash for struct innermost_loop_behavior. It depends on the user to + free the memory. */ + +struct innermost_loop_behavior_hash : nofree_ptr_hash <innermost_loop_behavior> +{ + static inline hashval_t hash (const value_type &); + static inline bool equal (const value_type &, + const compare_type &); +}; + +inline hashval_t +innermost_loop_behavior_hash::hash (const value_type &e) +{ + hashval_t hash; + + hash = iterative_hash_expr (e->base_address, 0); + hash = iterative_hash_expr (e->offset, hash); + hash = iterative_hash_expr (e->init, hash); + return iterative_hash_expr (e->step, hash); +} + +inline bool +innermost_loop_behavior_hash::equal (const value_type &e1, + const compare_type &e2) +{ + if ((e1->base_address && !e2->base_address) + || (!e1->base_address && e2->base_address) + || (!e1->offset && e2->offset) + || (e1->offset && !e2->offset) + || (!e1->init && e2->init) + || (e1->init && !e2->init) + || (!e1->step && e2->step) + || (e1->step && !e2->step)) + return false; + + if (e1->base_address && e2->base_address + && !operand_equal_p (e1->base_address, e2->base_address, 0)) + return false; + if (e1->offset && e2->offset + && !operand_equal_p (e1->offset, e2->offset, 0)) + return false; + if (e1->init && e2->init + && !operand_equal_p (e1->init, e2->init, 0)) + return false; + if (e1->step && e2->step + && !operand_equal_p (e1->step, e2->step, 0)) + return false; + + return true; +} + /* List of basic blocks in if-conversion-suitable order. */ static basic_block *ifc_bbs; /* Apply more aggressive (extended) if-conversion if true. */ static bool aggressive_if_conv; -/* Hash table to store references, DR pairs. */ -static hash_map<tree_operand_hash, data_reference_p> *ref_DR_map; +/* Hash table to store <DR's innermost loop behavior, DR> pairs. */ +static hash_map<innermost_loop_behavior_hash, + data_reference_p> *innermost_DR_map; -/* Hash table to store base reference, DR pairs. */ +/* Hash table to store <base reference, DR> pairs. */ static hash_map<tree_operand_hash, data_reference_p> *baseref_DR_map; /* Structure used to predicate basic blocks. This is attached to the @@ -613,17 +665,12 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a) { data_reference_p *master_dr, *base_master_dr; - tree ref = DR_REF (a); tree base_ref = DR_BASE_OBJECT (a); + innermost_loop_behavior *innermost = &DR_INNERMOST (a); tree ca = bb_predicate (gimple_bb (DR_STMT (a))); bool exist1, exist2; - while (TREE_CODE (ref) == COMPONENT_REF - || TREE_CODE (ref) == IMAGPART_EXPR - || TREE_CODE (ref) == REALPART_EXPR) - ref = TREE_OPERAND (ref, 0); - - master_dr = &ref_DR_map->get_or_insert (ref, &exist1); + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); if (!exist1) *master_dr = a; @@ -685,21 +732,18 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs) data_reference_p *master_dr, *base_master_dr; data_reference_p a = drs[gimple_uid (stmt) - 1]; - tree ref_base_a = DR_REF (a); tree base = DR_BASE_OBJECT (a); + innermost_loop_behavior *innermost = &DR_INNERMOST (a); gcc_assert (DR_STMT (a) == stmt); + gcc_assert (DR_BASE_ADDRESS (a) || DR_OFFSET (a) + || DR_INIT (a) || DR_STEP (a)); - while (TREE_CODE (ref_base_a) == COMPONENT_REF - || TREE_CODE (ref_base_a) == IMAGPART_EXPR - || TREE_CODE (ref_base_a) == REALPART_EXPR) - ref_base_a = TREE_OPERAND (ref_base_a, 0); + master_dr = innermost_DR_map->get (innermost); + gcc_assert (master_dr != NULL); - master_dr = ref_DR_map->get (ref_base_a); base_master_dr = baseref_DR_map->get (base); - gcc_assert (master_dr != NULL); - /* If a is unconditionally written to it doesn't trap. */ if (DR_W_UNCONDITIONALLY (*master_dr)) return true; @@ -1228,13 +1272,16 @@ if_convertible_loop_p_1 (struct loop *loop, data_reference_p dr; - ref_DR_map = new hash_map<tree_operand_hash, data_reference_p>; + innermost_DR_map + = new hash_map<innermost_loop_behavior_hash, data_reference_p>; baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>; predicate_bbs (loop); for (i = 0; refs->iterate (i, &dr); i++) { + tree ref = DR_REF (dr); + dr->aux = XNEW (struct ifc_dr); DR_BASE_W_UNCONDITIONALLY (dr) = false; DR_RW_UNCONDITIONALLY (dr) = false; @@ -1244,6 +1291,27 @@ if_convertible_loop_p_1 (struct loop *loop, IFC_DR (dr)->base_w_predicate = boolean_false_node; if (gimple_uid (DR_STMT (dr)) == 0) gimple_set_uid (DR_STMT (dr), i + 1); + + /* If DR doesn't have innermost loop behavior or it's a compound + memory reference, we synthesize its innermost loop behavior + for hashing. */ + if (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR + || !(DR_BASE_ADDRESS (dr) || DR_OFFSET (dr) + || DR_INIT (dr) || DR_STEP (dr))) + { + while (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR) + ref = TREE_OPERAND (ref, 0); + + DR_BASE_ADDRESS (dr) = ref; + DR_OFFSET (dr) = NULL; + DR_INIT (dr) = NULL; + DR_STEP (dr) = NULL; + DR_ALIGNED_TO (dr) = NULL; + } hash_memrefs_baserefs_and_store_DRs_read_written_info (dr); } @@ -1338,8 +1406,8 @@ if_convertible_loop_p (struct loop *loop, bool *any_mask_load_store) free_data_refs (refs); - delete ref_DR_map; - ref_DR_map = NULL; + delete innermost_DR_map; + innermost_DR_map = NULL; delete baseref_DR_map; baseref_DR_map = NULL; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c new file mode 100644 index 0000000..02a6731 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* } } */ + +void foo (int a[], int b[]) +{ + int i; + for (i = 0; i < 100; i++) + { + if (a[i] == 0) + a[i] = b[i]*4; + else + a[i] = b[i]*3; + } +} + +/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/pr56625.c b/gcc/testsuite/gcc.dg/vect/pr56625.c new file mode 100644 index 0000000..b903be3 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/pr56625.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target vect_int } */ + +void foo (int a[], int b[]) +{ + int i; + for (i = 0; i < 100; i++) + { + if (a[i] == 0) + a[i] = b[i]*4; + else + a[i] = b[i]*3; + } +} + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible 2016-03-31 17:08 ` Bin.Cheng @ 2016-04-04 13:07 ` Richard Biener 2016-04-04 14:14 ` Bin.Cheng 0 siblings, 1 reply; 10+ messages in thread From: Richard Biener @ 2016-04-04 13:07 UTC (permalink / raw) To: Bin.Cheng; +Cc: GCC Patches On Thu, Mar 31, 2016 at 6:43 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: > On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener > <richard.guenther@gmail.com> wrote: >> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>> Sorry, Should have replied to gcc-patches list. >>> >>> Thanks, >>> bin >>> >>> ---------- Forwarded message ---------- >>> From: "Bin.Cheng" <amker.cheng@gmail.com> >>> Date: Tue, 29 Mar 2016 03:55:04 +0800 >>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking >>> DR against its innermost loop bahavior if possible >>> To: Richard Biener <richard.guenther@gmail.com> >>> >>> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote: >>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener >>>>> <richard.guenther@gmail.com> wrote: >>>>>> >>>>>> Hmm. >>>>> Hi, >>>>> Thanks for reviewing. >>>>>> >>>>>> + equal_p = true; >>>>>> + if (e1->base_address && e2->base_address) >>>>>> + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); >>>>>> + if (e1->offset && e2->offset) >>>>>> + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); >>>>>> >>>>>> surely better to return false early. >>>>>> >>>>>> I think we don't want this in tree-data-refs.h also because of ... >>>>>> >>>>>> @@ -615,15 +619,29 @@ >>>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info >>>>>> (data_reference_p a) >>>>>> data_reference_p *master_dr, *base_master_dr;and REALPART) before >>>>>> creating the DR (or adjust the equality function >>>>> and hashing >>>>>> tree ref = DR_REF (a); >>>>>> tree base_ref = DR_BASE_OBJECT (a); >>>>>> + innermost_loop_behavior *innermost = &DR_INNERMOST (a); >>>>>> tree ca = bb_predicate (gimple_bb (DR_STMT (a))); >>>>>> bool exist1, exist2; >>>>>> >>>>>> - while (TREE_CODE (ref) == COMPONENT_REF >>>>>> - || TREE_CODE (ref) == IMAGPART_EXPR >>>>>> - || TREE_CODE (ref) == REALPART_EXPR) >>>>>> - ref = TREE_OPERAND (ref, 0); >>>>>> + /* If reference in DR has innermost loop behavior and it is not >>>>>> + a compound memory reference, we store it to innermost_DR_map, >>>>>> + otherwise to ref_DR_map. */ >>>>>> + if (TREE_CODE (ref) == COMPONENT_REF >>>>>> + || TREE_CODE (ref) == IMAGPART_EXPR >>>>>> + || TREE_CODE (ref) == REALPART_EXPR >>>>>> + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) >>>>>> + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) >>>>>> + { >>>>>> + while (TREE_CODE (ref) == COMPONENT_REF >>>>>> + || TREE_CODE (ref) == IMAGPART_EXPR >>>>>> + || TREE_CODE (ref) == REALPART_EXPR) >>>>>> + ref = TREE_OPERAND (ref, 0); >>>>>> + >>>>>> + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); >>>>>> + } >>>>>> + else >>>>>> + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); >>>>>> >>>>>> we don't want an extra hashmap but replace ref_DR_map entirely. So we'd >>>>>> need to >>>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART >>>>>> and REALPART) before creating the DR (or adjust the equality function >>>>>> and hashing >>>>>> to disregard them which means subtracting their offset from DR_INIT. >>>>> I am not sure if I understand correctly. But for component reference, >>>>> it is the base object that we want to record/track. For example, >>>>> >>>>> for (i = 0; i < N; i++) { >>>>> m = *data++; >>>>> >>>>> m1 = p1->x - m; >>>>> m2 = p2->x + m; >>>>> >>>>> p3->y = (m1 >= m2) ? p1->y : p2->y; >>>>> >>>>> p1++; >>>>> p2++; >>>>> p3++; >>>>> } >>>>> We want to infer that reads of p1/p2 in condition statement won't trap >>>>> because there are unconditional reads of the structures, though the >>>>> unconditional reads are actual of other sub-objects. Here it is the >>>>> invariant part of address that we want to track. >>>> >>>> Well, the variant parts - we want to strip invariant parts as far as we can >>>> (offsetof (x) and offsetof (y)) >>>> >>>>> Also illustrated by this example, we can't rely on data-ref analyzer >>>>> here. Because in gathering/scattering cases, the address could be not >>>>> affine at all. >>>> >>>> Sure, but that's a different issue. >>>> >>>>>> >>>>>> To adjust the references we collect you'd maybe could use a callback >>>>>> to get_references_in_stmt >>>>>> to adjust them. >>>>>> >>>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple >>>>>> as >>>>> Is this a part of the method you suggested above, or is it an >>>>> alternative one? If it's the latter, then I have below questions >>>>> embedded. >>>> >>>> It is an alternative to adding a hook to get_references_in_stmt and >>>> probably "easier". >>>> >>>>>> >>>>>> Index: tree-if-conv.c >>>>>> =================================================================== >>>>>> --- tree-if-conv.c (revision 234215) >>>>>> +++ tree-if-conv.c (working copy) >>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo >>>>>> >>>>>> for (i = 0; refs->iterate (i, &dr); i++) >>>>>> { >>>>>> + tree *refp = &DR_REF (dr); >>>>>> + while ((TREE_CODE (*refp) == COMPONENT_REF >>>>>> + && TREE_OPERAND (*refp, 2) == NULL_TREE) >>>>>> + || TREE_CODE (*refp) == IMAGPART_EXPR >>>>>> + || TREE_CODE (*refp) == REALPART_EXPR) >>>>>> + refp = &TREE_OPERAND (*refp, 0); >>>>>> + if (refp != &DR_REF (dr)) >>>>>> + { >>>>>> + tree saved_base = *refp; >>>>>> + *refp = integer_zero_node; >>>>>> + >>>>>> + if (DR_INIT (dr)) >>>>>> + { >>>>>> + tree poffset; >>>>>> + int punsignedp, preversep, pvolatilep; >>>>>> + machine_mode pmode; >>>>>> + HOST_WIDE_INT pbitsize, pbitpos; >>>>>> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, >>>>>> &poffset, >>>>>> + &pmode, &punsignedp, &preversep, >>>>>> &pvolatilep, >>>>>> + false); >>>>>> + gcc_assert (poffset == NULL_TREE); >>>>>> + >>>>>> + DR_INIT (dr) >>>>>> + = wide_int_to_tree (ssizetype, >>>>>> + wi::sub (DR_INIT (dr), >>>>>> + pbitpos / BITS_PER_UNIT)); >>>>>> + } >>>>>> + >>>>>> + *refp = saved_base; >>>>>> + DR_REF (dr) = *refp; >>>>>> + } >>>>> Looks to me the code is trying to resolve difference between two (or >>>>> more) component references, which is DR_INIT in the code. But DR_INIT >>>>> is not the only thing needs to be handled. For a structure containing >>>>> two sub-arrays, DR_OFFSET may be different too. >>>> >>>> Yes, but we can't say that if >>>> >>>> a->a[i] >>>> >>>> doesn't trap that then >>>> >>>> a->b[i] >>>> >>>> doesn't trap either. We can only "strip" outermost >>>> non-variable-offset components. >>>> >>>> But maybe I'm missing what example you are thinking of. >>> Hmm, this was the case I meant. What I don't understand is current >>> code logic does infer trap information for a.b[i] from a.a[i]. Given >>> below example: >>> struct str >>> { >>> int a[10]; >>> int b[20]; >>> char c; >>> }; >>> >>> void bar (struct str *); >>> int foo (int x, int n) >>> { >>> int i; >>> struct str s; >>> bar (&s); >>> for (i = 0; i < n; i++) >>> { >>> s.a[i] = s.b[i]; >>> if (x > i) >>> s.b[i] = 0; >>> } >>> bar (&s); >>> return 0; >>> } >>> The loop is convertible because of below code in function >>> ifcvt_memrefs_wont_trap: >>> >>> /* If a is unconditionally accessed then ... */ >>> if (DR_RW_UNCONDITIONALLY (*master_dr)) >>> { >>> /* an unconditional read won't trap. */ >>> if (DR_IS_READ (a)) >>> return true; >>> >>> /* an unconditionaly write won't trap if the base is written >>> to unconditionally. */ >>> if (base_master_dr >>> && DR_BASE_W_UNCONDITIONALLY (*base_master_dr)) >>> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); >>> else >>> { >>> /* or the base is know to be not readonly. */ >>> tree base_tree = get_base_address (DR_REF (a)); >>> if (DECL_P (base_tree) >>> && decl_binds_to_current_def_p (base_tree) >>> && ! TREE_READONLY (base_tree)) >>> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); >>> } >>> } >>> It is the main object '&s' that is recorded in base_master_dr, so >>> s.b[i] is considered not trap. Even it's not, I suppose >>> get_base_address will give same result? >> >> Well, for this case it sees that s.b[i] is read from so it can't be an >> out-of-bound >> access. And s.a[i] is written to unconditionally so 's' cannot be a >> readonly object. >> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap. > > Hi Richard, > Attachment is the updated patch. I made below changes wrto your > review comments: > 1) Moved hash class for innermost loop behavior from tree-data-ref.h > to tree-if-conv.c. > I also removed check on innermost_loop_behavior.aligned_to which > seems redundant to me because it's computed from offset and we have > already checked equality for offset. > 2) Replaced ref_DR_map with new map innermost_DR_map. > 3) Post-processed DR in if_convertible_loop_p_1 for compound reference > or references don't have innermost behavior. This cleans up following > code using DR information. > 4) Added a vectorization test for PR56625. > > I didn't incorporate your proposed code because I think it handles > COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y. But the previous code already handled it, so not handling it would be a regression. Note that I think your patch will handle it as well given you hash innermost behavior. > Looks to me it is another ifcvt opportunity different from PR69489. > Anyway, fix is easy, I can send another patch in GCC7. > > Bootstrap and test on x86_64/AArch64, so any comments on this version? Looks good to me. Richard. > Thanks, > bin > > 2016-03-30 Bin Cheng <bin.cheng@arm.com> > > PR tree-optimization/56625 > PR tree-optimization/69489 > * tree-data-ref.h (DR_INNERMOST): New macro. > * tree-if-conv.c (innermost_loop_behavior_hash): New class for > hashing struct innermost_loop_behavior. > (ref_DR_map): Remove. > (innermost_DR_map): New map. > (baseref_DR_map): Revise comment. > (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR > to innermost_DR_map accroding to its innermost loop behavior. > (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map according > to its innermost loop behavior. > (if_convertible_loop_p_1): Remove intialization for ref_DR_map. > Add initialization for innermost_DR_map. Record memory reference > in DR_BASE_ADDRESS if the reference is compound one or it doesn't > have innermost loop behavior. > (if_convertible_loop_p): Remove release for ref_DR_map. Release > innermost_DR_map. > > gcc/testsuite/ChangeLog > 2016-03-30 Bin Cheng <bin.cheng@arm.com> > > PR tree-optimization/56625 > PR tree-optimization/69489 > * gcc.dg/vect/pr56625.c: New test. > * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible 2016-04-04 13:07 ` Richard Biener @ 2016-04-04 14:14 ` Bin.Cheng 2016-04-05 7:23 ` Richard Biener 0 siblings, 1 reply; 10+ messages in thread From: Bin.Cheng @ 2016-04-04 14:14 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches On Mon, Apr 4, 2016 at 2:07 PM, Richard Biener <richard.guenther@gmail.com> wrote: > On Thu, Mar 31, 2016 at 6:43 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >> On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener >> <richard.guenther@gmail.com> wrote: >>> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>>> Sorry, Should have replied to gcc-patches list. >>>> >>>> Thanks, >>>> bin >>>> >>>> ---------- Forwarded message ---------- >>>> From: "Bin.Cheng" <amker.cheng@gmail.com> >>>> Date: Tue, 29 Mar 2016 03:55:04 +0800 >>>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking >>>> DR against its innermost loop bahavior if possible >>>> To: Richard Biener <richard.guenther@gmail.com> >>>> >>>> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote: >>>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener >>>>>> <richard.guenther@gmail.com> wrote: >>>>> >>>>> It is an alternative to adding a hook to get_references_in_stmt and >>>>> probably "easier". >>>>> >>>>>>> >>>>>>> Index: tree-if-conv.c >>>>>>> =================================================================== >>>>>>> --- tree-if-conv.c (revision 234215) >>>>>>> +++ tree-if-conv.c (working copy) >>>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo >>>>>>> >>>>>>> for (i = 0; refs->iterate (i, &dr); i++) >>>>>>> { >>>>>>> + tree *refp = &DR_REF (dr); >>>>>>> + while ((TREE_CODE (*refp) == COMPONENT_REF >>>>>>> + && TREE_OPERAND (*refp, 2) == NULL_TREE) >>>>>>> + || TREE_CODE (*refp) == IMAGPART_EXPR >>>>>>> + || TREE_CODE (*refp) == REALPART_EXPR) >>>>>>> + refp = &TREE_OPERAND (*refp, 0); >>>>>>> + if (refp != &DR_REF (dr)) >>>>>>> + { >>>>>>> + tree saved_base = *refp; >>>>>>> + *refp = integer_zero_node; >>>>>>> + >>>>>>> + if (DR_INIT (dr)) >>>>>>> + { >>>>>>> + tree poffset; >>>>>>> + int punsignedp, preversep, pvolatilep; >>>>>>> + machine_mode pmode; >>>>>>> + HOST_WIDE_INT pbitsize, pbitpos; >>>>>>> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, >>>>>>> &poffset, >>>>>>> + &pmode, &punsignedp, &preversep, >>>>>>> &pvolatilep, >>>>>>> + false); >>>>>>> + gcc_assert (poffset == NULL_TREE); >>>>>>> + >>>>>>> + DR_INIT (dr) >>>>>>> + = wide_int_to_tree (ssizetype, >>>>>>> + wi::sub (DR_INIT (dr), >>>>>>> + pbitpos / BITS_PER_UNIT)); >>>>>>> + } >>>>>>> + >>>>>>> + *refp = saved_base; >>>>>>> + DR_REF (dr) = *refp; >>>>>>> + } >>>>>> Looks to me the code is trying to resolve difference between two (or >>>>>> more) component references, which is DR_INIT in the code. But DR_INIT >>>>>> is not the only thing needs to be handled. For a structure containing >>>>>> two sub-arrays, DR_OFFSET may be different too. >>>>> >>>>> Yes, but we can't say that if >>>>> >>>>> a->a[i] >>>>> >>>>> doesn't trap that then >>>>> >>>>> a->b[i] >>>>> >>>>> doesn't trap either. We can only "strip" outermost >>>>> non-variable-offset components. >>>>> >>>>> But maybe I'm missing what example you are thinking of. >>>> Hmm, this was the case I meant. What I don't understand is current >>>> code logic does infer trap information for a.b[i] from a.a[i]. Given >>>> below example: >>>> struct str >>>> { >>>> int a[10]; >>>> int b[20]; >>>> char c; >>>> }; >>>> >>>> void bar (struct str *); >>>> int foo (int x, int n) >>>> { >>>> int i; >>>> struct str s; >>>> bar (&s); >>>> for (i = 0; i < n; i++) >>>> { >>>> s.a[i] = s.b[i]; >>>> if (x > i) >>>> s.b[i] = 0; >>>> } >>>> bar (&s); >>>> return 0; >>>> } >>>> The loop is convertible because of below code in function >>>> ifcvt_memrefs_wont_trap: >>>> >>>> /* If a is unconditionally accessed then ... */ >>>> if (DR_RW_UNCONDITIONALLY (*master_dr)) >>>> { >>>> /* an unconditional read won't trap. */ >>>> if (DR_IS_READ (a)) >>>> return true; >>>> >>>> /* an unconditionaly write won't trap if the base is written >>>> to unconditionally. */ >>>> if (base_master_dr >>>> && DR_BASE_W_UNCONDITIONALLY (*base_master_dr)) >>>> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); >>>> else >>>> { >>>> /* or the base is know to be not readonly. */ >>>> tree base_tree = get_base_address (DR_REF (a)); >>>> if (DECL_P (base_tree) >>>> && decl_binds_to_current_def_p (base_tree) >>>> && ! TREE_READONLY (base_tree)) >>>> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); >>>> } >>>> } >>>> It is the main object '&s' that is recorded in base_master_dr, so >>>> s.b[i] is considered not trap. Even it's not, I suppose >>>> get_base_address will give same result? >>> >>> Well, for this case it sees that s.b[i] is read from so it can't be an >>> out-of-bound >>> access. And s.a[i] is written to unconditionally so 's' cannot be a >>> readonly object. >>> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap. >> >> Hi Richard, >> Attachment is the updated patch. I made below changes wrto your >> review comments: >> 1) Moved hash class for innermost loop behavior from tree-data-ref.h >> to tree-if-conv.c. >> I also removed check on innermost_loop_behavior.aligned_to which >> seems redundant to me because it's computed from offset and we have >> already checked equality for offset. >> 2) Replaced ref_DR_map with new map innermost_DR_map. >> 3) Post-processed DR in if_convertible_loop_p_1 for compound reference >> or references don't have innermost behavior. This cleans up following >> code using DR information. >> 4) Added a vectorization test for PR56625. >> >> I didn't incorporate your proposed code because I think it handles >> COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y. > > But the previous code already handled it, so not handling it would be > a regression. > Note that I think your patch will handle it as well given you hash innermost > behavior. If I understand correctly, your code is more precise handling component reference by stripping const offset from innermost loop behavior. Currently tree if-conv just strips component ref for structure field away. Yes my patch can handle the case, but that's done by falling back to the reference itself (the existing code logic), rather than tunning innermost loop behavior as you suggested: + while (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR) + ref = TREE_OPERAND (ref, 0); + + DR_BASE_ADDRESS (dr) = ref; + DR_OFFSET (dr) = NULL; + DR_INIT (dr) = NULL; + DR_STEP (dr) = NULL; + DR_ALIGNED_TO (dr) = NULL; I think innermost loop behavior is unnecessary here for component refs, so is there an example showing possible exception? I will re-base/apply the patch after entering Stage1. Thanks, bin > >> Looks to me it is another ifcvt opportunity different from PR69489. >> Anyway, fix is easy, I can send another patch in GCC7. >> >> Bootstrap and test on x86_64/AArch64, so any comments on this version? > > Looks good to me. > > Richard. > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible 2016-04-04 14:14 ` Bin.Cheng @ 2016-04-05 7:23 ` Richard Biener 0 siblings, 0 replies; 10+ messages in thread From: Richard Biener @ 2016-04-05 7:23 UTC (permalink / raw) To: Bin.Cheng; +Cc: GCC Patches On Mon, Apr 4, 2016 at 4:14 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: > On Mon, Apr 4, 2016 at 2:07 PM, Richard Biener > <richard.guenther@gmail.com> wrote: >> On Thu, Mar 31, 2016 at 6:43 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>> On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener >>> <richard.guenther@gmail.com> wrote: >>>> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>>>> Sorry, Should have replied to gcc-patches list. >>>>> >>>>> Thanks, >>>>> bin >>>>> >>>>> ---------- Forwarded message ---------- >>>>> From: "Bin.Cheng" <amker.cheng@gmail.com> >>>>> Date: Tue, 29 Mar 2016 03:55:04 +0800 >>>>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking >>>>> DR against its innermost loop bahavior if possible >>>>> To: Richard Biener <richard.guenther@gmail.com> >>>>> >>>>> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote: >>>>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: >>>>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener >>>>>>> <richard.guenther@gmail.com> wrote: >>>>>> >>>>>> It is an alternative to adding a hook to get_references_in_stmt and >>>>>> probably "easier". >>>>>> >>>>>>>> >>>>>>>> Index: tree-if-conv.c >>>>>>>> =================================================================== >>>>>>>> --- tree-if-conv.c (revision 234215) >>>>>>>> +++ tree-if-conv.c (working copy) >>>>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo >>>>>>>> >>>>>>>> for (i = 0; refs->iterate (i, &dr); i++) >>>>>>>> { >>>>>>>> + tree *refp = &DR_REF (dr); >>>>>>>> + while ((TREE_CODE (*refp) == COMPONENT_REF >>>>>>>> + && TREE_OPERAND (*refp, 2) == NULL_TREE) >>>>>>>> + || TREE_CODE (*refp) == IMAGPART_EXPR >>>>>>>> + || TREE_CODE (*refp) == REALPART_EXPR) >>>>>>>> + refp = &TREE_OPERAND (*refp, 0); >>>>>>>> + if (refp != &DR_REF (dr)) >>>>>>>> + { >>>>>>>> + tree saved_base = *refp; >>>>>>>> + *refp = integer_zero_node; >>>>>>>> + >>>>>>>> + if (DR_INIT (dr)) >>>>>>>> + { >>>>>>>> + tree poffset; >>>>>>>> + int punsignedp, preversep, pvolatilep; >>>>>>>> + machine_mode pmode; >>>>>>>> + HOST_WIDE_INT pbitsize, pbitpos; >>>>>>>> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, >>>>>>>> &poffset, >>>>>>>> + &pmode, &punsignedp, &preversep, >>>>>>>> &pvolatilep, >>>>>>>> + false); >>>>>>>> + gcc_assert (poffset == NULL_TREE); >>>>>>>> + >>>>>>>> + DR_INIT (dr) >>>>>>>> + = wide_int_to_tree (ssizetype, >>>>>>>> + wi::sub (DR_INIT (dr), >>>>>>>> + pbitpos / BITS_PER_UNIT)); >>>>>>>> + } >>>>>>>> + >>>>>>>> + *refp = saved_base; >>>>>>>> + DR_REF (dr) = *refp; >>>>>>>> + } >>>>>>> Looks to me the code is trying to resolve difference between two (or >>>>>>> more) component references, which is DR_INIT in the code. But DR_INIT >>>>>>> is not the only thing needs to be handled. For a structure containing >>>>>>> two sub-arrays, DR_OFFSET may be different too. >>>>>> >>>>>> Yes, but we can't say that if >>>>>> >>>>>> a->a[i] >>>>>> >>>>>> doesn't trap that then >>>>>> >>>>>> a->b[i] >>>>>> >>>>>> doesn't trap either. We can only "strip" outermost >>>>>> non-variable-offset components. >>>>>> >>>>>> But maybe I'm missing what example you are thinking of. >>>>> Hmm, this was the case I meant. What I don't understand is current >>>>> code logic does infer trap information for a.b[i] from a.a[i]. Given >>>>> below example: >>>>> struct str >>>>> { >>>>> int a[10]; >>>>> int b[20]; >>>>> char c; >>>>> }; >>>>> >>>>> void bar (struct str *); >>>>> int foo (int x, int n) >>>>> { >>>>> int i; >>>>> struct str s; >>>>> bar (&s); >>>>> for (i = 0; i < n; i++) >>>>> { >>>>> s.a[i] = s.b[i]; >>>>> if (x > i) >>>>> s.b[i] = 0; >>>>> } >>>>> bar (&s); >>>>> return 0; >>>>> } >>>>> The loop is convertible because of below code in function >>>>> ifcvt_memrefs_wont_trap: >>>>> >>>>> /* If a is unconditionally accessed then ... */ >>>>> if (DR_RW_UNCONDITIONALLY (*master_dr)) >>>>> { >>>>> /* an unconditional read won't trap. */ >>>>> if (DR_IS_READ (a)) >>>>> return true; >>>>> >>>>> /* an unconditionaly write won't trap if the base is written >>>>> to unconditionally. */ >>>>> if (base_master_dr >>>>> && DR_BASE_W_UNCONDITIONALLY (*base_master_dr)) >>>>> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); >>>>> else >>>>> { >>>>> /* or the base is know to be not readonly. */ >>>>> tree base_tree = get_base_address (DR_REF (a)); >>>>> if (DECL_P (base_tree) >>>>> && decl_binds_to_current_def_p (base_tree) >>>>> && ! TREE_READONLY (base_tree)) >>>>> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES); >>>>> } >>>>> } >>>>> It is the main object '&s' that is recorded in base_master_dr, so >>>>> s.b[i] is considered not trap. Even it's not, I suppose >>>>> get_base_address will give same result? >>>> >>>> Well, for this case it sees that s.b[i] is read from so it can't be an >>>> out-of-bound >>>> access. And s.a[i] is written to unconditionally so 's' cannot be a >>>> readonly object. >>>> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap. >>> >>> Hi Richard, >>> Attachment is the updated patch. I made below changes wrto your >>> review comments: >>> 1) Moved hash class for innermost loop behavior from tree-data-ref.h >>> to tree-if-conv.c. >>> I also removed check on innermost_loop_behavior.aligned_to which >>> seems redundant to me because it's computed from offset and we have >>> already checked equality for offset. >>> 2) Replaced ref_DR_map with new map innermost_DR_map. >>> 3) Post-processed DR in if_convertible_loop_p_1 for compound reference >>> or references don't have innermost behavior. This cleans up following >>> code using DR information. >>> 4) Added a vectorization test for PR56625. >>> >>> I didn't incorporate your proposed code because I think it handles >>> COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y. >> >> But the previous code already handled it, so not handling it would be >> a regression. >> Note that I think your patch will handle it as well given you hash innermost >> behavior. > If I understand correctly, your code is more precise handling > component reference by stripping const offset from innermost loop > behavior. Currently tree if-conv just strips component ref for > structure field away. Yes my patch can handle the case, but that's > done by falling back to the reference itself (the existing code > logic), rather than tunning innermost loop behavior as you suggested: Ah, indeed. > + while (TREE_CODE (ref) == COMPONENT_REF > + || TREE_CODE (ref) == IMAGPART_EXPR > + || TREE_CODE (ref) == REALPART_EXPR) > + ref = TREE_OPERAND (ref, 0); > + > + DR_BASE_ADDRESS (dr) = ref; > + DR_OFFSET (dr) = NULL; > + DR_INIT (dr) = NULL; > + DR_STEP (dr) = NULL; > + DR_ALIGNED_TO (dr) = NULL; > > I think innermost loop behavior is unnecessary here for component > refs, so is there an example showing possible exception? Well, we don't need the outer component refs but DR analyze included them in its analysis (and thus DR_INIT will differ). For cases DR analyze failed on that doesn't matter of course. > I will re-base/apply the patch after entering Stage1. Thanks, Richard. > Thanks, > bin >> >>> Looks to me it is another ifcvt opportunity different from PR69489. >>> Anyway, fix is easy, I can send another patch in GCC7. >>> >>> Bootstrap and test on x86_64/AArch64, so any comments on this version? >> >> Looks good to me. >> >> Richard. >> ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2016-04-05 7:23 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-03-16 10:00 [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible Bin Cheng 2016-03-16 12:20 ` Richard Biener 2016-03-16 16:17 ` Bin.Cheng 2016-03-17 9:53 ` Richard Biener [not found] ` <CAHFci29CHpOBxah4QxrL_JOL6p_NC=r3-e3eoNjHp_4z749PnQ@mail.gmail.com> 2016-03-28 22:04 ` Fwd: " Bin.Cheng 2016-03-29 8:44 ` Richard Biener 2016-03-31 17:08 ` Bin.Cheng 2016-04-04 13:07 ` Richard Biener 2016-04-04 14:14 ` Bin.Cheng 2016-04-05 7:23 ` Richard Biener
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).