Hi Riachard, Thanks for the review. Here is an updated patch with comments below. > +/* Restore/Pop all the old VRs maintained in the cond_stack. */ > + > +void evrp_dom_walker::finalize_dom_walker () > +{ > + while (!cond_stack.is_empty ()) > + { > + tree var = cond_stack.last ().second; > + pop_value_range (var); > + cond_stack.pop (); > + } > > why is this necessary at all? Looks like a waste of time (and the > stack should be empty here > anyway). I think you leak cond_stack as well, I suppose using > auto_vec might work here, > you have to check. Done. > >>> >>> + /* Discover VR when condition is true. */ >>> + if (te == e >>> + && !TREE_OVERFLOW_P (op0) >>> + && !TREE_OVERFLOW_P (op1)) >> >> >> This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from >> set_value_range otherwise: >> >> gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min)) >> && (!TREE_OVERFLOW_P (max) || is_overflow_infinity >> (max))); > > Ok, instead make sure to remove the overflow flag via > > if (TREE_OVERFLOW_P (op0)) > op0 = drop_tree_overflow (op0); ... Done. > > it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and FALSE_VALUE > cases and only invert the tree comparison for the false edge. Done. > > + tree cond = build2 (code, boolean_type_node, op0, op1); > + extract_range_for_var_from_comparison_expr (&vr, op0, cond); > > no wasteful tree building here please. Instead of cond pass in code, > op0 and op1 > as separate arguments. Done. > > + /* If we found any usable VR, set the VR to ssa_name and create a > + PUSH old value in the cond_stack with the old VR. */ > + if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE) > + { > + value_range *new_vr = vrp_value_range_pool.allocate (); > + memset (new_vr, 0, sizeof (*new_vr)); > + *new_vr = vr; > + cond_stack.safe_push (std::make_pair (bb, op0)); > > the memset looks redundant given you copy-assing from vr anyway. Done. > > You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where > both i_1 and x_2 might have ranges that are useful. I will address this once the basic structure is in place if that is OK with you. > > push and pop_value_range should be methods in the dom walker class > and vrp_stack and cond_stack should be a single stack. You can omit > basic_block if you use a "magic" NULL_TREE var as marker (simply > push a NULL_TREE, NULL pair at the end of before_dom_children). > Done. >>> >>> you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE >>> >>> why do you need those TREE_OVERFLOW checks? >>> >>> + tree cond = build2 (code, boolean_type_node, op0, op1); >>> + tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond); >>> + extract_range_from_assert (&vr, a); >>> >>> so I was hoping that the "refactoring" patch in the series would expose a >>> more >>> useful interface than extract_range_from_assert ... namely one that can >>> extract a range from the comparison directly and does not require building >>> a scratch ASSERT_EXPR. >> >> >> I have split extract_range_from_assert into >> extract_range_for_var_from_comparison_expr and using it now. Is that better? > > See above. > >>> >>> >>> + /* If we found any usable VR, set the VR to ssa_name and create >>> a >>> + restore point in the cond_stack with the old VR. */ >>> + if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE) >>> + { >>> + value_range *new_vr = XCNEW (value_range); >>> + *new_vr = vr; >>> + cond_stack.safe_push (std::make_pair (bb, >>> + std::make_pair (op0, >>> + >>> old_vr))); >>> + change_value_range (op0, new_vr); >>> >>> I don't like 'change_value_range' as interface, please integrate that into >>> a push/pop_value_range style interface instead. >> >> >> Tried using push/pop interface. >>> >>> >>> + vrp_visit_stmt (stmt, &taken_edge_p, &output_p); >>> + } >>> + >>> + return NULL; >>> >>> you should return taken_edge_p (misnamed as it isn't a pointer) and take >>> advantage of EDGE_EXECUTABLE. Again see DOM/SCCVN (might want to >>> defer this as a followup improvement). >> >> >> I have added a TODO to this effect and will comeback to it. >>> >>> >>> Note that the advantage of a DOM-based VRP is that backtracking is easy >>> to implement (you don't do that yet). That is, after DEF got a (better) >>> value-range you can simply re-visit all the defs of its uses (and >>> recursively). >>> I think you have to be careful with stmts that might prematurely leave a >>> BB >>> though (like via EH). So sth for a followup as well. >> >> >> Is this OK now. Bootstrapped and regression tested on x86_64-linux with no >> new regressions. > > Better, still not perfect. > > I'm also arguing on the pass placement now - you put it where it may make sense > for IPA VRP (but then IPA VRP could simply do this in its analysis phase) but > not so much where it makes sense during the early pipeline. I think it makes > sense after FRE. > > Looking at how you finalize I see several issues with simply re-using > vrp_finalize. > > First of all the loop doing the set_range_info will turn up with > nothing as you've > popped off all value-ranges from the lattice. Same for substitute-and-fold (or > jump-threading). I am not sure understanding you. I am poping the value ranges for scope when we leave them. Other value ranges which lives in all the regions will remain. > > What you instead need to do with the DOM scheme is at the point you > call vrp_visit_stmt do sth like > > vrp_visit_stmt (....); > if (fold_stmt (&gsi, follow_single_use_edges)) > update_stmt (); I have added this. I think this will be good as we would be able to optimize with value ranges that is valid within the scope. > tree def = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF); > if (def > && INTEGRAL_TYPE_P (TREE_TYPE (def)) > && (TREE_CODE (vr_value[i]->min) == INTEGER_CST) > && (TREE_CODE (vr_value[i]->max) == INTEGER_CST) > && (vr_value[i]->type == VR_RANGE > || vr_value[i]->type == VR_ANTI_RANGE)) > set_range_info (name, vr_value[i]->type, vr_value[i]->min, > vr_value[i]->max); > I am not sure if this is needed. I dont know if my push/pop value range implementation is not what you wanted. If you still prefer, I am happy to add this. > thus please split vrp_finalize into two parts, one of it with the memory > freeing which is the one you'd call. > > Note that EVRP is not enabled by default at any optimization level > it seems so bootstrap / test of it would be useless (did you only > test with the full series? I never looked at the IPA VRP part) > I have: +ftree-evrp +Common Report Var(flag_tree_early_vrp) Init(1) Optimization +Perform Early Value Range Propagation on trees. I would like to run this only for -O2 and above but for now I am using this to test. I have tested the last set of patch separately. I will do more testing on this patch based on your feedback. Does this look better? Thanks, Kugan