On 01/30/2017 10:03 AM, Richard Biener wrote: > On Fri, Jan 27, 2017 at 12:20 PM, Aldy Hernandez wrote: >> On 01/26/2017 07:29 AM, Richard Biener wrote: >>> >>> On Thu, Jan 26, 2017 at 1:04 PM, Aldy Hernandez wrote: >>>> >>>> On 01/24/2017 07:23 AM, Richard Biener wrote: >> >> >>> Your initial on-demand approach is fine to catch some of the cases but it >>> will not handle those for which we'd need post-dominance. >>> >>> I guess we can incrementally add that. >> >> >> No complaints from me. >> >> This is my initial on-demand approach, with a few fixes you've commented on >> throughout. >> >> As you can see, there is still an #if 0 wrt to using your suggested >> conservative handling of memory loads, which I'm not entirely convinced of, >> as it pessimizes gcc.dg/loop-unswitch-1.c. If you feel strongly about it, I >> can enable the code again. > > It is really required -- fortunately loop-unswitch-1.c is one of the cases where > the post-dom / always-executed bits help . The comparison is inside the > loop header and thus always executed when the loop enters, so inserrting it > on the preheader edge is fine. Left as is then. > >> Also, I enhanced gcc.dg/loop-unswitch-1.c to verify that we're actually >> unswitching something. It seems kinda silly that we have various unswitch >> tests, but we don't actually check whether we have unswitched anything. > > Heh. It probably was added for an ICE... > >> This test was the only one in the *unswitch*.c set that I saw was actually >> being unswitched. Of course, if you don't agree with my #if 0 above, I will >> remove this change to the test. >> >> Finally, are we even guaranteed to unswitch in loop-unswitch-1.c across >> architectures? If not, again, I can remove the one-liner. > > I think so. Left as well. > >> >> How does this look for trunk? > > With a unswitch-local solution I meant to not add new files but put the > defined_or_undefined class (well, or rather a single function...) into > tree-ssa-loop-unswitch.c. Done. > > @@ -138,7 +141,7 @@ tree_may_unswitch_on (basic_block bb, struct loop *loop) > { > /* Unswitching on undefined values would introduce undefined > behavior that the original program might never exercise. */ > - if (ssa_undefined_value_p (use, true)) > + if (defined_or_undefined->is_maybe_undefined (use)) > return NULL_TREE; > def = SSA_NAME_DEF_STMT (use); > def_bb = gimple_bb (def); > > as I said, moving this (now more expensive check) after > > if (def_bb > && flow_bb_inside_loop_p (loop, def_bb)) > return NULL_TREE; > > this cheap check would be better. It should avoid 99% of all calls I bet. Done. > > You can recover the loop-unswitch-1.c case by passing down > the using stmt and checking its BB against loop_header (the only > block that we trivially know is always executed when entering the region). > Or do that check in the caller, like > > if (bb != loop->header > && ssa_undefined_value_p (use, true) / > defined_or_undefined->is_maybe_undefined (use)) Done in callee. > > + gimple *def = SSA_NAME_DEF_STMT (t); > + > + /* Check that all the PHI args are fully defined. */ > + if (gphi *phi = dyn_cast (def)) > + { > + if (virtual_operand_p (PHI_RESULT (phi))) > + continue; > > You should never run into virtual operands (you only walk SSA_OP_USEs). Done. > > You can stop walking at stmts that dominate the region header, > like with > > + gimple *def = SSA_NAME_DEF_STMT (t); > /* Uses in stmts always executed when the region header > executes are fine. */ > if (dominated_by_p (CDI_DOMINATORS, loop_header, gimple_bb (def))) > continue; Hmmmm... doing this causes the PR testcase (gcc.dg/loop-unswitch-5.c in the attached patch to FAIL). I haven't looked at it, but I seem to recall in the testcase that we could have a DEF that dominated the loop but was a mess of PHI's, some of whose args were undefined. Did you perhaps want to put that dominated_by_p call after the PHI arg checks (which seems to work)?: /* Check that all the PHI args are fully defined. */ if (gphi *phi = dyn_cast (def)) ... ... ... + /* Uses in stmts always executed when the region header executes + are fine. */ + if (dominated_by_p (CDI_DOMINATORS, loop->header, gimple_bb (def))) + continue; + /* Handle calls and memory loads conservatively. */ if (!is_gimple_assign (def) || (gimple_assign_single_p (def) Until this is clear, I've left this dominated_by_p call #if 0'ed out. > > and the bail out for PARM_DECLs is wrong: > > + /* A PARM_DECL will not have an SSA_NAME_DEF_STMT. Parameters > + get their initial value from function entry. */ > + if (SSA_NAME_VAR (t) && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL) > + return false; > > needs to be a continue; rather than a return false. Done. Preliminary test show the attached patch works. Further tests on-going. Aldy