On Mon, Jul 26, 2021 at 7:28 PM Richard Sandiford wrote: > > Aldy Hernandez writes: > > On Mon, Jul 26, 2021 at 4:18 PM Richard Sandiford > > wrote: > >> > >> Aldy Hernandez writes: > >> > This patch replaces the evrp_range_analyzer in the loop versioning code > >> > with an on-demand ranger. > >> > > >> > Everything was pretty straightforward, except that range_of_expr requires > >> > a gimple statement as context to provide context aware ranges. I didn't see > >> > a convient place where the statement was saved, so I made a vector indexed > >> > by SSA names. As an alternative, I tried to use the loop's first statement, > >> > but that proved to be insufficient. > >> > >> The mapping is one-to-many though: there can be multiple statements > >> for each SSA name. Maybe that doesn't matter in this context and > >> any of the statements can act as a representative. > >> > >> I'm surprised that the loop's first statement didn't work though, > >> since the SSA name is supposedly known to be loop-invariant. What went > >> wrong when you tried that? > > > > I was looking at the first statement of loop_info->block_list and one > > of the dg.exp=loop-versioning* tests failed. Perhaps I should have > > used the loop itself, as in the attached patch. With this patch all > > of the loop-versioning tests pass. > > > >> > >> > I am not familiar with loop versioning, but if the DOM walk was only > >> > necessary for the calls to record_ranges_from_stmt, this too could be > >> > removed as the ranger will work without it. > >> > >> Yeah, that was the only reason. If the information is available at > >> version_for_unity (I guess it is) then we should just avoid recording > >> the versioning there if so. > >> > >> How expensive is the check? If the result is worth caching, perhaps > >> we should have two bitmaps: the existing one, and one that records > >> whether we've checked a particular SSA name. > >> > >> If the check is relatively cheap then that won't be worth it though. > > > > If you're asking about the range_of_expr check, that's all cached, so > > it should be pretty cheap. Besides, we're no longer calculating > > ranges for each statement in the IL, as we were doing in lv_dom_walker > > with evrp's record_ranges_from_stmt. Only statements of interest are > > queried. > > Sounds good. If the results are already cached then another level > of caching (via the second bitmap I mentioned above) would obviously > be a waste of time. My callgrind harness for performance testing wasn't able to pick up enough samples to measure the time spent in pass_loop_versioning::execute. I've seen this happen before with passes that run too fast. I'm afraid I don't have enough cycles to continue working on this. > > > How about this patch, pending tests? > > OK, thanks, as a strict improvement over the status quo. But it'd be > even better without the dom walk :-) I've removed the DOM walk, and re-tested. OK to push? Aldy