On Fri, Jun 03, 2016 at 12:39:42PM +0200, Richard Biener wrote: > On Thu, Jun 2, 2016 at 6:53 PM, James Greenhalgh > wrote: > > > > Hi, > > > > This patch introduces a new target hook, to be used like BRANCH_COST but > > with a guaranteed unit of measurement. We want this to break away from > > the current ambiguous uses of BRANCH_COST. > > > > BRANCH_COST is used in ifcvt.c in two types of comparisons. One against > > instruction counts - where it is used as the limit on the number of new > > instructions we are permitted to generate. The other (after multiplying > > by COSTS_N_INSNS (1)) directly against RTX costs. > > > > Of these, a comparison against RTX costs is the more easily understood > > metric across the compiler, and the one I've pulled out to the new hook. > > To keep things consistent for targets which don't migrate, this new hook > > has a default value of BRANCH_COST * COSTS_N_INSNS (1). > > > > OK? > > How does the caller compute "predictable"? There are some archs where > an information on whether this is a forward or backward jump is more > useful I guess. Also at least for !speed_p the distance of the branch is > important given not all targets support arbitrary branch offsets. Just through a call to predictable_edge_p. It isn't perfect. My worry with adding more details of the branch is that you end up with a nonsense target implementation that tries way too hard to be clever. But, I don't mind passing the edge through to the target hook, that way a target has it if they want it. In this patch revision, I pass the edge through. > I remember that at the last Cauldron we discussed to change things to > compare costs of sequences of instructions rather than giving targets no > context with just asking for single (sub-)insn rtx costs. I've made better use of seq_cost in this respin. Bernd was right, constructing dummy RTX just for costs, then discarding it, then constructing the actual RTX for matching doesn't make sense as a pipeline. Better just to construct the real sequence and use the cost of that. In this patch revision, I started by removing the idea that this costs a branch at all. It doesn't, the use of this hook is really a target trying to limit if-convert to not end up pulling too much on to the unconditional path. It seems better to expose that limit directly by explicitly asking for the maximum cost of an unconditional sequence we would create, and comparing against seq_cost of the new RTL. This saves a target trying to figure out what is meant by a cost of a branch. Having done that, I think I can see a clearer path to getting the default hook implementation in shape. I've introduced two new params, which give maximum costs for the generated sequence (one for a "predictable" branch, one for "unpredictable") in the speed_p cases. I'm not expecting it to be useful to give the user control in the case we are compiling for size - whether this is a size win or not is independent of whether the branch is predictable. For the default implementation, if the parameters are not set, I just multiply BRANCH_COST through by COSTS_N_INSNS (1) for size and COSTS_N_INSNS (3) for speed. I know this is not ideal, but I'm still short of ideas on how best to form the default implementation. This means we're still potentially going to introduce performance regressions for targets that don't provide an implementation of the new hook, or a default value for the new parameters. It does mean we can keep the testsuite clean by setting parameter values suitably high for all targets that have conditional move instructions. The new default causes some changes in generated conditional move sequences for x86_64. Whether these changes are for the better or not I can't say. This first patch introduces the two new parameters, and uses them in the default implementation of the target hook. Bootstrapped on x86_64 and aarch64 with no issues. OK? Thanks, James --- 2016-06-21 James Greenhalgh * target.def (max_noce_ifcvt_seq_cost): New. * doc/tm.texi.in (TARGET_MAX_NOCE_IFCVT_SEQ_COST): Document it. * doc/tm.texi: Regenerate. * targhooks.h (default_max_noce_ifcvt_seq_cost): New. * targhooks.c (default_max_noce_ifcvt_seq_cost): New. * params.def (PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST): New. (PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST): Likewise. * doc/invoke.texi: Document new params.