Hi,

RTL "noce" ifcvt will currently give up if the branches it is trying to
make conditional are too complicated. One of the conditions for "too
complicated" is that the branch sets more than one value.

One common idiom that this misses is something like:

  int d = a[i];
  int e = b[i];
  if (d > e)
    std::swap (d, e);
  [...]

which currently generates something like:

  compare (d, e)
  branch.le L1
  tmp = d;
  d = e;
  e = tmp;
  L1:

In the case that this is an unpredictable branch, we can do better with:

  compare (d, e)
  d1 = if_then_else (le, d, e)
  e1 = if_then_else (le, e, d)
  d = d1
  e = e1

Register allocation will eliminate the two trailing unconditional
assignments, and we get a neater sequence.

This patch introduces this logic to the RTL if-conversion passes,
catching cases where a basic block does nothing other than multiple
SETs. This helps both with the std::swap idiom above, and with
pathological cases where tree passes create new basic blocks to resolve
PHI nodes, which contain only set instructions and end up unpredictable.

One big question I have with this patch is how I ought to write a
meaningful cost model. The one I've used seems like yet another misuse
of RTX costs, and another bit of stuff for targets to carefully balance:
if the relative costs of branches and conditional move instructions are
not carefully managed, you may unexpectedly enable or disable these
optimisations. This is probably acceptable, but I dislike adding more
and more gotchas to target costs, as I get bitten by them hard enough
as it is!

Elsewhere, the ifcvt cost usage is pretty lacking - essentially counting
the number of instructions which will be if-converted and comparing that
against the magic number "2". I could follow this lead and just count
the number of moves I would convert, then compare that to the branch
cost, but this feels... wrong. It also makes it pretty tough to choose a
"good" number for TARGET_BRANCH_COST.
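To make the intended transformation concrete, here is a minimal C++
sketch (not part of the patch; the function names are illustrative) of
the two forms. The branchless version computes both outputs from the
same comparison, mirroring the pair of if_then_else RTXes, with the
temporaries standing in for the d1/e1 registers that register
allocation would later coalesce away:

  #include <utility>

  // Branchy form: what the source writes.
  static void sort_pair_branchy (int &d, int &e)
  {
    if (d > e)
      std::swap (d, e);
  }

  // Branchless form: what the if-converted sequence computes.
  static void sort_pair_branchless (int &d, int &e)
  {
    bool le = !(d > e);   // compare (d, e)
    int d1 = le ? d : e;  // conditional select on the same comparison
    int e1 = le ? e : d;
    d = d1;               // trailing copies, eliminated by the
    e = e1;               // register allocator in the RTL version
  }

Both leave (d, e) sorted in ascending order; only the second is free of
a conditional branch.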
This isn't helped now that higher branch costs can mean pulling
expensive instructions in to the main execution stream.

I've picked a fairly straightforward cost model for this patch:
compare the cost of each conditional move, as calculated with
rtx_costs, against COSTS_N_INSNS (branch_cost). This essentially kills
the optimisation for any target with a conditional-move cost greater
than 1. Personally, I consider that a pretty horrible bug in this
patch - but I couldn't think of anything better to try.

As you might expect, this triggers all over the place when
TARGET_BRANCH_COST numbers are tuned high. In an AArch64 Spec2006
build, I saw 3.9% more CSEL operations with this patch and
TARGET_BRANCH_COST set to 4. Performance is also good on AArch64
across a range of microbenchmarks and larger workloads (after playing
with the branch costs). I didn't see any performance regression on
x86_64, as you would expect given that the cost models preclude x86_64
targets from ever hitting this optimisation.

Bootstrapped and tested on x86_64 and AArch64 with no issues, and
bootstrapped and tested with the cost model turned off, to gain some
confidence that we will continue to do the right thing should any
targets raise their branch costs and start using this code.

No testcase provided, as I currently don't know of targets with a high
enough branch cost to actually trigger the optimisation.

OK?

Thanks,
James

---
gcc/

2015-09-07  James Greenhalgh

	* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): New.
	(noce_convert_multiple_sets): Likewise.
	(noce_process_if_block): Call them.