When vector comparisons were forced to use vec_cond_expr, we lost a number 
of optimizations (my fault for not adding enough testcases to prevent 
that). This patch tries to unwrap vec_cond_expr a bit so some 
optimizations can still happen.

I wasn't planning to add all those transformations together, but adding 
one caused a regression, whose fix introduced a second regression, etc.

Using a simple fold_binary internally looks like an ok compromise to me. 
It remains cheap enough (not recursive, and vector instructions are not 
that frequent), while still allowing more than const_binop (X|0 or X&X for 
instance). The transformations are quite conservative with :s and folding 
only if everything simplifies, we may want to relax this later. And of 
course we are going to miss things like a?b:c + a?c:b -> b+c.

In terms of number of operations, some transformations turning 2 
VEC_COND_EXPR into VEC_COND_EXPR + BIT_IOR_EXPR + BIT_NOT_EXPR might not 
look like a gain... I expect the bit_not disappears in most cases, and 
VEC_COND_EXPR looks more costly than a simpler BIT_IOR_EXPR.

I am a bit confused that with avx512 we get types like "vector(4) 
<signed-boolean:2>" with :2 and not :1 (is it a hack so true is 1 and not 
-1?), but that doesn't matter for this patch.

Regtest+bootstrap on x86_64-pc-linux-gnu

2020-07-30  Marc Glisse  <marc.glisse@inria.fr>

 	PR tree-optimization/95906
 	PR target/70314
 	* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
 	(v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations.

 	* gcc.dg/tree-ssa/andnot-2.c: New file.
 	* gcc.dg/tree-ssa/pr95906.c: Likewise.
 	* gcc.target/i386/pr70314.c: Likewise.

-- 
Marc Glisse