This patch is a prerequisite for some amdgcn patches I'm working on to support shorter vector lengths (having a fixed 64 lanes tends to miss optimizations, and masking is not supported everywhere yet).

The problem is that, unlike AArch64, I'm not using different mask modes for different-sized vectors, so all loops end up using the while_ultsidi pattern, regardless of vector length. In theory I could use SImode for V32, HImode for V16, etc., but there's no mode to fit V4 or V2, so something else is needed. Moving to vector masks in the backend is not a natural fit for GCN, and would be a huge task in any case.

This patch adds an additional length operand so that we can distinguish the different uses in the backend and don't end up with more lanes enabled than there ought to be.

I've made the extra operand conditional on the mode so that I don't have to modify the AArch64 backend; that backend uses the while_ family of operators in a lot of places, via iterators, so it would end up touching a lot of code just to add an inactive operand, and I don't have a way to test it properly anyway. I have confirmed that AArch64 still builds and expands while_ult correctly in a simple example.

A rough sketch of the intended mask semantics is below.

OK for mainline?

Thanks

Andrew
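
For illustration, here's a rough model in plain C of what the extra operand buys us (while_ult_mask is a hypothetical helper for exposition, not code from the patch): bit I of the mask is set iff START + I < END, but only for lanes below LENGTH, so a V4 or V2 loop can never enable lanes beyond its actual vector length.

  #include <stdint.h>

  /* Hypothetical model of while_ult with the extra length operand.
     Bit I of the DImode mask is set iff START + I < END, clamped to
     the first LENGTH lanes of the vector mode.  Without the length
     operand, the mask would be computed over all 64 lanes.  */
  uint64_t
  while_ult_mask (uint64_t start, uint64_t end, unsigned length)
  {
    uint64_t mask = 0;
    for (unsigned i = 0; i < length; i++)
      if (start + i < end)
        mask |= UINT64_C (1) << i;
    return mask;
  }

For the existing V64 modes the length operand is presumably just 64, so behaviour there is unchanged; only the shorter modes clamp the mask.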