(Sending to the right mailing list address this time, sorry for
duplicates!)

Hi,

This patch fixes a large number of execution failures which occur when
compiling with NEON and auto-vectorization enabled in big-endian mode,
particularly when -mvectorize-with-neon-quad is in use.

The basic issue (as discussed several times previously, e.g.
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00409.html) is that
NEON quad-word vectors are formed from pairs of double-word vectors,
and those pairs are "the wrong way round" in big-endian mode. This is
contrary to the middle-end's expectations, and several patterns
introduced fairly recently to the GCC NEON support fail to take this
"feature" into account.

So, in the interests of un-breaking code in big-endian mode, the
attached patch does the simplest thing possible: it disables the
troublesome patterns in big-endian mode. I experimented with other
approaches, namely trying to "hide" the permutations required by NEON
in the backend patterns:

  http://lists.linaro.org/pipermail/linaro-toolchain/2010-November/000536.html

but I eventually came to the conclusion that it was impractical to do
that. (We're planning to revisit the "internal" ordering of vectors for
big-endian NEON in the vectorizer at least, anyway.)
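
As a rough illustration of the half-ordering problem described above,
here is a simplified model in C. It is only a sketch: it does not
reproduce the actual register layout or anything in the patch, and the
names swap_halves_index and swap_halves are made up for the example.
It assumes the only discrepancy is that the two double-word halves of
a quad-word vector sit in the opposite order from the one a generic
description expects:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.  Given an element index
   I into a vector of N elements (N even) made of two halves, return
   the index the same element has once the two halves are exchanged.
   The order within each half is left alone.  */
size_t
swap_halves_index (size_t i, size_t n)
{
  size_t half = n / 2;

  assert (n % 2 == 0 && i < n);
  return i < half ? i + half : i - half;
}

/* Apply the half swap to a whole vector: DST receives the elements of
   SRC with the low and high halves exchanged.  */
void
swap_halves (const int *src, int *dst, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[swap_halves_index (i, n)] = src[i];
}
```

With src = {0, 1, 2, 3}, dst comes out as {2, 3, 0, 1}. Under this
model, a pattern that assumes element 0 lives in the low half ends up
operating on the other half.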

I've tested this patch with custom multilib configurations
(big-endian/little-endian), with the following options:

  little-endian, -mfpu=neon -mfloat-abi=softfp

  little-endian, -mfpu=neon -mfloat-abi=softfp
     -mvectorize-with-neon-quad

  big-endian, -mfpu=neon -mfloat-abi=softfp

  big-endian, -mfpu=neon -mfloat-abi=softfp -mvectorize-with-neon-quad

Unfortunately I could only test C, since building C++ was broken at the
time I started testing. Some tests fail to vectorize post-patch in BE
mode (predictably, since fewer things end up vectorizable), but many
execution tests transition from FAIL to PASS.
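
For readers unfamiliar with this kind of execution test, the sketch
below shows the general shape of one. It is a hypothetical example,
not a test taken from the testsuite: sum_ints, check_sum and N are
invented names. The loop is a simple integer sum, which the vectorizer
may implement with one of the reduction patterns named in the
ChangeLog below; whether it does so depends on the compiler version
and options.

```c
#include <assert.h>

#define N 16

/* Hypothetical example function: a plain integer sum.  With
   auto-vectorization enabled, the compiler may turn this loop into
   vector adds followed by a final reduction of the vector
   accumulator.  */
int
sum_ints (const int *a, int n)
{
  int sum = 0;

  for (int i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

/* Fill an array with 1..N and compare the result against the closed
   form N * (N + 1) / 2.  The assert fails if the sum is wrong.  */
void
check_sum (void)
{
  int a[N];

  for (int i = 0; i < N; i++)
    a[i] = i + 1;
  assert (sum_ints (a, N) == N * (N + 1) / 2);
}
```

A driver would call check_sum from main and be built once per
configuration listed above, for instance with -O2 -ftree-vectorize
added and -mbig-endian for the big-endian variants (the exact command
line is an assumption here, not something taken from the test run).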

OK to apply?

Thanks,

Julian

ChangeLog

    gcc/
    * config/arm/neon.md (vec_shr_<mode>, vec_shl_<mode>): Disable in
    big-endian mode.
    (reduc_splus_<mode>, reduc_uplus_<mode>, reduc_smin_<mode>)
    (reduc_smax_<mode>, reduc_umin_<mode>, reduc_umax_<mode>)
    (neon_vec_unpack<US>_lo_<mode>, neon_vec_unpack<US>_hi_<mode>)
    (vec_unpack<US>_hi_<mode>, vec_unpack<US>_lo_<mode>)
    (neon_vec_<US>mult_lo_<mode>, vec_widen_<US>mult_lo_<mode>)
    (neon_vec_<US>mult_hi_<mode>, vec_widen_<US>mult_hi_<mode>)
    (vec_pack_trunc_<mode>, neon_vec_pack_trunc_<mode>): Disable for Q
    registers in big-endian mode.