Hi,

In a number of cases where we try to create vectors we end up spilling to
the stack and then filling. This is one example distilled from a couple of
micro-benchmarks where the issue shows up. The extra cost in this case comes
from the unnecessary round trip through the stack.

The patch attempts to finesse this by using lane loads or vector inserts to
produce the right results.

This patch is mostly Ramana's work; I've just cleaned it up a little.

This has been in a number of our trees lately, and we haven't seen any
regressions. I've also bootstrapped and tested it, and run a set of
benchmarks which show no regressions on Cortex-A57 or Cortex-A53.

The patch fixes some regressions caused by the more aggressive vectorization
in GCC 6, so I'd like to propose it to go in even though we are in Stage 4.

OK?

Thanks,
James

---
gcc/

2016-01-20  James Greenhalgh
	    Ramana Radhakrishnan

	* config/aarch64/aarch64.c (aarch64_expand_vector_init): Refactor,
	always use lane loads to construct non-constant vectors.

gcc/testsuite/

2016-01-20  James Greenhalgh
	    Ramana Radhakrishnan

	* gcc.target/aarch64/vector_initialization_nostack.c: New.
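
For readers without the patch to hand, the kind of code affected looks
roughly like the sketch below. This is a hypothetical illustration, not the
testcase added by the patch: the names v4si and make_v4si are made up for the
example, and it assumes GCC's generic vector extensions rather than any
particular intrinsics. The point is only that every element is a run-time
value, so the vector cannot be materialized as a constant.

```c
#include <stdint.h>

/* Hypothetical example, NOT the testcase from the patch.
   A 128-bit vector of four 32-bit integers, declared with GCC's
   generic vector extension (an assumption of this sketch).  */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Build a vector from four values known only at run time.
   This is a non-constant vector initialization: the compiler has to
   assemble the vector element by element, either by inserting each
   scalar into a lane or by going through memory.  */
v4si
make_v4si (int32_t a, int32_t b, int32_t c, int32_t d)
{
  v4si v = { a, b, c, d };
  return v;
}
```

Initializations of this shape are where the spill and fill described above
could appear; the intent of the patch is that the elements are instead placed
with lane loads or vector inserts.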