This patch generates better code for loads and stores on SPU.

The SPU can only do 16-byte, aligned loads and stores. Loading something smaller with a smaller alignment requires a load and a rotate. Storing something smaller requires a load, an insert, and a store.

Currently, there are two obvious ways to generate rtl for loads and stores: generate the multiple instructions at expand time, or split them at some later phase. When we expand early, we lose alias information (because that 16-byte load could contain anything) and in general optimize memory accesses less well. When we split late, the compiler has no opportunity to combine loads/stores of the same 16 bytes.

This patch introduces an additional split pass, split0, right before the CSE2 pass. Before this pass, each load or store is modeled as a single rtl instruction and can be optimized well. The pass splits them into multiple instructions, allowing CSE2 and combine to optimize the 16-byte loads and stores. The pass is only enabled when a target defines SPLIT_BEFORE_CSE2.

The test case is an example that is improved by the earlier split pass.

This patch also makes other small improvements to the code generated for loads and stores on SPU.

Ok for mainline? In particular, is the new split pass ok?

Trevor

2008-08-27  Trevor Smigiel

	Improve code generated for loads and stores on SPU.

	* doc/tm.texi (SPLIT_BEFORE_CSE2): Document.
	* tree-pass.h (pass_split_before_cse2): Declare.
	* final.c (rest_of_clean_state): Initialize split0_completed.
	* recog.c (split0_completed): Define.
	(gate_handle_split_before_cse2, rest_of_handle_split_before_cse2):
	New functions.
	(pass_split_before_cse2): New pass.
	* rtl.h (split0_completed): Declare.
	* passes.c (init_optimization_passes): Add pass_split_before_cse2
	before pass_cse2.
	* config/spu/spu-protos.h (spu_legitimate_address): Add for_split
	argument.
	(aligned_mem_p, spu_valid_move): Remove prototypes.
	(spu_split_load, spu_split_store): Change return type to int.
	* config/spu/predicates.md (spu_mem_operand): Remove.
	(spu_dest_operand): Add.
	* config/spu/spu-builtins.md (spu_lqd, spu_lqx, spu_lqa, spu_lqr,
	spu_stqd, spu_stqx, spu_stqa, spu_stqr): Remove AND operation.
	* config/spu/spu.c (regno_aligned_for_load): Remove.
	(reg_aligned_for_addr, address_needs_split): New functions.
	(spu_legitimate_address, spu_expand_mov, spu_split_load,
	spu_split_store): Update.
	(spu_init_expanders): Pregenerate a couple of pseudo-registers.
	* config/spu/spu.h (REG_ALIGN, SPLIT_BEFORE_CSE2): Define.
	(GO_IF_LEGITIMATE_ADDRESS): Update for spu_legitimate_address.
	* config/spu/spu.md ("_mov", "_movdi", "_movti"): Update predicates.
	("load", "store"): Change to define_split.

testsuite/
	* testsuite/gcc.target/spu/split0-1.c: Add test.
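
For readers unfamiliar with the SPU, the load/rotate and load/insert/store sequences described at the top of this message can be modeled in plain C roughly as follows. This is only an illustrative sketch, not code from the patch: the quad type and every function name here are hypothetical, memory is modeled as a byte array, and it assumes big-endian words that are 4-byte aligned (so a word never crosses a 16-byte boundary).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 16-byte SPU quadword; not a GCC or SPU API. */
typedef struct { uint8_t b[16]; } quad;

/* Model of a quadword load: the low 4 bits of the address are ignored,
   so the access is always 16 bytes wide and 16-byte aligned. */
static quad load_quad(const uint8_t *mem, uint32_t addr)
{
    quad q;
    memcpy(q.b, mem + (addr & ~15u), 16);
    return q;
}

/* Model of a quadword store, with the same alignment behaviour. */
static void store_quad(uint8_t *mem, uint32_t addr, quad q)
{
    memcpy(mem + (addr & ~15u), q.b, 16);
}

/* Rotate the quadword left by n bytes, so byte n moves to byte 0. */
static quad rotate_left_bytes(quad q, unsigned n)
{
    quad r;
    for (unsigned i = 0; i < 16; i++)
        r.b[i] = q.b[(i + n) & 15];
    return r;
}

/* Word load = load + rotate: fetch the enclosing quadword, then rotate
   the wanted bytes into bytes 0..3 and read them as a big-endian word. */
static uint32_t load_word(const uint8_t *mem, uint32_t addr)
{
    quad q = rotate_left_bytes(load_quad(mem, addr), addr & 15);
    return ((uint32_t)q.b[0] << 24) | ((uint32_t)q.b[1] << 16)
         | ((uint32_t)q.b[2] << 8)  |  (uint32_t)q.b[3];
}

/* Word store = load + insert + store: fetch the enclosing quadword,
   overwrite 4 bytes of it, and write the whole quadword back. */
static void store_word(uint8_t *mem, uint32_t addr, uint32_t v)
{
    assert((addr & 3) == 0);      /* assumption: word-aligned address */
    quad q = load_quad(mem, addr);
    unsigned off = addr & 15;
    q.b[off]     = (uint8_t)(v >> 24);
    q.b[off + 1] = (uint8_t)(v >> 16);
    q.b[off + 2] = (uint8_t)(v >> 8);
    q.b[off + 3] = (uint8_t)v;
    store_quad(mem, addr, q);
}
```

In this model, two load_word or store_word calls on addresses within the same 16 bytes each call load_quad on the same aligned address. That repeated quadword access is the kind of redundancy that CSE2 and combine get a chance to remove once the split has happened.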