This is a new version of the ldm/stm peepholes patch I posted as a WIP earlier. This is now in a state where I think it's an improvement over the existing code and suitable for review.

The motivation here is PR40457, which notes that we cannot convert multiple ldr/str instructions into ldm/stm for Thumb-1. To do that, we need to be able to show that the base register dies, as Thumb-1 only supports the updating version of these instructions. This means we have to use peephole2 instead of peephole.

In addition to converting the existing peepholes, I've also added new classes which try to optimize special situations requested in the PR.

1. Constant to memory moves: if all the input registers of an stm are dead after the instruction, it doesn't much matter which value goes into which register. This can also use peephole2's ability to allocate free registers:

     mov   ip, #1
  -  str   ip, [sp, #0]
  -  mov   ip, #0
  -  str   ip, [sp, #4]
  +  mov   lr, #0
  +  stmia sp, {ip, lr}

2. When loading registers for use in a commutative operation, their order also doesn't matter if they are dead afterwards.

I don't know whether to include the generator program as documentation or as the master copy, or whether to leave it out altogether. It might make sense to keep it if we want to increase the limit of 4 instructions per pattern.

The patch may need more tuning to decide when to use these instructions; I'll need suggestions as I don't think I have enough information available to make these decisions. For now, this uses the same heuristics as previously, but it may trigger in slightly more situations.

I've tested this quite often with arm-none-linux-gnueabi QEMU testsuite runs. Yesterday, after fixing an aliasing issue, I managed to get a successful SPEC2000 run on a Cortex-A9 board (with a 4.4-based compiler; overall performance minimally higher than before); since then I've only made cleanups and added comments. Once approved, I'll rerun all tests before committing.


Bernd
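
For illustration, the register-assignment idea behind the constant-to-memory case (item 1) can be sketched in plain C. This is only a minimal model under stated assumptions, not code from the patch: `struct const_store` and `assign_stm_regs` are hypothetical names, registers are modelled as plain integers (ip = 12, lr = 14), and the real peepholes operate on RTL rather than on arrays like these. The sketch relies on the fact that stmia stores the lowest-numbered register at the lowest address, so constants sorted by offset are paired with free registers sorted by number.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of one "store constant VALUE to [base, #OFFSET]". */
struct const_store {
    int offset;  /* byte offset from the base register */
    int value;   /* constant to be stored */
    int reg;     /* output: register number chosen to hold VALUE */
};

static int cmp_offset(const void *a, const void *b)
{
    const struct const_store *x = a, *y = b;
    return (x->offset > y->offset) - (x->offset < y->offset);
}

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Try to cover N constant stores with a single stmia from the base.
   FREE_REGS holds N_FREE distinct register numbers that are dead after
   the sequence (assumption: the caller has already established this).
   Returns 1 and fills in stores[i].reg on success, 0 otherwise.
   Both arrays are sorted in place.  */
int assign_stm_regs(struct const_store *stores, int n,
                    int *free_regs, int n_free)
{
    int i;

    assert(stores != NULL && free_regs != NULL);

    /* 2 to 4 stores, mirroring the 4-instruction limit mentioned above. */
    if (n < 2 || n > 4 || n_free < n)
        return 0;

    /* stmia needs consecutive words starting at offset 0 (simplification:
       other addressing variants are ignored in this sketch).  */
    qsort(stores, n, sizeof *stores, cmp_offset);
    if (stores[0].offset != 0)
        return 0;
    for (i = 1; i < n; i++)
        if (stores[i].offset != stores[i - 1].offset + 4)
            return 0;

    /* Lowest register number goes to the lowest address.  */
    qsort(free_regs, n_free, sizeof *free_regs, cmp_int);
    for (i = 0; i < n; i++)
        stores[i].reg = free_regs[i];
    return 1;
}
```

Applied to the example in item 1 (constant 1 at offset 0, constant 0 at offset 4, free registers ip and lr), this model assigns 1 to ip and 0 to lr, which matches the `stmia sp, {ip, lr}` sequence shown there.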