Hi all,

This is another revision of the pass addressing Richard's feedback [1].
I believe I've addressed all of it and added more comments to the code
where needed.

The output_merged_store function now uses the new split_group helper to
break up the merged store into multiple regular-sized stores.
The apply_stores function, which splats the stores in a group together,
now returns a bool to indicate failure. It is used to quickly reject
single-store groups and other store groups that we cannot output.

One thing I've been struggling with is reimplementing
encode_tree_to_bitpos, the function that applies a tree constant to the
merged byte array. I've tried to reimplement it by writing the constant
to a byte array with native_encode_expr and manipulating the bytes
directly to insert them at the appropriate bit position, without
constructing an intermediate wide_int. This works, but only on
little-endian targets. On big-endian it generated wrong code. So this
patch doesn't include that implementation; it keeps the previous one,
which uses a wide_int but is correct on both endiannesses.
Richard, I am sending out the patch that implements the cheaper
algorithm separately, in case you want to help debug it.

This has been bootstrapped and tested on arm, aarch64, aarch64_be and
x86_64.

Besides the encode_tree_to_bitpos reimplementation (which will have its
own thread), does this version look good?

Thanks,
Kyrill

[1] https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02225.html

2016-10-10  Kyrylo Tkachov

	PR middle-end/22141
	* Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
	* common.opt (fstore-merging): New Optimization option.
	* opts.c (default_options_table): Add entry for
	OPT_ftree_store_merging.
	* fold-const.h (can_native_encode_type_p): Declare prototype.
	* fold-const.c (can_native_encode_type_p): Define.
	* params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
	* passes.def: Insert pass_tree_store_merging.
	* tree-pass.h (make_pass_store_merging): Declare extern
	prototype.
	* gimple-ssa-store-merging.c: New file.
	* doc/invoke.texi (Optimization Options): Document
	-fstore-merging.

2016-10-10  Kyrylo Tkachov
	    Jakub Jelinek
	    Andrew Pinski

	PR middle-end/22141
	PR rtl-optimization/23684
	* gcc.c-torture/execute/pr22141-1.c: New test.
	* gcc.c-torture/execute/pr22141-2.c: Likewise.
	* gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
	* gcc.target/aarch64/ldp_stp_4.c: Likewise.
	* gcc.dg/store_merging_1.c: New test.
	* gcc.dg/store_merging_2.c: Likewise.
	* gcc.dg/store_merging_3.c: Likewise.
	* gcc.dg/store_merging_4.c: Likewise.
	* gcc.dg/store_merging_5.c: Likewise.
	* gcc.dg/store_merging_6.c: Likewise.
	* gcc.dg/store_merging_7.c: Likewise.
	* gcc.target/i386/pr22141.c: Likewise.
	* gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.
	* g++.dg/init/new17.C: Likewise.
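
For anyone following the encode_tree_to_bitpos discussion, here is a
minimal standalone sketch of the general idea of placing a constant's
bytes at a bit position in a byte array without building a wide integer
first. It is not the code from the patch: insert_bits_le is a
hypothetical name, it works one bit at a time for clarity rather than
shifting whole bytes, and it assumes a little-endian layout only (bit 0
is the least significant bit of byte 0), so it says nothing about the
big-endian case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only, not a GCC function.
   Copy the low BITLEN bits of the little-endian byte image SRC into
   DST, starting at bit offset BITPOS.  Bit 0 is the least significant
   bit of DST[0].  Bits of DST outside the target range are preserved.
   Return 1 on success, 0 if the range does not fit in DST_LEN bytes.  */

static int
insert_bits_le (unsigned char *dst, size_t dst_len,
		const unsigned char *src, size_t bitlen, size_t bitpos)
{
  assert (dst != NULL && src != NULL);

  /* Reject ranges that would run past the end of the buffer.  */
  if (bitpos + bitlen > dst_len * 8)
    return 0;

  for (size_t i = 0; i < bitlen; i++)
    {
      size_t d = bitpos + i;
      /* Extract bit I of the source image.  */
      unsigned int bit = (src[i / 8] >> (i % 8)) & 1u;
      unsigned int mask = 1u << (d % 8);

      /* Clear the destination bit, then set it from the source.  */
      dst[d / 8] = (unsigned char) ((dst[d / 8] & ~mask)
				    | (bit << (d % 8)));
    }

  return 1;
}
```

For example, inserting the 16-bit value 0xABCD (bytes CD AB) at bit
position 4 of a zeroed 4-byte buffer leaves the buffer holding the
little-endian image of 0xABCD0.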