From: Richard Henderson
To: gcc-patches@gcc.gnu.org
Cc: David Edelsohn, Marcus Shawcroft, Richard Earnshaw
Subject: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation
Date: Wed, 12 Aug 2015 01:11:00 -0000
Message-Id: <1439341904-9345-1-git-send-email-rth@redhat.com>

Something last week had me looking at ppc64 code generation, and some
of what I saw was fairly bad.  Fixing it wasn't going to be easy,
because the logic for generating constants wasn't contained within a
single function.

Better is the way that aarch64 and alpha have done it in the past:
a single function holds all of the logic, and it is used for both cost
calculation and the actual emission of the constants.

However, the way that aarch64 and alpha have done it hasn't been ideal,
in that there's a fairly costly search that must be done every time.
I've thought before about changing this so that we would be able to
cache results, akin to how we do it in expmed.c for multiplication.

I've implemented such a caching scheme for three targets, as a test of
how much code could be shared.  The answer appears to be about 100 lines
of boiler-plate.  Minimal, true, but it may still be worth it as a way
of encouraging backends to do similar things in a similar way.

Some notes about ppc64 in particular:

  * Constants aren't split until quite late, preventing all hope of
    CSE'ing portions of the generated code.  My gut feeling is that
    this is in general a mistake, but...

    I did attempt to fix it, and got nothing for my troubles except
    poorer code generation for AND/IOR/XOR with non-trivial constants.

    I'm somewhat surprised that the operands to the logicals aren't
    visible at rtl generation time, given all the work done in gimple.
    And failing that, combine has enough REG_EQUAL notes that it ought
    to be able to put things back together and see the simpler pattern.

    Perhaps there's some other predicate or costing error that's
    getting in the way, and it simply wasn't obvious to me.  In any
    case, nothing in this patch set addresses this at all.

  * I go on to add 4 new methods of generating a constant, each of
    which typically saves 2 insns over the current algorithm.  There
    are a couple more that might be useful but...

  * Constants are split *really* late.  In particular, after reload.
    It would be awesome if we could at least have them all split before
    register allocation so that we arrange to use ADDI and ADDIS when
    that could save a few instructions.
    But that does of course mean avoiding r0 for the input.  Again,
    nothing here attempts to change when constants are split.

  * This is the only platform for which I bothered collecting any sort
    of performance data:

    As best I can tell, there is a 9% improvement in bootstrap speed
    for ppc64.  That is, 10 minutes off the original 109 minute build.

For aarch64 and alpha, I simply assumed there would be no loss, since
the basic search algorithm is unchanged for each.

Comments?  Especially on the shared header?


r~

Cc: David Edelsohn
Cc: Marcus Shawcroft
Cc: Richard Earnshaw

Richard Henderson (15):
  rs6000: Split out rs6000_is_valid_and_mask_wide
  rs6000: Make num_insns_constant_wide static
  rs6000: Tidy num_insns_constant vs CONST_DOUBLE
  rs6000: Implement set_const_data infrastructure
  rs6000: Move constant via mask into build_set_const_data
  rs6000: Use rldiwi in constant construction
  rs6000: Generalize left shift in constant generation
  rs6000: Generalize masking in constant generation
  rs6000: Use xoris in constant construction
  rs6000: Use rotldi in constant generation
  aarch64: Use hashing infrastructure for generating constants
  aarch64: Test for duplicated 32-bit halves
  alpha: Use hashing infrastructure for generating constants
  alpha: Split out alpha_cost_set_const
  alpha: Remove alpha_emit_set_long_const

 gcc/config/aarch64/aarch64.c      | 463 ++++++++++++++++------------
 gcc/config/alpha/alpha.c          | 583 +++++++++++++++++------------------
 gcc/config/rs6000/rs6000-protos.h |   1 -
 gcc/config/rs6000/rs6000.c        | 617 ++++++++++++++++++++++++--------------
 gcc/config/rs6000/rs6000.md       |  15 -
 gcc/genimm-hash.h                 | 122 ++++++++
 6 files changed, 1057 insertions(+), 744 deletions(-)
 create mode 100644 gcc/genimm-hash.h

--
2.4.3
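
For readers who want a concrete picture of the idea in the cover letter
(one routine that computes a recipe for a constant, shared by the cost
query and the emitter, with the result cached so the search runs once),
here is a rough sketch.  It is not the code in genimm-hash.h.  Every
name in it (Step, Recipe, search_recipe, lookup_recipe, cost_set_const,
emit_set_const) is hypothetical, and the "search" is a deliberately
trivial stand-in that places non-zero 16-bit chunks, not any target's
real algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// One step of a (hypothetical) recipe: place a 16-bit immediate at a bit offset.
struct Step {
  unsigned shift;  // 0, 16, 32 or 48
  uint16_t imm;    // the 16-bit chunk to place there
};

using Recipe = std::vector<Step>;

// Counts how often the expensive search actually runs.
static std::size_t search_count = 0;

// Stand-in for the costly per-target search: one step per non-zero chunk.
static Recipe search_recipe(uint64_t value) {
  ++search_count;
  Recipe r;
  for (unsigned shift = 0; shift < 64; shift += 16) {
    uint16_t chunk = static_cast<uint16_t>(value >> shift);
    if (chunk != 0)
      r.push_back({shift, chunk});
  }
  if (r.empty())
    r.push_back({0, 0});  // zero still needs one instruction
  return r;
}

// Cache keyed on the constant; the search runs only on a miss.
static std::unordered_map<uint64_t, Recipe> recipe_cache;

const Recipe &lookup_recipe(uint64_t value) {
  auto it = recipe_cache.find(value);
  if (it == recipe_cache.end())
    it = recipe_cache.emplace(value, search_recipe(value)).first;
  return it->second;
}

// Cost query: the number of steps in the shared recipe.
std::size_t cost_set_const(uint64_t value) {
  return lookup_recipe(value).size();
}

// "Emission": replay the same recipe.  A real backend would emit insns;
// here the steps are applied to a simulated register instead.
uint64_t emit_set_const(uint64_t value) {
  uint64_t reg = 0;
  for (const Step &s : lookup_recipe(value))
    reg |= static_cast<uint64_t>(s.imm) << s.shift;
  assert(reg == value);  // the replayed recipe must rebuild the constant
  return reg;
}
```

Calling cost_set_const and then emit_set_const on the same constant
runs search_recipe only once; the second call is a cache hit.  Because
both entry points read the same recipe, the cost reported can never
disagree with what is eventually emitted.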