From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 15783 invoked by alias); 7 Nov 2012 23:32:40 -0000
Received: (qmail 15506 invoked by uid 48); 7 Nov 2012 23:31:23 -0000
From: "olegendo at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/54089] [SH] Refactor shift patterns
Date: Wed, 07 Nov 2012 23:32:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: olegendo at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: olegendo at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-11/txt/msg00679.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #25 from Oleg Endo 2012-11-07 23:31:20 UTC ---

Created attachment 28633
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28633
Arithmetic right shift rework 2

This could be an alternative approach for the arith right shifts.
Applied to rev 193240:

CSiBE -m4-single -O2 -ml -mpretend-cmove
sum: 3159747 -> 3160047  +300 / +0.009494 %

CSiBE -m2 -O2 -ml
sum: 3314907 -> 3313319  -1588 / -0.047905 %

The idea here is to let arith shifts initially go through a slightly more
complex pattern 'ashrsi3_bridge_const', which allows the lib function address
or shad constant to be CSE'ed early on.  As mentioned in comment #18, the
arith -> logical shift conversion is done by combine only under certain
circumstances.
After some trial and error I arrived at the 'ashrsi3_bridge_const' pattern,
which makes combine convert most of the arith shifts into logical shifts.  In
some cases it also finds more opportunities, but regressions like this one
still remain:

void
MC_put_xy_16_c (uint8_t * dest, const uint8_t * ref, const int stride,
                int height)
{
  // must see a shlr instead of 2x shar.
  // somehow the loop prevents this...
  do
    {
      dest[0] = (((ref[0] + ref[0 + 1] + (ref + stride)[0]
                   + (ref + stride)[0 + 1] + 2) >> 2));
      ref += stride;
      dest += stride;
    }
  while (--height);
}

In addition to that, some unlucky register allocations appear, which cause
unneeded reg-reg copies (it seems something has trouble utilizing commutative
insns such as 'add').

The other problem is that patterns such as:

(define_insn_and_split "*lshrsi3"
  [(set (match_operand:SI 0 "arith_reg_dest")
        (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand")
                     (match_operand:SI 2 "shift_count_operand")))
   (clobber (reg:SI T_REG))
   (use (match_operand:SI 3 "general_operand"))
   (use (match_operand:SI 4 "general_operand"))]
  "TARGET_SH1 && can_create_pseudo_p () && CONST_INT_P (operands[2])"
  "#"
  "&& 1"
  [(const_int 0)]
{
  emit_insn (gen_lshrsi3 (operands[0], operands[1], operands[2]));
  DONE;
})

have to be added for combine to convert more arith -> logical shifts.  In this
patch I've added only one such pattern.  However, if for example zero_extract
patterns are added, variations such as the above would be required, too.  This
looks discomforting to me.  Maybe the better solution would indeed be to add
an arith -> logical shift conversion pass before combine, or to try to convert
arith shifts in the split pass after combine by looking at the data flow of
the shifted values.