From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 127182 invoked by alias); 20 Feb 2018 12:15:00 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 127165 invoked by uid 89); 20 Feb 2018 12:14:59 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-11.1 required=5.0 tests=BAYES_00,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_NONE,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=H*UA:Outlook, H*x:Outlook, H*UA:Microsoft, H*x:Microsoft X-HELO: relmlie1.idc.renesas.com Received: from relmlor2.renesas.com (HELO relmlie1.idc.renesas.com) (210.160.252.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 20 Feb 2018 12:14:56 +0000 Received: from unknown (HELO relmlir2.idc.renesas.com) ([10.200.68.152]) by relmlie1.idc.renesas.com with ESMTP; 20 Feb 2018 21:14:53 +0900 Received: from relmlii1.idc.renesas.com (relmlii1.idc.renesas.com [10.200.68.65]) by relmlir2.idc.renesas.com (Postfix) with ESMTP id 6100F84DF1; Tue, 20 Feb 2018 21:14:53 +0900 (JST) Received: from unknown (HELO REEBED03978) ([10.226.37.54]) by relmlii1.idc.renesas.com with ESMTP; 20 Feb 2018 21:14:51 +0900 From: "Sebastian Perta" To: Cc: "'DJ Delorie'" Subject: [PATCH] RL78 one_cmplhi2 improvement Date: Tue, 20 Feb 2018 12:15:00 -0000 Message-ID: <000501d3aa44$69a44e30$3cecea90$@renesas.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-SW-Source: 2018-02/txt/msg01157.txt.bz2 Hello, The following patch defines one_cmplhi2 pattern. The improvement does not come from the two xor instructions used (in fact for the second xor the pattern uses xor saddr , #byte is used which is 1 bytes longer than xor a, #byte) it comes from the fact that that the number of move instructions is reduced (16 bit movw is used instead of 8 bit mov). This can be seen on a very simple test case: uint16_t test_one_cmplhi(uint16_t x) { return ~x; } _test_one_cmplhi: mov a, [sp+4] xor a, #-1 mov r8, a mov a, [sp+5] xor a, #-1 mov r9, a ret _test_one_cmplhi: movw ax, [sp+4] xor a, #-1 xor 0xFFEF8, #-1 ;one_cmplhi2 ax, ax movw r8, ax ret In "gcc.c-torture/execute/" I have seen the patch being effective in 18 cases with the biggest improvement is occurring in gcc.c-torture/execute/pr68376-2 where this patch reduces the code size from 792 bytes to 704. Unfortunately there's also a downside to this patch. In cases when the second xor is redundant GCC never gets the chance to remove the second xor. This can be seen in gcc.c-torture/execute/pr42833.c for example line 57: vdest.v1 = (vsrc1.v1 + (1 << (-1 - tmp))) >> -tmp; This is because all variables are 8 bit while the operation is being promoted to 16 bit (because of the constants being of type int by default) this is why the one_cmplhi2 is used in this case. The relevant insns for this case are: (insn 35 257 260 14 (set (reg:HI 0 x) (xor:HI (reg:HI 0 x) (const_int -1 [0xffffffffffffffff]))) "./xgcc_rw_trunk2/gcc/testsuite/gcc.c-torture/execute/pr42833.c":57 59 {*one_cmplhi2_real} (nil)) (insn 260 35 261 14 (set (reg:QI 1 a) (reg:QI 0 x [8])) "./xgcc_rw_trunk2/gcc/testsuite/gcc.c-torture/execute/pr42833.c":57 44 {*movqi_real} (expr_list:REG_DEAD (reg:QI 8 r8) (nil))) As it can be seen it is quite obvious that reg "a" is dead after insn 35. I think I can write a simple rtl_opt_pass (similar to pass_rl78_move_elim) to look for such cases and replace the one_cmplhi2 with one_cmplqi2. Even without this new pass (which I plan to do in the near future) I still think (based on testing) the advantages of this patch outweigh the disadvantage described above. Is OK to check-in? Or do I need to add the new pass as well for this to be considered? Thank you! Regression test is OK, tested with the following command: make -k check-gcc RUNTESTFLAGS=--target_board=rl78-sim Best Regards, Sebastian 2018-02-20 Sebastian Perta * config/rl78/rl78-expand.md (one_cmplhi2): New define expand. * config/rl78/rl78-virt.md (one_cmplhi2_virt): New define insn. * config/rl78/rl78-real.md (one_cmplhi2_real): New define insn. Index: rl78-expand.md =================================================================== --- rl78-expand.md (revision 257806) +++ rl78-expand.md (working copy) @@ -211,6 +211,16 @@ DONE;" ) +(define_expand "one_cmplhi2" + [(set (match_operand:HI 0 "nonimmediate_operand") + (xor:HI (match_operand:HI 1 "general_operand") + (const_int -1))) + ] + "" + "if (rl78_force_nonfar_2 (operands, gen_one_cmplhi2)) + DONE;" +) + ;;---------- Shifts ------------------------ (define_expand "ashl3" Index: rl78-real.md =================================================================== --- rl78-real.md (revision 257806) +++ rl78-real.md (working copy) @@ -240,6 +240,16 @@ [(set (attr "update_Z") (const_string "update_Z"))] ) +(define_insn "*one_cmplhi2_real" + [(set (match_operand:HI 0 "register_operand" "=A") + (xor:HI (match_operand:HI 1 "general_operand" "0") + (const_int -1))) + ] + "rl78_real_insns_ok ()" + "xor a, #-1 \;xor 0xFFEF8, #-1 ;one_cmplhi2 %0, %1" + [(set_attr "update_Z" "clobber")] +) + ;;---------- Shifts ------------------------ (define_insn "*ashlqi3_real" Index: rl78-virt.md =================================================================== --- rl78-virt.md (revision 257806) +++ rl78-virt.md (working copy) @@ -165,6 +165,16 @@ "v.xor\t%0, %1, %2" ) +(define_insn "*one_cmplhi2_virt" + [(set (match_operand:HI 0 "rl78_nonfar_nonimm_operand" "=v") + (xor:HI (match_operand:HI 1 "general_operand" "ivU") + (const_int -1))) + ] + "rl78_virt_insns_ok ()" + "v.one_cmplhi2\t%0, %1" + [(set_attr "valloc" "op1")] +) + ;;---------- Shifts ------------------------ (define_insn "*ashl3_virt"