From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 2119)
	id 991EB3858C74; Sun, 10 Dec 2023 17:42:51 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 991EB3858C74
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1702230171;
	bh=z4b0hc8JGZIBizg8u6C21JxeQkM5edFKzwoZf/0Ygxo=;
	h=From:To:Subject:Date:From;
	b=UIk4HKkiovAWf+1i0bVb49dr3sTTIY5hwXFHDLGxunvFtx2nCnXFJx+JTDF/XedGy
	 xhb19/WbsgzCQLBci7q+48I/81x6Fi+npgzYWGmqwhHL13vyER/smF+gWij/RBoLD4
	 wxcjZz6zLwbihe2aU/29qOf8vQKxo/hFQZ9qSEzQ=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Jeff Law
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r14-6365] [committed] Support uaddv and usubv on the H8
X-Act-Checkin: gcc
X-Git-Author: Jeff Law
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 73f6e1fe8835085ccc6de5c5f4428d47e853913b
X-Git-Newrev: 7fb9454c748632d148a07c275ea1f77b290b0c2d
Message-Id: <20231210174251.991EB3858C74@sourceware.org>
Date: Sun, 10 Dec 2023 17:42:51 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:7fb9454c748632d148a07c275ea1f77b290b0c2d

commit r14-6365-g7fb9454c748632d148a07c275ea1f77b290b0c2d
Author: Jeff Law
Date:   Sun Dec 10 10:41:05 2023 -0700

    [committed] Support uaddv and usubv on the H8

    This patch adds uaddv/usubv support on the H8 port to speed up those pesky
    builtin-overflow tests.  It's a variant of something I'd been running for a
    while -- the major change between the old approach I'd been using and this
    patch is this version does not expose the CC register until after reload to
    be consistent with the rest of the H8 port.

    The general approach is to first clear the GPR that's going to hold the
    overflow status, perform the arithmetic operation (add/sub), then use addx
    to move the overflow indicator (in the C bit) into the GPR holding the
    overflow status.  That's a significant improvement over the mess of logicals
    that's generated by the generic code.
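As an illustration of the clear/add/addx approach described above, here is a minimal C sketch that models the three steps for a 16-bit add and cross-checks the result against `__builtin_add_overflow`. It assumes a GCC-compatible compiler for the builtin. The names `model_uaddv16` and `check_uaddv16` are hypothetical and are not part of the patch; the model only mirrors the arithmetic, not the actual RTL or register allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the emitted sequence for a 16-bit unsigned add.
   Returns the truncated sum and stores the overflow status in *ovf.  */
static uint16_t
model_uaddv16 (uint16_t a, uint16_t b, uint8_t *ovf)
{
  uint8_t status = 0;                   /* sub.b: clear the status GPR.  */
  uint32_t wide = (uint32_t) a + b;     /* add.w, computed in a wider type.  */
  uint16_t sum = (uint16_t) wide;       /* The 16-bit result left in the GPR.  */
  unsigned carry = (wide >> 16) & 1;    /* The C bit produced by the add.  */

  /* addx: status = status + status + C.  With status == 0 this is just C.  */
  status = (uint8_t) (status + status + carry);

  *ovf = status;
  return sum;
}

/* Cross-check the model against the builtin and against the comparison
   used in the RTL pattern, (ltu (plus a b) a), i.e. sum < a.  */
static void
check_uaddv16 (uint16_t a, uint16_t b)
{
  uint8_t ovf;
  uint16_t ref;
  uint16_t sum = model_uaddv16 (a, b, &ovf);
  bool ref_ovf = __builtin_add_overflow (a, b, &ref);

  assert (sum == ref);
  assert ((ovf != 0) == ref_ovf);
  assert (ref_ovf == (sum < a));
}
```

Calling `check_uaddv16 (0xFFFF, 1)` exercises the wrapping case and `check_uaddv16 (1, 2)` the non-wrapping one.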

    Handling signed overflow is possible and something I'll probably port to
    this scheme at some point.  It's a bit more complex because we can't
    trivially move the bit from CCR into the right position in a GPR, and
    because of other quirks of the H8.

    This has been regression tested on the H8 without problems.

    Pushing to the trunk.

    gcc/
            * config/h8300/addsub.md (uaddv<mode>4, usubv<mode>4): New expanders.
            (uaddv): New define_insn_and_split plus post-reload pattern.

Diff:
---
 gcc/config/h8300/addsub.md | 77 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/gcc/config/h8300/addsub.md b/gcc/config/h8300/addsub.md
index b1eb0d20188..32eba9df67a 100644
--- a/gcc/config/h8300/addsub.md
+++ b/gcc/config/h8300/addsub.md
@@ -239,3 +239,80 @@
   "reload_completed"
   "xor.w\\t#32768,%e0"
   [(set_attr "length" "4")])
+
+(define_expand "uaddv<mode>4"
+  [(set (match_operand:QHSI 0 "register_operand" "")
+        (plus:QHSI (match_operand:QHSI 1 "register_operand" "")
+                   (match_operand:QHSI 2 "register_operand" "")))
+   (set (pc)
+        (if_then_else (ltu (match_dup 0) (match_dup 1))
+                      (label_ref (match_operand 3 ""))
+                      (pc)))]
+  "")
+
+(define_insn_and_split "*uaddv"
+  [(set (match_operand:QHSI2 3 "register_operand" "=&r")
+        (ltu:QHSI2 (plus:QHSI (match_operand:QHSI 1 "register_operand" "%0")
+                              (match_operand:QHSI 2 "register_operand" "r"))
+                   (match_dup 1)))
+   (set (match_operand:QHSI 0 "register_operand" "=r")
+        (plus:QHSI (match_dup 1) (match_dup 2)))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 3) (ltu:QHSI2 (plus:QHSI (match_dup 1) (match_dup 2))
+                                            (match_dup 1)))
+              (set (match_dup 0) (plus:QHSI (match_dup 1) (match_dup 2)))
+              (clobber (reg:CC CC_REG))])])
+
+(define_insn "*uaddv"
+  [(set (match_operand:QHSI2 3 "register_operand" "=&r")
+        (ltu:QHSI2 (plus:QHSI (match_operand:QHSI 1 "register_operand" "%0")
+                              (match_operand:QHSI 2 "register_operand" "r"))
+                   (match_dup 1)))
+   (set (match_operand:QHSI 0 "register_operand" "=r")
+        (plus (match_dup 1) (match_dup 2)))
+   (clobber (reg:CC CC_REG))]
+  ""
+{
+  if (E_<QHSI2:MODE>mode == E_QImode)
+    {
+      if (E_<QHSI:MODE>mode == E_QImode)
+        return "sub.b\t%X3,%X3\;add.b\t%X2,%X0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_HImode)
+        return "sub.b\t%X3,%X3\;add.w\t%T2,%T0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_SImode)
+        return "sub.b\t%X3,%X3\;add.l\t%S2,%S0\;addx\t%X3,%X3";
+    }
+  else if (E_<QHSI2:MODE>mode == E_HImode)
+    {
+      if (E_<QHSI:MODE>mode == E_QImode)
+        return "sub.w\t%T3,%T3\;add.b\t%X2,%X0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_HImode)
+        return "sub.w\t%T3,%T3\;add.w\t%T2,%T0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_SImode)
+        return "sub.w\t%T3,%T3\;add.l\t%S2,%S0\;addx\t%X3,%X3";
+    }
+  else if (E_<QHSI2:MODE>mode == E_SImode)
+    {
+      if (E_<QHSI:MODE>mode == E_QImode)
+        return "sub.l\t%S3,%S3\;add.b\t%X2,%X0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_HImode)
+        return "sub.l\t%S3,%S3\;add.w\t%T2,%T0\;addx\t%X3,%X3";
+      else if (E_<QHSI:MODE>mode == E_SImode)
+        return "sub.l\t%S3,%S3\;add.l\t%S2,%S0\;addx\t%X3,%X3";
+    }
+  else
+    gcc_unreachable ();
+}
+  [(set_attr "length" "6")])
+
+(define_expand "usubv<mode>4"
+  [(set (match_operand:QHSI 0 "register_operand" "")
+        (minus:QHSI (match_operand:QHSI 1 "register_operand" "")
+                    (match_operand:QHSI 2 "register_operand" "")))
+   (set (pc)
+        (if_then_else (ltu (match_dup 1) (match_dup 2))
+                      (label_ref (match_operand 3 ""))
+                      (pc)))]
+  "")
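
The subtraction expander above branches when operand 1 is less than operand 2, which is the condition under which an unsigned subtract borrows. A small C sketch of that equivalence follows, assuming a GCC-compatible compiler for `__builtin_sub_overflow`. The helper names `usub_wraps32` and `check_usubv32` are hypothetical and only illustrate the comparison; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute a - b modulo 2^32 and report whether the
   subtraction wrapped, using the expander's test (operand 1 < operand 2).  */
static bool
usub_wraps32 (uint32_t a, uint32_t b, uint32_t *diff)
{
  *diff = a - b;        /* Unsigned subtraction wraps by definition in C.  */
  return a < b;         /* Borrow happens exactly when a < b.  */
}

/* Cross-check the comparison-based test against the builtin.  */
static void
check_usubv32 (uint32_t a, uint32_t b)
{
  uint32_t diff, ref;
  bool wraps = usub_wraps32 (a, b, &diff);
  bool ref_wraps = __builtin_sub_overflow (a, b, &ref);

  assert (diff == ref);
  assert (wraps == ref_wraps);
}
```

For example, `check_usubv32 (0, 1)` covers the borrowing case and `check_usubv32 (5, 5)` the boundary where no borrow occurs.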