From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1461)
	id 98A303858D33; Thu, 27 Apr 2023 16:33:40 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 98A303858D33
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1682613220;
	bh=+8Jak6+8tuZ626tGc9jKIyzEV/8YiGRrAVGkSLTXL50=;
	h=From:To:Subject:Date:From;
	b=TsXsuVv6tvCUaJrxr5GKPHfjaxVA50KzbfY0oACctoLFPuLQR0GU6XBIHf5kreFYu
	 Tu5SrwYyNs7C5JvjyK4T0SBYCCmywGHXUg27Q/b2gzduwNyUUNzBGTzeolnwXHfshq
	 fWM6UR7xJ7rfeh2S0VCqwzbeylMffUjLLTRd9krw=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Andrew Stubbs
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r14-310] amdgcn: Fix addsub bug
X-Act-Checkin: gcc
X-Git-Author: Andrew Stubbs
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 14e881eb0305090e5b184806b917d492373d32ea
X-Git-Newrev: b17c57b06d90f2ca12ea0395046c4ea7d439065f
Message-Id: <20230427163340.98A303858D33@sourceware.org>
Date: Thu, 27 Apr 2023 16:33:40 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:b17c57b06d90f2ca12ea0395046c4ea7d439065f

commit r14-310-gb17c57b06d90f2ca12ea0395046c4ea7d439065f
Author: Andrew Stubbs
Date:   Wed Apr 26 15:23:48 2023 +0100

    amdgcn: Fix addsub bug

    The vec_fmsubadd instruction actually had add twice, by mistake.

    Also improve code-gen for all the complex patterns by using properly
    undefined values.  Mostly this just prevents the compiler reserving
    space in the stack frame.

    gcc/ChangeLog:

            * config/gcn/gcn-valu.md (cmul3): Use gcn_gen_undef.
            (cml4): Likewise.
            (vec_addsub3): Likewise.
            (cadd3): Likewise.
            (vec_fmaddsub4): Likewise.
            (vec_fmsubadd4): Likewise, and use sub for the odd lanes.

Diff:
---
 gcc/config/gcn/gcn-valu.md | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 44c48468dd6..7290cdc2fd0 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -2323,8 +2323,9 @@
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_3_exec (dest, t1, t1_perm, dest, even));
-                                                // a*c-b*d 0
+    emit_insn (gen_3_exec (dest, t1, t1_perm,
+                           gcn_gen_undef (mode),
+                           even));              // a*c-b*d 0
 
     rtx t2_perm = gen_reg_rtx (mode);
     emit_insn (gen_dpp_swap_pairs (t2_perm, t2)); // b*c a*d
@@ -2368,7 +2369,8 @@
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_sub3_exec (dest, t1, t2_perm, dest, even));
+    emit_insn (gen_sub3_exec (dest, t1, t2_perm,
+                              gcn_gen_undef (mode), even));
 
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
@@ -2392,7 +2394,8 @@
     rtx dest = operands[0];
     rtx x = operands[1];
     rtx y = operands[2];
-    emit_insn (gen_sub3_exec (dest, x, y, dest, even));
+    emit_insn (gen_sub3_exec (dest, x, y, gcn_gen_undef (mode),
+                              even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_add3_exec (dest, x, y, dest, odd));
@@ -2419,7 +2422,9 @@
 
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
-    emit_insn (gen_3_exec (dest, x, y, dest, even));
+    emit_insn (gen_3_exec (dest, x, y,
+                           gcn_gen_undef (mode),
+                           even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_3_exec (dest, x, y, dest, odd));
@@ -2439,7 +2444,8 @@
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_sub3_exec (dest, t1, operands[3], dest, even));
+    emit_insn (gen_sub3_exec (dest, t1, operands[3],
+                              gcn_gen_undef (mode), even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_add3_exec (dest, t1, operands[3], dest, odd));
@@ -2459,10 +2465,11 @@
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_add3_exec (dest, t1, operands[3], dest, even));
+    emit_insn (gen_add3_exec (dest, t1, operands[3],
+                              gcn_gen_undef (mode), even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
-    emit_insn (gen_add3_exec (dest, t1, operands[3], dest, odd));
+    emit_insn (gen_sub3_exec (dest, t1, operands[3], dest, odd));
 
     DONE;
   })
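
Illustrative note, not part of the commit: the lane behaviour that the last
two hunks rely on can be modelled in plain C. This is only a sketch under
stated assumptions. The helper names (masked_addsub, fmaddsub_model,
fmsubadd_model) are hypothetical, t1 is assumed to already hold the per-lane
product of the first two operands, and the model reflects only what the diff
shows: a first operation under the 0x5555... mask (even lanes) and a second
under the 0xaaaa... mask (odd lanes). Because the two masks together cover
every lane, the previous contents of dest are never observed in the result,
which appears to be why passing an undefined value to the first operation is
safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lane masks as written in the patch: bit i set means lane i is active. */
#define EVEN_LANES 0x5555555555555555ULL
#define ODD_LANES  0xaaaaaaaaaaaaaaaaULL

/* Hypothetical scalar model of one masked vector operation: lanes whose
   bit is set in exec receive t1 +/- c, all other lanes keep whatever
   dest already held.  */
static void masked_addsub(double *dest, const double *t1, const double *c,
                          size_t n, uint64_t exec, int subtract)
{
    assert(n <= 64); /* the mask only describes 64 lanes */
    for (size_t i = 0; i < n; i++)
        if ((exec >> i) & 1)
            dest[i] = subtract ? t1[i] - c[i] : t1[i] + c[i];
}

/* Model of the vec_fmaddsub hunk: sub on even lanes, add on odd lanes.  */
void fmaddsub_model(double *dest, const double *t1, const double *c, size_t n)
{
    masked_addsub(dest, t1, c, n, EVEN_LANES, 1);
    masked_addsub(dest, t1, c, n, ODD_LANES, 0);
}

/* Model of the fixed vec_fmsubadd hunk: add on even lanes, sub on odd
   lanes.  Before the fix both steps were adds.  */
void fmsubadd_model(double *dest, const double *t1, const double *c, size_t n)
{
    masked_addsub(dest, t1, c, n, EVEN_LANES, 0);
    masked_addsub(dest, t1, c, n, ODD_LANES, 1);
}
```

Calling fmsubadd_model with the same inputs as fmaddsub_model yields the
opposite sign pattern on every lane, which is the difference the pre-fix
code lost on the odd lanes.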