From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1461)
        id E6CB7381DC41; Wed, 12 Oct 2022 10:45:53 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E6CB7381DC41
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
        s=default; t=1665571553;
        bh=aSUlRnLDCrQxx3zSEy3Dh2T7j5kSGp7y5sDU1jFcIeg=;
        h=From:To:Subject:Date:From;
        b=vpmdFKon0M6/T4k08xPVSpPf8GFi/WPKHWXMOL9Khn0addXVA+TEZMBsqzAC5Fk0y
         KEA3EnbQDbsjwUo9FgMnSHJrDKgR5953ccED8ZE0fQO4FvXBPTAeT8wYOc0oHikMpZ
         7Yb+aem+13qQVrWE/0F5p7xugLp3WXQa1pCeIdog=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Andrew Stubbs
To: gcc-cvs@gcc.gnu.org
Subject: [gcc/devel/omp/gcc-12] vect: while_ult for integer masks
X-Act-Checkin: gcc
X-Git-Author: Andrew Stubbs
X-Git-Refname: refs/heads/devel/omp/gcc-12
X-Git-Oldrev: b3f25511fb4bd4dbfc4068df665f4ea34f416b2d
X-Git-Newrev: 46a155c5bb487e0378e52bc3dc016d66bdf426f1
Message-Id: <20221012104553.E6CB7381DC41@sourceware.org>
Date: Wed, 12 Oct 2022 10:45:53 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:46a155c5bb487e0378e52bc3dc016d66bdf426f1

commit 46a155c5bb487e0378e52bc3dc016d66bdf426f1
Author: Andrew Stubbs
Date:   Fri Oct 2 15:12:50 2020 +0100

    vect: while_ult for integer masks

    Add a vector length parameter needed by amdgcn without breaking aarch64.

    All amdgcn vector masks are DImode, regardless of vector length, so we can't
    tell what length is implied simply from the operator mode.  (Even if we used
    different integer modes there's no mode small enough to differentiate a 2 or
    4 lane mask).  Without knowing the intended length we end up using a mask with
    too many lanes enabled, which leads to undefined behaviour.

    The extra operand is not added for vector mask types so AArch64 does not need
    to be adjusted.

    gcc/ChangeLog:

            * config/gcn/gcn-valu.md (while_ultsidi): Limit mask length using
            operand 3.
            * doc/md.texi (while_ult): Document new operand 3 usage.
            * internal-fn.cc (expand_while_optab_fn): Set operand 3 when lhs_type
            maps to a non-vector mode.

Diff:
---
 gcc/ChangeLog.omp          | 11 +++++++++++
 gcc/config/gcn/gcn-valu.md |  8 +++++++-
 gcc/doc/md.texi            | 15 ++++++++++++---
 gcc/internal-fn.cc         | 18 ++++++++++++++++--
 4 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 52bc562d95e..ed52feafa40 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,14 @@
+2022-10-12  Andrew Stubbs
+
+        Backport from mainline:
+        2022-10-03  Andrew Stubbs
+
+        * config/gcn/gcn-valu.md (while_ultsidi): Limit mask length using
+        operand 3.
+        * doc/md.texi (while_ult): Document new operand 3 usage.
+        * internal-fn.cc (expand_while_optab_fn): Set operand 3 when lhs_type
+        maps to a non-vector mode.
+
 2022-10-05  Tobias Burnus

         Backport from mainline:
diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 838ce84dacb..08ff397e225 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -3063,7 +3063,8 @@
 (define_expand "while_ultsidi"
   [(match_operand:DI 0 "register_operand")
    (match_operand:SI 1 "")
-   (match_operand:SI 2 "")]
+   (match_operand:SI 2 "")
+   (match_operand:SI 3 "")]
   ""
 {
   if (GET_CODE (operands[1]) != CONST_INT
@@ -3088,6 +3089,11 @@
                 : ~((unsigned HOST_WIDE_INT)-1 << diff));
       emit_move_insn (operands[0], gen_rtx_CONST_INT (VOIDmode, mask));
     }
+  if (INTVAL (operands[3]) < 64)
+    emit_insn (gen_anddi3 (operands[0], operands[0],
+                           gen_rtx_CONST_INT (VOIDmode,
+                                              ~((unsigned HOST_WIDE_INT)-1
+                                                << INTVAL (operands[3])))));
   DONE;
 })

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 3b544358bb5..fc6f3e2d2f7 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5132,9 +5132,10 @@ This pattern is not allowed to @code{FAIL}.
 @cindex @code{while_ult@var{m}@var{n}} instruction pattern
 @item @code{while_ult@var{m}@var{n}}
 Set operand 0 to a mask that is true while incrementing operand 1
-gives a value that is less than operand 2.  Operand 0 has mode @var{n}
-and operands 1 and 2 are scalar integers of mode @var{m}.
-The operation is equivalent to:
+gives a value that is less than operand 2, for a vector length up to operand 3.
+Operand 0 has mode @var{n} and operands 1 and 2 are scalar integers of mode
+@var{m}.  Operand 3 should be omitted when @var{n} is a vector mode, and
+a @code{CONST_INT} otherwise.  The operation for vector modes is equivalent to:

 @smallexample
 operand0[0] = operand1 < operand2;
@@ -5142,6 +5143,14 @@ for (i = 1; i < GET_MODE_NUNITS (@var{n}); i++)
   operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
 @end smallexample

+And for non-vector modes the operation is equivalent to:
+
+@smallexample
+operand0[0] = operand1 < operand2;
+for (i = 1; i < operand3; i++)
+  operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
+@end smallexample
+
 @cindex @code{check_raw_ptrs@var{m}} instruction pattern
 @item @samp{check_raw_ptrs@var{m}}
 Check whether, given two pointers @var{a} and @var{b} and a length @var{len},
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index a2227270d8e..cefa4daa826 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3644,7 +3644,7 @@ expand_direct_optab_fn (internal_fn fn, gcall *stmt, direct_optab optab,
 static void
 expand_while_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
-  expand_operand ops[3];
+  expand_operand ops[4];
   tree rhs_type[2];

   tree lhs = gimple_call_lhs (stmt);
@@ -3660,10 +3660,24 @@ expand_while_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
       create_input_operand (&ops[i + 1], rhs_rtx, TYPE_MODE (rhs_type[i]));
     }

+  int opcnt;
+  if (!VECTOR_MODE_P (TYPE_MODE (lhs_type)))
+    {
+      /* When the mask is an integer mode the exact vector length may not
+         be clear to the backend, so we pass it in operand[3].
+         Use the vector in arg2 for the most reliable intended size.  */
+      tree type = TREE_TYPE (gimple_call_arg (stmt, 2));
+      create_integer_operand (&ops[3], TYPE_VECTOR_SUBPARTS (type));
+      opcnt = 4;
+    }
+  else
+    /* The mask has a vector type so the length operand is unnecessary.  */
+    opcnt = 3;
+
   insn_code icode = convert_optab_handler (optab, TYPE_MODE (rhs_type[0]),
                                            TYPE_MODE (lhs_type));

-  expand_insn (icode, 3, ops);
+  expand_insn (icode, opcnt, ops);
   if (!rtx_equal_p (lhs_rtx, ops[0].value))
     emit_move_insn (lhs_rtx, ops[0].value);
 }