From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x133.google.com (mail-lf1-x133.google.com [IPv6:2a00:1450:4864:20::133]) by sourceware.org (Postfix) with ESMTPS id D557C3858C41 for ; Fri, 17 Nov 2023 08:24:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D557C3858C41 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D557C3858C41 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::133 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700209477; cv=none; b=q2FW8+p2kyBoW58hxzye4XfAzSLkm+dZ3lYrsJ8VphHVGv6jKI8ouCmZYshiiSBg4e7AVYHQ0KAWK87PQ1/BiyEcP7FNELh7Cdo9DR1QJoppZIAblRet+8/47DLc7M7AA68eEDgXAcfSInyJ20Qd13nFus6Oz6OteQm7kEd6LWM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700209477; c=relaxed/simple; bh=fnwQvofzGYV5Q3v52RR+2wpL4kkeySyAals5IMqNjrE=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=CVwl1aAgnL2g0BraS9g9lp2eFxH3WHuqflGok8GGpBu2Bzt5RvoH1Vf+e+zvX2gCC1lCfuI+3wwMRsB7fvSAcMffVDxnT6MFEEopRyZ4/6AJmu+gzaJYblAC8xsOztLnsiHBy1SwIvnlOjknFeFk9vRHu85nB/BGhqhydi+ZQog= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x133.google.com with SMTP id 2adb3069b0e04-507cee17b00so2226987e87.2 for ; Fri, 17 Nov 2023 00:24:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1700209473; x=1700814273; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=ch6pkNAaDSNpy583GL//gLvSUOWKbBcFXg8fCT3iwXs=; b=YoylcPJUxevO6QIov7mJ+9IRl0QL5aKwUkLIkfDYX0KSzg8os1BXypCVwCisuQyD4c blOQ/7NxogXq3l50VbB4qcg8XGLbwWGdTcMHz6mwHCZ2nr1Dv7UNymXwHeX7EV2u4Y3g dLuf8FSAYKZHPiaACcWkc+xApKkX3+wlSDCbdh12v/Pnp8Bn5oUcsKqkk3lcLoGKTvSo bdPDY5MuFXzveEpb31sqvK63XhmXQ19S/ekCQ227vr1NbBXT+l+dyQ2YXWJJ0c6pwzWY csBD+zL4IFjfsV7WtakuKDncZA9ZX5LD71jgq/JJ2SzcbqDcMrkmnvCh4zNcMa9Rp+Q7 8YjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700209473; x=1700814273; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ch6pkNAaDSNpy583GL//gLvSUOWKbBcFXg8fCT3iwXs=; b=cP7/DDA3Ts4hyC48QY4BCKjHIJcHqdUqaHb7GGWhlXySj0YugU31uPQ+ABzlJpQBRu J3OPNxDk8D0rOSbQ2k64WqGIjl0w1yxWBZl14lt0Q5iCgsjSMBCKw9X9DsCfl32aqT0Q 09CF+IJBxrN77jVQpbxOgXWsuGWNu6Xmzxff0N7J9b8rFaPqi4N8MS0+Rn7df/XvPn10 1jemMRnjdifK0k5Ibav50NIULWooMrpCWoBzthgU7s9fZ022a5coPhsr1KjayX/+73mt Btk9EkcRG2TusN1ewUIqcMrk+9FtVXe6r15O9PXSUnYaS1uSSgJp4cTLNFVjszQl76J6 T1ew== X-Gm-Message-State: AOJu0YwCAas1QyHhF4KYt/36v7t2J8QXZzqvrFZNGeg3mldzKteEgMd/ d87J/frdMVMmr96qn9uqNrtQCAALfzJgiihz0t444vxaGxo= X-Google-Smtp-Source: AGHT+IE6nIM25Bhi1WbQAHifeS36msmzdYxln3MkrVGrGL4DrG7zDbQK2XwtbPWVVT3Od/lxdgdrBm6zyuvWuTkfvek= X-Received: by 2002:ac2:5049:0:b0:500:7f71:e46b with SMTP id a9-20020ac25049000000b005007f71e46bmr12269087lfm.1.1700209472353; Fri, 17 Nov 2023 00:24:32 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Fri, 17 Nov 2023 09:24:20 +0100 Message-ID: Subject: Re: [PATCH] vect: Use statement vectype for conditional mask. To: Robin Dapp , Richard Sandiford Cc: gcc-patches , Tamar Christina Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Nov 16, 2023 at 11:30=E2=80=AFPM Robin Dapp w= rote: > > > For the fortran testcase we don't even run into this but hit an > > internal def and assert on > > > > gcc_assert (STMT_VINFO_VEC_STMTS (def_stmt_info).length () =3D=3D= ncopies); > > > > I think this shows missing handling of .COND_* in the bool pattern reco= gnition > > as we get the 'bool' condition as boolean data vector rather than a mas= k. The > > same is true for the testcase with the invariant condition. This cause= s us to > > select the wrong vector type here. The "easiest" might be to look at > > how COND_EXPR is handled in vect_recog_bool_pattern and friends and > > handle .COND_* IFNs the same for the mask operand. > > For the first (imagick) testcase adding a bool pattern does not help > because we always pass NULL as vectype to vect_get_vec_defs. > Doing so we will always use get_vectype_for_scalar_type (i.e. > a "full" bool vector) because vectype of the (conditional) stmt > is the lhs type and not the mask's type. > For cond_exprs in vectorizable_condition we directly pass a > comp_vectype instead (truth_type). Wouldn't that, independently > of the pattern recog, make sense? The issue with the first testcase is that the condition operand of the .COND_ADD is loop invariant. When not using SLP there's no way to store the desired vector type for those as they are not code-generated explicitly but implicitly when we use them via vect_get_vec_defs. But note you can explicitly specify a vector type as well, there's an overload for it, so we can fix the "invariant" case with the following (OK if you can test this on relevant targets) diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 3f59139cb01..936a3de9534 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -8450,12 +8450,14 @@ vect_transform_reduction (loop_vec_info loop_vinfo, value. */ bool cond_fn_p =3D code.is_internal_fn () && conditional_internal_fn_code (internal_fn (code)) !=3D ERROR_MARK; + int cond_index =3D -1; if (cond_fn_p) { gcc_assert (code =3D=3D IFN_COND_ADD || code =3D=3D IFN_COND_SUB || code =3D=3D IFN_COND_MUL || code =3D=3D IFN_COND_AND || code =3D=3D IFN_COND_IOR || code =3D=3D IFN_COND_XOR); gcc_assert (op.num_ops =3D=3D 4 && (op.ops[1] =3D=3D op.ops[3])); + cond_index =3D 0; } bool masked_loop_p =3D LOOP_VINFO_FULLY_MASKED_P (loop_vinfo); @@ -8486,12 +8488,13 @@ vect_transform_reduction (loop_vec_info loop_vinfo, vect_get_vec_defs (loop_vinfo, stmt_info, slp_node, ncopies, single_defuse_cycle && reduc_index =3D=3D 0 ? NULL_TREE : op.ops[0], &vec_oprnds0, + cond_index =3D=3D 0 ? truth_type_for (vectype_in) : NU= LL_TREE, single_defuse_cycle && reduc_index =3D=3D 1 - ? NULL_TREE : op.ops[1], &vec_oprnds1, + ? NULL_TREE : op.ops[1], &vec_oprnds1, NULL_TREE, op.num_ops =3D=3D 4 || (op.num_ops =3D=3D 3 && !(single_defuse_cycle && reduc_index =3D=3D 2)) - ? op.ops[2] : NULL_TREE, &vec_oprnds2); + ? op.ops[2] : NULL_TREE, &vec_oprnds2, NULL_TREE); /* For single def-use cycles get one copy of the vectorized reduction definition. */ > > Now for the Fortran testcase I'm still a bit lost. Opposed to > before we now vectorize with a variable VF and hit the problem > in the epilogue with ncopies =3D 2. > > .COND_ADD (_7, __gcov0.__brute_force_MOD_brute_I_lsm.21_67, 1, __gcov0.__= brute_force_MOD_brute_I_lsm.21_67); > where > _7 =3D *_6 > which is an internal_def. > > I played around with doing it analogously to the COND_EXPR > handling, so creating a COND_ADD (_7 !=3D 0) which will required > several fixups in other places because we're not prepared to > handle that. In the end it seems to only shift the problem > because we will still need the definition of _7. No, you shouldn't place _7 !=3D 0 inside the .COND_ADD but instead have an extra pattern stmt producing that so patt_8 =3D _7 !=3D 0; patt_9 =3D .COND_ADD (patt_8, ...); that's probably still not enough, but I always quickly forget how bool patterns work ... basically a comparison like patt_8 =3D _7 !=3D 0 vectorizes to a mask (aka vector boolean) while any "data" uses of bools are replaced by mask ? 1 : 0; - there's a complication for bool data producing loads which is why we need to insert the "fake" compares to produce a mask. IIRC. > I guess you're implying that the definition should have already > been handled by pattern recognition so that at the point when > we need it, it has a related pattern stmt with the proper mask > type? As said, the compare needs to be a separate pattern stmt part of the .COND_ADD patterns pattern sequence. Richard. > > Regards > Robin >