From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/107096] Fully masking vectorization with AVX512 ICEs gcc.dg/vect/vect-over-widen-*.c
Date: Tue, 11 Oct 2022 12:43:05 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107096

--- Comment #9 from Richard Biener ---
(In reply to rsandifo@gcc.gnu.org from comment #8)
> (In reply to rguenther@suse.de from comment #7)
> > more like precision but x86 uses QImode for two-element, four-element
> > and eight-element masks (rather than two partial integer modes with
> > two and four bits precision).
> Ah, OK.  So yeah, maybe the precision of the vector boolean element *
> the number of elements.

For SVE the following assert holds:

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 1996ecfee7a..9b24b481867 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -10097,6 +10097,12 @@ vect_get_loop_mask (gimple_stmt_iterator *gsi, vec_loop_masks *masks,
                             TYPE_VECTOR_SUBPARTS (vectype)));
       gimple_seq seq = NULL;
       mask_type = truth_type_for (vectype);
+      /* Assert that both mask types have the same total number of value
+         bits.  */
+      gcc_assert (known_eq (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (mask)))
+                            * TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask)),
+                            TYPE_PRECISION (TREE_TYPE (mask_type))
+                            * TYPE_VECTOR_SUBPARTS (mask_type)));
       mask = gimple_build (&seq, VIEW_CONVERT_EXPR, mask_type, mask);
       if (seq)
         gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);

For AVX512 the TYPE_PRECISION of the mask elements is always 1, so for
unequal subparts we cannot directly share masks.  I'm going to change
LOOP_VINFO_MASKS from a one-dimensional array indexed by nV to a
two-dimensional structure indexed by nV and bit-precision * subparts.
Probably using a hash_map, since this will be quite sparse.  Or maybe
not; but at least dynamically growing the array as we do now is
difficult, and subparts can be non-constant.
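[Editor's note: the following is a minimal standalone sketch, not GCC's
actual implementation.  It uses plain C++ (std::map instead of GCC's
hash_map, unsigned instead of poly_uint64, and a placeholder mask_entry
type) purely to illustrate the proposed keying of the mask table by
(nV, element precision * subparts) rather than by nV alone.]

#include <map>
#include <utility>

/* Placeholder for the per-rgroup mask data; the real structure would
   hold the mask trees for each rgroup.  */
struct mask_entry {};

/* Hypothetical two-dimensional replacement for LOOP_VINFO_MASKS:
   keyed by (nvectors, element precision * subparts) instead of by
   nvectors alone.  A sparse associative map sidesteps the
   dynamic-growth problem of the current flat array.  Plain unsigned
   stands in for poly_uint64, so the non-constant-subparts case is not
   modeled here.  */
typedef std::map<std::pair<unsigned, unsigned>, mask_entry> loop_masks;

/* Look up (or lazily create) the mask entry for a given rgroup shape.  */
static mask_entry &
get_or_create_masks (loop_masks &masks, unsigned nvectors,
                     unsigned elt_precision, unsigned subparts)
{
  /* AVX512-style 1-bit masks with differing subparts map to distinct
     keys, while mask types with equal total value bits share an entry,
     matching the known_eq assert above that makes VIEW_CONVERT_EXPR
     sharing valid.  */
  return masks[std::make_pair (nvectors, elt_precision * subparts)];
}

A call site would then look roughly like
get_or_create_masks (masks, nvectors, TYPE_PRECISION (elt_type), subparts);
the real patch would need to key on poly_uint64 and store the rgroup's
controls in the entry.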