From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 0EED4394D83B; Thu, 17 Nov 2022 09:06:15 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0EED4394D83B
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1668675975;
	bh=MH82/s8Z+9ytnEIU05jH735c3ZTWRegoJWg2JEUbkpE=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=HOP7PbHv3t5Al0UHoGl4nhYkbWYZt2/V1Y729Wb5Idx8534MtoIZUuzveBdaEoWXV
	 qIbyJign9PM4P5l6yuhwXZukOzZjykCcJ1fT190ffT2VVQ1RydnbzJ/QgySNTbqfBP
	 sgvGxf5NtVKvMdp/cnXSHTFGdAm4YXz3zfrfkwuw=
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/107451] [11/12/13 Regression] Segmentation
 fault with vectorized code since r11-6434
Date: Thu, 17 Nov 2022 09:06:13 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 11.3.1
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 11.4
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-107451-4-0tY0J26mtm@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-107451-4@http.gcc.gnu.org/bugzilla/>
References: <bug-107451-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107451
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Apart from the permute issue that may also be there, the cause of the segfault
is a failure to code generate the loads so that they match the SLP analysis.  We
generate loads as if we'd use a VF of 2 but use only the lower part, but DCE /
simplification doesn't simplify

  _64 = MEM <vector(2) double> [(const double *)ivtmp_66];
  ivtmp_63 = ivtmp_66 + _75;
  _62 = MEM <vector(2) double> [(const double *)ivtmp_63];
  vect_cst__60 = {_64, _62};
  vect__4.12_59 = VEC_PERM_EXPR <vect_cst__60, vect_cst__60, { 1, 0, 1, 0 }>;

to, for example

  _64 = MEM <vector(2) double> [(const double *)ivtmp_66];
  ivtmp_63 = ivtmp_66 + _75;
  vect_cst__60 = {_64, _64};
  vect__4.12_59 = VEC_PERM_EXPR <vect_cst__60, vect_cst__60, { 1, 0, 1, 0 }>;

(we now also allow VEC_PERM of _64, _64 directly with GCC 13, but the targets
need to be ready for this)
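
To illustrate why the second load is dead in the sequence quoted earlier, here
is a small Python model of the permute.  It is only a sketch: vec_perm is a
hypothetical helper, not GCC code, the lane values are made up, and it assumes
the selector indexes the concatenation of the two operands.

```python
def vec_perm(v0, v1, sel):
    """Toy model of VEC_PERM_EXPR <v0, v1, sel> (hypothetical helper).

    Assumption: result lane i is element sel[i] of the concatenation v0 ++ v1.
    """
    both = list(v0) + list(v1)
    return [both[i % len(both)] for i in sel]


# Made-up lane values standing in for the two vector(2) double loads.
first = [1.0, 2.0]    # plays the role of _64
second = [3.0, 4.0]   # plays the role of _62
sel = [1, 0, 1, 0]    # selector from the dump

# vect_cst__60 = {_64, _62}: what is currently generated.
with_second = vec_perm(first + second, first + second, sel)

# vect_cst__60 = {_64, _64}: the simplified form.
without_second = vec_perm(first + first, first + first, sel)

# Which of the two partial loads do the selected lanes come from?
lanes_per_load = len(first)
total_lanes = 2 * lanes_per_load
used_loads = {(i % total_lanes) // lanes_per_load for i in sel}

print(with_second, without_second, used_loads)
```

In this model both forms give the same vector and only load 0 is referenced,
so the value loaded from ivtmp_63 never reaches the result.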

That's probably a latent issue in other cases as well.  We'd either need to
disallow this kind of load permutation or make sure we only reference
the actually loaded DR group when filling the input vectors.