From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 1EF7D3858D3C; Mon, 26 Feb 2024 02:51:32 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1EF7D3858D3C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1708915892;
	bh=Gt1gxaWnt9CICCU2XCw3jNbhO+6S/zm8yVLlkNJxQ+s=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=QePbFtItuYKJnk1N54WFnfYf37Nt0w9ODySH96p18SAIbcp8UFZ9bM3h0j4q9uyBr
	 UC4IevPSEuwj15TskZO+fphBsSmlWXC8WXyUp6vIaM3E8tNwuPc+KclS1uvq/LWjko
	 qxdzMyTnAohwlMKG1fu3MS1hcGsPI0s9pVU9y6zk=
From: "liuhongt at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/114107] poor vectorization at -O3 when dealing with
 arrays of different multiplicity, good with -O2
Date: Mon, 26 Feb 2024 02:51:31 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 13.2.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: liuhongt at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-114107-4-MME5OIHZnF@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-114107-4@http.gcc.gnu.org/bugzilla/>
References: <bug-114107-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107

Hongtao Liu <liuhongt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org
--- Comment #7 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
perm_cost is very low in the x86 backend. That may be OK for 128-bit vectors,
since pshufb/shufps are available for most cases.
But for 256/512-bit vectors, when the permutation is cross-lane, the cost
could be higher. One solution is to increase perm_cost when the vector size
is more than 128 bits, since vperm is most likely used instead of
vblend/vpblend/vpshuf/vshuf.
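
A minimal sketch of the idea above, to make the proposal concrete. This is
not the x86 backend's code: sketch_perm_cost, BASE_PERM_COST and
WIDE_PERM_COST are hypothetical names, and the cost values are placeholders
rather than numbers from any real cost table. It only shows the shape of a
size-dependent permutation cost.

```c
#include <assert.h>

/* Hypothetical cost units, chosen only for illustration.  */
enum { BASE_PERM_COST = 1, WIDE_PERM_COST = 3 };

/* Illustrative only: return an estimated cost for one permutation of a
   vector that is VECTOR_BITS wide.  Vectors of up to 128 bits keep the
   low base cost.  Wider vectors are charged more, on the assumption that
   the permutation is likely to cross 128-bit lanes.  */
static int
sketch_perm_cost (unsigned vector_bits)
{
  /* Only whole multiples of 128 bits are modelled here.  */
  assert (vector_bits > 0 && vector_bits % 128 == 0);

  if (vector_bits <= 128)
    return BASE_PERM_COST;

  return WIDE_PERM_COST;
}
```

With these placeholder values, a 128-bit permutation costs 1 and a 256-bit or
512-bit one costs 3. A finer model could check whether a given permutation
actually crosses lanes instead of keying on size alone, but that goes beyond
what the comment proposes.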