From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 844DC3851534; Thu, 27 Oct 2022 16:14:38 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 844DC3851534
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1666887283;
	bh=NBkNOr0+w14NHhcwtzf6y02rlv8AraKFgKr3U5D353A=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=nY+9tXgA4f9HIO+JpJ1f2GNPUVKKXGKRWZS4oJUASNWxo0jktcVjc4ILfPolSOT7R
	 kua0wdvkHk5tR6VpdBCunK/birpyRuNwQynCeQzvH2s0deLVggfUKk/TLzAWsk5mey
	 LT35duBktsrCYGp8SpN1/PsHI50StxmhZcqpDvDs=
From: "g.peterhoff@t-online.de" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/107432] __builtin_convertvector generates inefficient
 code
Date: Thu, 27 Oct 2022 16:14:38 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: unknown
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: g.peterhoff@t-online.de
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-107432-4-wmmKSLNB74@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-107432-4@http.gcc.gnu.org/bugzilla/>
References: <bug-107432-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107432
--- Comment #2 from g.peterhoff@t-online.de ---
Another example. I want to convert an array<Bool> to array<Float64>.
There are basically 3 options:
- Copy
- Test (b2f64_default)
- optimized version (b2f64_manually)
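
For readers without the Compiler Explorer link at hand, the three options could look roughly like the sketch below. This is a hypothetical reconstruction, not the code from the link: the names b2f64_default and b2f64_manually come from this comment, but their bodies, the convert_* wrappers, and the use of std::array<bool, N> / std::array<double, N> for array<Bool> / array<Float64> are assumptions. The "manual" variant also assumes double is IEEE 754 binary64.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "sketch assumes 64-bit double");

// Option 2, "test" (assumed body): select 1.0 or 0.0 with a conditional.
inline double b2f64_default(bool b) noexcept
{
    return b ? 1.0 : 0.0;
}

// Option 3, "optimized" (assumed body): branch-free by construction.
// Masks the bit pattern of 1.0 with all-ones (b == true) or zero (b == false).
inline double b2f64_manually(bool b) noexcept
{
    const double one = 1.0;
    std::uint64_t bits;
    std::memcpy(&bits, &one, sizeof bits);
    bits &= std::uint64_t{0} - static_cast<std::uint64_t>(b);
    double result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}

// Option 1, "copy": rely on the implicit bool -> double conversion.
template <std::size_t N>
void convert_copy(const std::array<bool, N>& in, std::array<double, N>& out) noexcept
{
    for (std::size_t i = 0; i < N; ++i)
        out[i] = in[i];
}

template <std::size_t N>
void convert_default(const std::array<bool, N>& in, std::array<double, N>& out) noexcept
{
    for (std::size_t i = 0; i < N; ++i)
        out[i] = b2f64_default(in[i]);
}

template <std::size_t N>
void convert_manually(const std::array<bool, N>& in, std::array<double, N>& out) noexcept
{
    for (std::size_t i = 0; i < N; ++i)
        out[i] = b2f64_manually(in[i]);
}
```

All three variants compute the same values; they differ only in how the conversion is spelled, which is what makes the differences in generated code interesting.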

GCC 12.2 and GCC trunk:
- convertSIZE_copy only generates scalar code (_mm_cvtsi64_sd)
- convertSIZE_default always generates conditional jumps

convertSIZE_manually:
- GCC trunk always generates branch-free scalar code
- GCC 12.2:
  - convert1024_manually generates vector code, but does not use the HW
    conversion int8->int64 (_mm(256)_cvtepi8_epi64); instead it converts
    int8->int16->int32->int64 step by step
  - convert8_manually generates branch-free scalar code
  - convert4_manually generates vector code and uses the HW conversion
    int8->int64
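
As an illustration of the int8->int64 widening step discussed above, here is a minimal sketch that uses the GCC/Clang vector extensions and __builtin_convertvector (the builtin named in the bug title). The type names v4i8 and v4i64 and the helper widen4_i8_to_i64 are hypothetical and not taken from the linked code. Whether the compiler emits a single sign-extending instruction (the one behind _mm256_cvtepi8_epi64) or a chain of narrower widenings depends on the compiler version and target flags such as -mavx2; the sketch only fixes the semantics.

```cpp
#include <cstdint>
#include <cstring>

// GCC/Clang vector extension types (hypothetical names).
typedef std::int8_t  v4i8  __attribute__((vector_size(4)));   // 4 x int8
typedef std::int64_t v4i64 __attribute__((vector_size(32)));  // 4 x int64

// Sign-extends four int8 values to four int64 values in one vector conversion.
inline void widen4_i8_to_i64(const std::int8_t* src, std::int64_t* dst) noexcept
{
    v4i8 narrow;
    std::memcpy(&narrow, src, sizeof narrow);       // load without alignment assumptions
    const v4i64 wide = __builtin_convertvector(narrow, v4i64);
    std::memcpy(dst, &wide, sizeof wide);           // store the widened lanes
}
```

Comparing the assembly of this function at different -m flags is one way to check whether the direct int8->int64 conversion is selected.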


NONE of these conversions is transformed/optimized to the extent that, in
every case:
- all available intrinsics are used
- no "normal" registers are used
- branch-free code is generated

https://godbolt.org/z/f74vK79of

thx
Gero