From mboxrd@z Thu Jan 1 00:00:00 1970
From: "g.peterhoff@t-online.de"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/107432] __builtin_convertvector generates inefficient code
Date: Thu, 27 Oct 2022 16:14:38 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: unknown
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: g.peterhoff@t-online.de
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107432

--- Comment #2 from g.peterhoff@t-online.de ---
Another example. I want to convert an array to array.
There are basically 3 options:
- Copy
- Test (b2f64_default)
- optimized version (b2f64_manually)

gcc 12.2 + gcc trunk
- convertSIZE_copy only generates scalar code (_mm_cvtsi64_sd)
- convertSIZE_default always generates conditional jumps

convertSIZE_manually
gcc trunk
- always generates branch-free scalar code
gcc 12.2
- convert1024_manually generates vector code, but does not use HW conversion
  int8->int64 (_mm(256)_cvtepi8_epi64) and converts int8->int16->int32->int64
  manually
- convert8_manually generates branch-free scalar code
- convert4_manually generates vector code and uses HW conversion int8->int64

NONE of these conversions is transformed/optimized to the extent that, in
every case,
- all available intrinsics are used
- no "normal" (general-purpose) registers are used
- branch-free code is generated

https://godbolt.org/z/f74vK79of

thx
Gero
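
PS: For readers without the godbolt link at hand, here is a minimal sketch of the kind of conversion being discussed. It is not the code from the link. The names i8x4, i64x4, f64x4, convert4_sketch and convert_scalar are made up for illustration, and the element types (int8_t in, double out) are an assumption based on the intrinsics mentioned above. It widens 4 x int8 to 4 x int64 and then converts to 4 x double via __builtin_convertvector, with a plain scalar loop as reference.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// GCC/Clang vector extension types, 4 lanes each (hypothetical names).
typedef std::int8_t  i8x4  __attribute__((vector_size(4)));
typedef std::int64_t i64x4 __attribute__((vector_size(32)));
typedef double       f64x4 __attribute__((vector_size(32)));

// Sketch only: convert 4 signed bytes to 4 doubles in two steps.
// Assumption: element types int8_t -> int64_t -> double.
inline void convert4_sketch(const std::int8_t* src, double* dst) {
    i8x4 in;
    std::memcpy(&in, src, sizeof in);                  // unaligned-safe load
    i64x4 wide = __builtin_convertvector(in, i64x4);   // sign-extend int8 -> int64
    f64x4 out  = __builtin_convertvector(wide, f64x4); // int64 -> double
    std::memcpy(dst, &out, sizeof out);                // unaligned-safe store
}

// Scalar reference with the same semantics, for comparison.
inline void convert_scalar(const std::int8_t* src, double* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = static_cast<double>(src[i]);
}
```

Which instructions come out of this depends on the compiler version and the -m flags (e.g. -mavx2), so the assembly has to be checked per configuration.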