From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id E47B93858009; Thu,  1 Feb 2024 13:52:38 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E47B93858009
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1706795558;
	bh=E/HWLiYaAqcj4b4ZnwkSieQ1Gn0eiSsJCZ5Mjvh64Ns=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=mhw/PNIxSUfA8YSYXPX/AhY9QppFAIHKoFknCwXh+q07wIV91ciU4hJxDuR05dc3K
	 0qvpKzDAyWONcsmfP6Xhzq3fDUsqbLEJtQ60HLIUjwD/MVQkf5FCar1xfkO/hsMQYa
	 4mSJrYg82AODoBoERS0wrTd8mHN5xqtp7soe1unk=
From: "redbeard0531 at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/113682] Branches in branchless binary search rather
 than cmov/csel/csinc
Date: Thu, 01 Feb 2024 13:52:37 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: redbeard0531 at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-113682-4-afCkNaHA5X@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-113682-4@http.gcc.gnu.org/bugzilla/>
References: <bug-113682-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
--- Comment #5 from Mathias Stearn <redbeard0531 at gmail dot com> ---
(In reply to Andrew Pinski from comment #2)
> I should note that on x86, 2 cmov in a row might be an issue and worse than
> branches. There is a cost model and the x86 backend rejects that.
>
> There are some cores where it is worse. I don't know if it applies to recent
> ones though.

Do you know if that applies to any cores that support x86_64? I checked Agner
Fog's tables, and only very old cores (P4 era) had a high reciprocal
throughput for CMOV, and even then it was lower than the latency. It looks like
all AMD cores and all Intel cores newer than Ivy Bridge (i.e. everything from
the last 10 years) are able to execute multiple CMOVs per cycle (reciprocal
throughput < 1). From what I can see, bad CMOV was a particular problem of the
Pentium 4 and Prescott cores, and possibly PPro, but I don't see the numbers
for it. I don't think any of those cores should have an impact on the default
cost model in 2024.
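
For reference, here is a minimal sketch of the kind of loop this discussion is
about. It is a hypothetical illustration, not the testcase attached to this
bug; the function name and signature are made up. The select in the loop body
is the statement one would hope to see compiled to cmov (or csel on AArch64)
rather than a branch, though whether that happens depends on the compiler,
flags, and target cost model.

```cpp
#include <cstddef>

// Hypothetical example, not the testcase from the bug report.
// Returns the index of the first element >= key in sorted data[0..n),
// or n if there is no such element.
std::size_t lower_bound_branchless(const int* data, std::size_t n, int key) {
    const int* base = data;
    std::size_t len = n;
    while (len > 1) {
        std::size_t half = len / 2;
        // The select we would like to become a conditional move: the
        // comparison result is data-dependent and hard to predict.
        base = (base[half - 1] < key) ? base + half : base;
        len -= half;
    }
    // One candidate may remain; step past it if it is still too small.
    if (len == 1 && *base < key)
        ++base;
    return static_cast<std::size_t>(base - data);
}
```

The trip count depends only on n, so the loop branch is predictable; the
select is the only data-dependent decision, and it is the one affected by the
cost model.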