From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109154] [13 regression] jump threading de-optimizes nested floating point comparisons
Date: Thu, 13 Apr 2023 17:29:26 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #47 from Jakub Jelinek ---
The testcase then doesn't have to be floating point, say on x86 -O3 -mavx512f

void
foo (int *f, int d, int e)
{
  for (int i = 0; i < 1024; i++)
    {
      int a = f[i];
      int t;
      if (a < 0)
        t = 1;
      else if (a < e)
        t = 1 - a * d;
      else
        t = 0;
      f[i] = t;
    }
}

shows similar problems.
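For comparison, here is a scalar sketch of the shape one might hope for after if-conversion: two comparisons and a single multiplication per element. foo_ref repeats the testcase above under a different name; foo_select, same_result and self_check are hypothetical names introduced here, and the select form is hand-written, not actual GCC output. The unconditional multiply is done in unsigned arithmetic so that it cannot trigger signed-overflow UB for elements where the original never multiplies.

```c
#include <assert.h>
#include <string.h>

/* The testcase from the comment, renamed so both variants can coexist.  */
void
foo_ref (int *f, int d, int e)
{
  for (int i = 0; i < 1024; i++)
    {
      int a = f[i];
      int t;
      if (a < 0)
        t = 1;
      else if (a < e)
        t = 1 - a * d;
      else
        t = 0;
      f[i] = t;
    }
}

/* Hypothetical hand-written select form: 2 comparisons, 1 multiplication.  */
void
foo_select (int *f, int d, int e)
{
  for (int i = 0; i < 1024; i++)
    {
      int a = f[i];
      /* Unsigned multiply wraps instead of overflowing.  */
      int m = (int) (1u - (unsigned) a * (unsigned) d);
      int inner = a < e ? m : 0;
      f[i] = a < 0 ? 1 : inner;
    }
}

/* Returns 1 if both variants agree on a buffer of small test values.  */
int
same_result (int d, int e)
{
  static int x[1024], y[1024];
  for (int i = 0; i < 1024; i++)
    x[i] = y[i] = i % 97 - 20;   /* values in [-20, 76] */
  foo_ref (x, d, e);
  foo_select (y, d, e);
  return memcmp (x, y, sizeof x) == 0;
}

/* Small self-check over a few (d, e) pairs chosen for illustration.  */
void
self_check (void)
{
  assert (same_result (3, 40));
  assert (same_result (-7, 0));
}
```

Calling self_check () from a test driver exercises both variants; whether the vectorizer actually emits this shape for a given GCC version is a separate question that only the dumps can answer.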
Strangely, for

void
foo (int *f, int d, int e)
{
  if (e < 32 || e > 64)
    __builtin_unreachable ();
  for (int i = 0; i < 1024; i++)
    {
      int a = f[i];
      f[i] = (a < 0 ? 1 : 1 - a * d) * (a < e ? 1 : 0);
    }
}

the threader doesn't do what it does for floating point code and we use just
2 comparisons rather than 3 (or more).  Still, only one multiplication, not 2.
Strangely, in that case the second multiplication is there until vrp2, which
folds it using

/* Transform x * { 0 or 1, 0 or 1, ... } into x & { 0 or -1, 0 or -1, ...},
   unless the target has native support for the former but not the latter.  */

match.pd pattern and others into oblivion.
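The match.pd pattern quoted above relies on the identity x * b == (x & -b) for b in {0, 1}, applied per vector lane. A minimal scalar sketch in C follows; mul_by_bool, and_with_mask, product_form, branch_form and check_identities are hypothetical helpers written for this note, not GCC internals. It also illustrates why the range assertion on e presumably matters: with e >= 32, a < 0 implies a < e, so the product form computes the same value as the if/else chain of the earlier testcase.

```c
#include <assert.h>

/* x * b for b in {0, 1}.  */
int
mul_by_bool (int x, int b)
{
  return x * b;
}

/* Same value via a mask: -b is 0 when b == 0 and all ones when b == 1
   (two's complement, as on every target GCC supports).  */
int
and_with_mask (int x, int b)
{
  return x & -b;
}

/* One element of the second testcase (product form).  */
int
product_form (int a, int d, int e)
{
  return (a < 0 ? 1 : 1 - a * d) * (a < e ? 1 : 0);
}

/* One element of the first testcase (if/else chain).  */
int
branch_form (int a, int d, int e)
{
  if (a < 0)
    return 1;
  else if (a < e)
    return 1 - a * d;
  else
    return 0;
}

/* Returns 1 if the mask identity holds on a small range and the two
   forms agree for every e in [32, 64]; d = 3 is an arbitrary choice.  */
int
check_identities (void)
{
  for (int x = -100; x <= 100; x++)
    for (int b = 0; b <= 1; b++)
      if (mul_by_bool (x, b) != and_with_mask (x, b))
        return 0;
  for (int e = 32; e <= 64; e++)
    for (int a = -100; a <= 100; a++)
      if (product_form (a, 3, e) != branch_form (a, 3, e))
        return 0;
  return 1;
}
```

Without the range assertion the two forms can differ: for a = -1 and e = -5 the product form yields 0 while the if/else chain yields 1.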