From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/108599] [12 Regression] Incorrect code generation newer intel architectures
Date: Fri, 10 Feb 2023 17:46:16 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.1.1
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: jakub at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.3
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108599

--- Comment #12 from CVS Commits ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:7d7f275ebe7295264a0406876c0670e25a50169a

commit r12-9147-g7d7f275ebe7295264a0406876c0670e25a50169a
Author: Jakub Jelinek
Date:   Tue Jan 31 10:12:19 2023 +0100

i386: Fix up ix86_convert_const_wide_int_to_broadcast [PR108599]

The following testcase is miscompiled.
The problem is that during RTL DSE we see that a V4DI register is being
loaded with the value { 16, 16, 0, 0 }. DSE mostly works in terms of scalar
modes, so it calls movoi to set an OImode REG to
(const_wide_int 0x100000000000000010), and
ix86_convert_const_wide_int_to_broadcast thinks it can compute that value
by broadcasting DImode 0x10. While it is true that the broadcast could be
used for a TImode result, it can't be used for OImode/XImode, because all
but the lowest 2 HOST_WIDE_INTs aren't present (so they are 0 or -1
depending on sign), not 0x10 as in this case.

The function checks whether the least significant HOST_WIDE_INT elt of the
CONST_WIDE_INT is broadcastable from QI/HI/SI/DImode and then does:

    /* Check if OP can be broadcasted from VAL.  */
    for (int i = 1; i < CONST_WIDE_INT_NUNITS (op); i++)
      if (val != CONST_WIDE_INT_ELT (op, i))
        return nullptr;

That is needed of course, but nothing checks that
CONST_WIDE_INT_NUNITS (op) isn't too small for the mode in question.
I think that if op were 0 or -1, it ought never to be a CONST_WIDE_INT but
a CONST_INT, so we can just punt whenever the number of CONST_WIDE_INT
elts is not the expected one.

2023-01-31  Jakub Jelinek

        PR target/108599
        * config/i386/i386-expand.cc
        (ix86_convert_const_wide_int_to_broadcast): Return nullptr if
        CONST_WIDE_INT_NUNITS (op) times HOST_BITS_PER_WIDE_INT isn't
        equal to bitsize of mode.

        * gcc.target/i386/avx2-pr108599.c: New test.

(cherry picked from commit 963315a922e228c4f6853826666151fc540f111a)
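
For illustration only, the missing check described above can be modelled outside of GCC. The following is a minimal standalone sketch, not the code from i386-expand.cc: WideConst, element_at, broadcast_value and the check_nunits flag are hypothetical names invented here, a CONST_WIDE_INT is approximated by a vector of 64-bit elements (least significant first, with absent upper elements implied by sign extension), and only the DImode broadcast case is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a CONST_WIDE_INT: only the stored 64-bit
// elements, least significant first. Elements above the last stored one
// are not present and are implied by sign extension.
using WideConst = std::vector<std::int64_t>;

// Assumed to mirror HOST_BITS_PER_WIDE_INT on a 64-bit host.
constexpr int kHostBitsPerWideInt = 64;

// Value of element i when the constant is viewed in a wider mode:
// stored elements are returned as-is, missing ones are 0 or -1.
std::int64_t element_at(const WideConst &op, std::size_t i)
{
  assert(!op.empty());
  if (i < op.size())
    return op[i];
  return op.back() < 0 ? -1 : 0;
}

// Returns the 64-bit value that could be broadcast to build OP in a mode
// of MODE_BITS bits, or std::nullopt if no broadcast is possible.
// check_nunits == false models the behaviour before the fix;
// check_nunits == true models the added element-count check.
std::optional<std::int64_t>
broadcast_value(const WideConst &op, int mode_bits, bool check_nunits)
{
  assert(!op.empty());

  // The added check: punt unless the stored elements cover the whole mode.
  if (check_nunits
      && static_cast<int>(op.size()) * kHostBitsPerWideInt != mode_bits)
    return std::nullopt;

  // The pre-existing check: every stored element must equal the lowest one.
  const std::int64_t val = op[0];
  for (std::size_t i = 1; i < op.size(); i++)
    if (val != op[i])
      return std::nullopt;

  return val;
}
```

In this model, the constant from the report is the two-element { 0x10, 0x10 }. Viewed as a 256-bit (OImode) value its upper two elements are 0, so broadcasting 0x10 to all four lanes would be wrong; with check_nunits set the function declines, while the same constant in a 128-bit (TImode) mode is still accepted.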