From: "roger at nextmovesoftware dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/115021] [14/15 regression] unnecessary spill for vpternlog
Date: Fri, 10 May 2024 17:01:38 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: roger at nextmovesoftware dot com
X-Bugzilla-Target-Milestone: 14.2
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021

--- Comment #2 from Roger Sayle ---
Here's a reduced test case that should be unaffected by the pending changes to
how V8QI shifts are expanded. Note that the final "t -= t4" is required to
convince the register allocator to "spill".

typedef signed char v16qi __attribute__ ((__vector_size__ (16)));

// sign-extend low 3 bits to a byte.
v16qi foo (v16qi x) {
  v16qi t7 = (v16qi){7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7};
  v16qi t4 = (v16qi){4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4};
  v16qi t = x & t7;
  t ^= t4;
  t -= t4;
  return t;
}

which produces:

foo:    movl    $67372036, %eax
        vmovdqa %xmm0, %xmm2
        vpbroadcastd    %eax, %xmm1
        movl    $117901063, %eax
        vpbroadcastd    %eax, %xmm3
        vmovdqa %xmm1, %xmm0
        vmovdqa %xmm3, -24(%rsp)
        vmovdqa -24(%rsp), %xmm4
        vpternlogd      $120, %xmm2, %xmm4, %xmm0
        vpsubb  %xmm1, %xmm0, %xmm0
        ret
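For readers following the listing, here is a minimal scalar sketch of what one byte lane of foo computes, and of where the constants in the assembly come from. It is not part of the bug report. The function names (sign_extend_low3, sign_extend_reference, ternlog_imm8_xor_and, broadcast_byte, self_check) are hypothetical helpers. The immediate calculation assumes the usual vpternlog truth-table convention, in which the destination operand corresponds to the mask 0xF0 and the two sources to 0xCC and 0xAA.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one byte lane of foo(): mask, xor, subtract. */
int8_t sign_extend_low3(int8_t x)
{
    int8_t t = x & 7;   /* keep bits 0..2, giving 0..7 */
    t ^= 4;             /* flip bit 2, the sign bit of the 3-bit field */
    t -= 4;             /* map the result onto -4..3 */
    return t;
}

/* Straightforward reference: read the low 3 bits as two's complement. */
int8_t sign_extend_reference(int8_t x)
{
    int v = x & 7;
    return (int8_t)(v >= 4 ? v - 8 : v);
}

/* Immediate for the ternary function A ^ (B & C), assuming the
   conventional truth-table masks A = 0xF0, B = 0xCC, C = 0xAA. */
uint8_t ternlog_imm8_xor_and(void)
{
    const uint8_t A = 0xF0, B = 0xCC, C = 0xAA;
    return (uint8_t)(A ^ (B & C));
}

/* Replicate one byte into all four bytes of a 32-bit value, which is
   the value vpbroadcastd then copies into every dword of the vector. */
uint32_t broadcast_byte(uint8_t b)
{
    return (uint32_t)b * 0x01010101u;
}

/* Check the model against the reference and against the listing. */
void self_check(void)
{
    for (int x = -128; x <= 127; x++)
        assert(sign_extend_low3((int8_t)x) == sign_extend_reference((int8_t)x));

    assert(ternlog_imm8_xor_and() == 120);      /* vpternlogd $120 */
    assert(broadcast_byte(4) == 67372036u);     /* movl $67372036  */
    assert(broadcast_byte(7) == 117901063u);    /* movl $117901063 */
}
```

Calling self_check() from a small main should complete without tripping an assertion. Under the stated convention, the vpternlogd in the listing evaluates t4 ^ (t7 & x) in a single instruction, with %xmm0 holding a copy of t4, %xmm4 holding t7, and %xmm2 holding x. The store to and reload from -24(%rsp) only moves t7 from %xmm3 to %xmm4, which is the spill the report calls unnecessary.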