From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 6986F3858409; Tue, 19 Oct 2021 17:42:23 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6986F3858409
From: "roger at nextmovesoftware dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/102840] [12 Regression]
 gcc.target/i386/pr22076.c by r12-4475
Date: Tue, 19 Oct 2021 17:42:23 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: roger at nextmovesoftware dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-102840-4-L249n5N49J@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102840-4@http.gcc.gnu.org/bugzilla/>
References: <bug-102840-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 19 Oct 2021 17:42:23 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840
--- Comment #3 from Roger Sayle <roger at nextmovesoftware dot com> ---
With -m64, before:
test:   movq    .LC1(%rip), %mm0
        paddb   .LC0(%rip), %mm0
        movq    %xmm0, x(%rip)
        ret

And after:
test:   movq    .LC2(%rip), %rax
        movq    %rax, x(%rip)
        ret

So we have two movq before, and two movq after, but clearly we've avoided
the computation at run-time.
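
To illustrate what is being folded: paddb adds its operands one byte lane
at a time, each lane wrapping modulo 256 with no carry into the next lane.
When both operands are known constants, the sum can be computed once at
compile time and emitted as a single constant, which is the role .LC2 plays
in the "after" code. The following is only a sketch of that arithmetic; the
helper names and the constant values are hypothetical and are not taken
from pr22076.c or from GCC's sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytewise wrapping add of two 8-byte values,
   mirroring what paddb computes on a 64-bit operand. */
uint64_t paddb_fold(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t lane_a = (uint8_t)(a >> (8 * i));
        uint8_t lane_b = (uint8_t)(b >> (8 * i));
        /* Each lane wraps modulo 256; nothing carries into the next lane. */
        uint8_t sum = (uint8_t)(lane_a + lane_b);
        r |= (uint64_t)sum << (8 * i);
    }
    return r;
}

/* Illustrative values only (assumed, not the testcase's constants). */
void paddb_fold_selfcheck(void)
{
    assert(paddb_fold(0x0101010101010101ULL, 0x0202020202020202ULL)
           == 0x0303030303030303ULL);
    /* The low lane 0xff + 0x01 wraps to 0x00 without disturbing lane 1. */
    assert(paddb_fold(0x01020304050607ffULL, 0x0101010101010101ULL)
           == 0x0203040506070800ULL);
}
```

Calling paddb_fold_selfcheck() from a small main() exercises both cases,
including the lane wrap-around that distinguishes paddb from a plain 64-bit
add.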

It's difficult (for me) to judge whether -m32's use of immediate constants
is now better than -m64's load memory/store memory idiom in the "average
case", but in the worst case [data cache miss], the former is clearly better
[requiring fewer memory transactions].