From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 600493858D37; Mon, 12 Feb 2024 13:18:31 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 600493858D37
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1707743911;
	bh=cmlZXm6+ko+tV/qBi7zOQ6gdftgmQvca0yYg72icGUU=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=k2Rm5DT6lu0nzGWkUJAD8CmhBs2iMY5RjbPQimRY4fVZERG+pGkWxfVk/HGXl0yhX
	 We9WPKBwrfCVFWperc+0yknRijBA+VSFTYT727fHBQs9BPTLmZ0c524DhlWD0beO97
	 BitbaVCdZ7MEv7SOeKv+ncK3E00jynCU+/HZd/Wk=
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/113847] [14 Regression] 10% slowdown of 462.libquantum
 on AMD Ryzen 7700X and Ryzen 7900X
Date: Mon, 12 Feb 2024 13:18:30 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-113847-4-yOqRtXs2oO@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-113847-4@http.gcc.gnu.org/bugzilla/>
References: <bug-113847-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113847
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I can't confirm a regression (testing r14-8925-g1e3f78dbb328a2 with the
offending rev reverted vs bare).

462.libquantum  20720       61.9        335 S   20720       62.6        331 *
462.libquantum  20720       62.2        333 *   20720       61.9        335 S
462.libquantum  20720       62.4        332 S   20720       62.7        330 S

so the "best" run with the change is faster than the best run with it reverted
while the worst runs are the same.

There are only code-gen changes in quantum_bmeasure.part.0 and we can see
it's likely

{component_ref<node>,mem_ref<0B>,reg_3(D)}@.MEM_166 (0030)

vs

{component_ref<hash>,mem_ref<0B>,reg_3(D)}@.MEM_9 (0022)

where once the size is 256 and once 64.  The types are

 <record_type 0x7ffff6a753f0 quantum_reg BLK
    size <integer_cst 0x7ffff6c29138 type <integer_type 0x7ffff6c250a8 bitsizetype> constant 256>
    unit-size <integer_cst 0x7ffff6c29228 type <integer_type 0x7ffff6c25000 sizetype> constant 32>

vs.

 <pointer_type 0x7ffff6a813f0
    type <record_type 0x7ffff6a81348 quantum_reg_node TI
        size <integer_cst 0x7ffff6c0be10 constant 128>
        unit-size <integer_cst 0x7ffff6c0be28 constant 16>

the former is subsetted by a COMPONENT_REF to eventually

 <pointer_type 0x7ffff6e752a0
    type <record_type 0x7ffff6e751f8 quantum_reg_node VOID
        align:8 warn_if_not_align:0 symtab:0 alias-set -1 structural-equality
        pointer_to_this <pointer_type 0x7ffff6e752a0>>
    unsigned DI

so we have basically MEM<ptr + off> vs. MEM<ptr>.member-with-off.
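
To make the two shapes concrete, here is a minimal C sketch of the same load
written both ways. The struct names and field layout are assumptions chosen
to resemble the dumps above (a 32-byte record holding a node pointer and a
hash pointer on an LP64 target); they are not copied from the libquantum
sources, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for quantum_reg_node (16 bytes on LP64). */
struct node {
    float re, im;
    unsigned long long state;
};

/* Hypothetical stand-in for quantum_reg (32 bytes on LP64). */
struct reg {
    int width;
    int size;
    int hashw;
    struct node *node;
    int *hash;
};

/* MEM<ptr>.member-with-off: a COMPONENT_REF on top of a MEM_REF at
   offset 0, so the access still carries the whole record type. */
int *load_hash_member(struct reg *r)
{
    return r->hash;
}

/* MEM<ptr + off>: the same bytes, but the access only carries the
   type of the member itself. */
int *load_hash_offset(struct reg *r)
{
    return *(int **)((char *)r + offsetof(struct reg, hash));
}
```

Both functions read the same 8 bytes; the difference is only how much type
information the reference carries, which is what the alias oracle gets to
look at when comparing the two.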

That's indeed a case where we might like to avoid applying this fix, but
maybe only when strict-aliasing is in effect.