From mboxrd@z Thu Jan 1 00:00:00 1970
From: "ubizjak at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/114591] [12/13/14 Regression] register allocators introduce an extra load operation since gcc-12
Date: Thu, 11 Apr 2024 06:54:30 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 13.2.0
X-Bugzilla-Keywords: missed-optimization, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Who: ubizjak at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.4
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Content-Type: text/plain; charset="UTF-8"
Auto-Submitted: auto-generated
MIME-Version: 1.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591

--- Comment #13 from Uroš Bizjak ---
(In reply to Hongtao Liu from comment #12)
> short a;
> short c;
> short d;
> void
> foo (short b, short f)
> {
>   c = b + a;
>   d = f + a;
> }
>
> foo(short, short):
>         addw    a(%rip), %di
>         addw    a(%rip), %si
>         movw    %di, c(%rip)
>         movw    %si, d(%rip)
>         ret
>
> this one is bad since gcc10.1 and there's no subreg, The problem is if the
> operand is used by more than 1 insn, and they all support separate m
> constraint, mem_cost is quite small(just 1, reg move cost is 2), and this
> makes RA more inclined to propagate memory across insns. I guess RA assumes
> the separate m means the insn only support memory_operand?

I don't see this as problematic. IIRC, there was a discussion in the past that
a couple (two?) memory accesses from the same location close to each other can
be faster (so, -O2, not -Os) than preloading the value to the register first.

In contrast, the example from the Comment #11 already has the correct value in
%eax, so there is no need to reload it again from memory, even in a narrower
mode.
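
For anyone who wants to compare the two code shapes side by side, here is a minimal sketch. foo is the test case from comment #12; foo_preload is a hypothetical variant that is not part of the bug report and only mirrors the "preload the value into a register" form at the source level. The compiler is free to emit either form for either function, so the generated assembly has to be inspected for each GCC version and option set.

```c
short a;
short c;
short d;

/* Test case from comment #12: 'a' is read by two separate additions.  */
void
foo (short b, short f)
{
  c = b + a;
  d = f + a;
}

/* Hypothetical variant, not from the bug report: copy 'a' into a local
   first.  This only suggests the preloaded form; the register allocator
   may still fold the load back into memory operands.  */
void
foo_preload (short b, short f)
{
  short tmp = a;
  c = b + tmp;
  d = f + tmp;
}
```

Compiling this file with "gcc -O2 -S" and again with "gcc -Os -S" should show whether a(%rip) is used as a memory operand twice or loaded once in each case.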