From: "guihaoc at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/54063] [10/11/12/13/14 regression] on powerpc64 gcc 4.9/8 generates larger code for global variable accesses than gcc 4.7
Date: Thu, 20 Apr 2023 08:46:35 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 4.8.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: guihaoc at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: steven at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 10.5
X-Bugzilla-Changed-Fields: cc
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54063

HaoChen Gui changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guihaoc at gcc dot gnu.org

--- Comment #26 from HaoChen Gui ---
I ran an experiment that moves the split of "tocref" before reload (doing it
at split1). The additional addis can then be optimized out by postreload cse
on P9.
I also tested SPEC 2017, and it does not seem to hit the problems Alan pointed
out. However, there are several other issues.

1. The optimization relies on the order of the insns. On P8, the sched pass
moves the memory load insn ahead of the second addis, so postreload cse can't
optimize it out because r9 is used by the load.

2. The patch causes different register assignment. Comparing the object files
in SPEC shows that the register assignment changes, and that it tends to use
fewer registers with the patch.

3. The patch has a side effect on BB head merging in the jump2 pass. The sched
pass commonly separates the two tocref insns if they are already split, so the
sequence of insns in the two branch arms might change. Sometimes BB head
merging can be done with the patch but not without it; sometimes it can be
done without the patch but not with it. Both positive and negative examples
can be found in the object files.
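
For illustration only, here is a minimal sketch of the kind of source where these effects could show up. It is a hypothetical example, not the test case from the bug report, and the names `counter_a`, `counter_b` and `pick` are made up. Each branch arm accesses a different global, so on powerpc64 a compiler may address each one relative to the TOC with an addis followed by a dependent insn, which is where split timing, scheduling and BB head merging can interact.

```c
/* Hypothetical example, not taken from the bug report. */
int counter_a = 10;
int counter_b = 20;

/* Each arm reads a different global variable.  On powerpc64 the
   address of each global may be formed relative to the TOC, which
   can take an addis plus a dependent load (an assumption about
   typical code generation; the exact sequence depends on the
   compiler version, CPU target and code model).  */
int pick(int which)
{
    if (which)
        return counter_a + 1;
    else
        return counter_b + 1;
}
```

Compiling such a file with and without the patch and comparing the disassembly of the two arms is one way to look for the differences described above.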