From: "absoler at smail dot nju.edu.cn"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce
Date: Thu, 08 Feb 2024 17:37:03 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

            Bug ID: 113838
           Summary: regression of redundant load operation introduced by
                    -fno-tree-forwprop introduce
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: absoler at smail dot nju.edu.cn
  Target Milestone: ---

Hi, I have found that for the following code, with the -O2 option, gcc-10.2.0 generates a redundant load and gcc-13.2.0 does not. However, with the extra flag "-fno-tree-forwprop", gcc-13.2.0 produces the same bad code.

Full code: https://godbolt.org/z/objsWGnY6

```
func_37()
...
    l_50[1] = &g_51;
    if (((*l_50[1]) ^= g_26[5][3][0]))
    { /* block id: 13 */
        int32_t **l_52[5] = {&l_50[2],&l_50[2],&l_50[2],&l_50[2],&l_50[2]};
        int i;
        (*l_52[4]) = (((void*)0 != &g_51) , &g_36[3][4]);
        return p_39;
    }
    else
    { /* block id: 16 */
        int32_t *l_53 = &g_54[0][1][1];
        return l_53;
    }
```

```
func_37():
  401d74: mov 0x364e(%rip),%edx # 4053c8
func_11():
  401d7a: mov %al,0x33d9(%rip) # 405159
func_37():
  401d80: mov 0x33c2(%rip),%eax # 405148
  401d86: xor %eax,%edx
  401d88: cmp 0x363a(%rip),%eax # 4053c8
  401d8e: mov $0x40510c,%eax
  401d93: mov %edx,0x33af(%rip) # 405148
  401d99: mov $0x4051a4,%edx
  401d9e: cmovne %rdx,%rax
```

The second load of g_26[5][3][0], i.e. "cmp 0x363a(%rip),%eax", can be optimized away: the value was already loaded into a register at 401d74. The better code generated by gcc-13.2.0 is:

```
func_37():
  401e40: mov 0x3582(%rip),%eax # 4053c8
  401e46: mov %edx,%ecx
  401e48: xor %eax,%ecx
  401e4a: cmp %eax,%edx
  401e4c: mov $0x40510c,%edx
  401e51: mov $0x4051a4,%eax
  401e56: cmove %rdx,%rax
```
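
For readers who do not want to open the full test case, the pattern can be sketched in a few lines of C. This is a hypothetical reduction written for illustration, not taken from the godbolt link, and it has not been checked to reproduce the redundant load. The names g_src, g_dst, g_a, g_b and both function names are assumptions. The first function mirrors the xor-assign-and-branch shape of func_37; the second spells out the single-load form that the better assembly corresponds to, relying on the fact that (dst ^ src) != 0 is equivalent to dst != src.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the globals in the report. */
int32_t g_src = 5;   /* plays the role of g_26[5][3][0] */
int32_t g_dst = 5;   /* plays the role of g_51 */
int32_t g_a, g_b;    /* two distinct objects whose addresses are returned */

/* Same shape as the source in the report: xor-assign through a
   pointer, then branch on the result. */
int32_t *xor_and_select(void)
{
    int32_t *p = &g_dst;
    if ((*p ^= g_src))
        return &g_a;
    return &g_b;
}

/* Hand-written equivalent with exactly one load of each global.
   (dst ^ src) != 0 holds exactly when dst != src, so the compare can
   reuse the registers that already hold both values. */
int32_t *xor_and_select_single_load(void)
{
    int32_t src = g_src;   /* single load of the source operand */
    int32_t dst = g_dst;   /* single load of the destination */
    g_dst = dst ^ src;     /* store the xor result */
    return (dst != src) ? &g_a : &g_b;
}
```

Both functions store the same value and return the same pointer for any input, so a compiler is free to emit the second form for the first. Comparing the output of "gcc -O2" and "gcc -O2 -fno-tree-forwprop" on a reduction like this would be one way to check whether the extra memory operand in the cmp shows up outside the original test case.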