From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id BB0393858405; Fri, 22 Jul 2022 17:32:05 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BB0393858405
From: "undefinedopcode2 at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106415] New: loop-ivopts prevents correct usage of dbra with 16-bit loop counters on m68k
Date: Fri, 22 Jul 2022 17:32:05 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 11.2.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: undefinedopcode2 at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone attachments.created
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Fri, 22 Jul 2022 17:32:05 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106415

            Bug ID: 106415
           Summary: loop-ivopts prevents correct usage of dbra with 16-bit
                    loop counters on m68k
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: undefinedopcode2 at gmail dot com
  Target Milestone: ---

Created attachment 53338
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53338&action=edit
C file that reproduces the problem.
When targeting m68k and compiling certain loops with 16-bit counters that
should trivially generate a DBRA instruction, GCC's optimization passes end up
converting the IV to 32-bit, which requires extra logic to check the upper
half. More specifically, these are loops where the number of iterations is
known at compile time. This additional code is completely useless since we
know the loop count fits in 16 bits.

I am using GCC 11.2.0 hosted on ARM64 macOS and targeting m68k. All code
snippets were compiled with `-O3 -std=c99 -march=68000 -mtune=68000`.

Consider the following function:

    void dbra_test1(short i) {
        do {
            foo(i);
        } while(--i != -1);
    }

As expected, the generated body is a tiny loop consisting solely of call
setup, the call itself, call cleanup, and a DBRA:

    .L2:
            movew %d2,%a0
            movel %a0,%sp@-
            jsr %a2@
            addql #4,%sp
            dbra %d2,.L2

Now consider this function, where we change the initial value of the loop
count to be a constant:

    void dbra_test2(void) {
        short i = 15;
        do {
            foo(i);
        } while(--i != -1);
    }

GCC generates the following code for the body of the loop:

    .L7:
            movel %d2,%sp@-
            jsr %a2@
            addql #4,%sp
            dbra %d2,.L7
            clrw %d2
            subql #1,%d2
            jcc .L7

Note the extraneous clr/subq/jcc.

During ivcanon, GCC transforms the second loop to run from 16 to 0 instead of
15 to -1. Later during ivopts, it transforms back into 15 to -1 form, but
promotes the variable from short to int. Future transformations are no longer
able to optimize around the short variable, and we end up with extraneous
checks inserted during codegen.

I've attached a simple file that reproduces the problem. GCC 2.95.3 performed
the operation correctly, but it's been broken since at least 4.3.2, possibly
earlier.

Thanks
--UD2
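
P.S. For reference, here is a rough C model of why the clr/subq/jcc tail is dead code for this loop. It is only a sketch, not the attached reproducer: `foo` is a hypothetical stand-in that counts calls, and `run_short_loop` and `run_dbra_model` are illustrative names. The model assumes the usual m68k semantics: DBRA decrements the low word of the register and falls through when that word reaches -1, and the following clrw/subql/jcc sequence extends the count into the upper word.

```c
#include <stdint.h>

/* Hypothetical stand-in for the external foo() in the report: counts calls. */
static int calls;
static void foo(short i) { (void)i; calls++; }

/* The source-level loop from dbra_test2: a 16-bit counter from 15 down to 0. */
int run_short_loop(void) {
    calls = 0;
    short i = 15;
    do {
        foo(i);
    } while (--i != -1);
    return calls;
}

/* Model of the emitted loop on a 32-bit register d2.
   *outer_taken counts how often the trailing jcc branches back. */
int run_dbra_model(uint32_t d2, int *outer_taken) {
    calls = 0;
    *outer_taken = 0;
    for (;;) {
        for (;;) {
            foo((short)d2);
            /* dbra: decrement the low word only */
            uint16_t low = (uint16_t)((d2 & 0xFFFFu) - 1u);
            d2 = (d2 & 0xFFFF0000u) | low;
            if (low == 0xFFFFu)
                break;              /* low word hit -1: dbra falls through */
        }
        d2 &= 0xFFFF0000u;          /* clrw %d2 */
        int borrow = (d2 == 0);     /* subql #1 borrows only from zero */
        d2 -= 1u;                   /* subql #1,%d2 */
        if (borrow)
            break;                  /* carry set: jcc not taken, loop ends */
        (*outer_taken)++;           /* carry clear: jcc branches back */
    }
    return calls;
}
```

With a start value of 15, both functions make 16 calls and the trailing branch in the model is never taken, since the upper word of the register is zero on entry. The tail only matters when the count does not fit in 16 bits, which cannot happen for a `short` counter.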