From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
  id CF2A33858D33; Tue, 22 Aug 2023 05:25:59 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CF2A33858D33
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
  s=default; t=1692681959;
  bh=R2v2Q4Yjtn1Uj9CbEzUhAg6F345YJRmq8w7x4C7kqng=;
  h=From:To:Subject:Date:From;
  b=k6zsaHKX7h8h+QV76u51msghKF19jr+EDGG/pOQOwC8xoaBc6cJKJbf5DnciL+Cm9
   WF1j0+LSggx1o/vC+EFh8iPd8OjNa9vwgBQbI4AwY2r/OQjm2HSJ8dQYQvJvjiKSvK
   srPqFl5sXlH/19ZZUGC3NTO3Ys0WX4dbYopqOIME=
From: "tkoenig at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/111096] New: Frame pointer is not used even
  when -fomit-frame-pointer is specified
Date: Tue, 22 Aug 2023 05:25:59 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: tkoenig at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status
  bug_severity priority component assigned_to reporter target_milestone
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111096

            Bug ID: 111096
           Summary: Frame pointer is not used even when -fomit-frame-pointer
                    is specified
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

The code, by Kent Dickey posted to comp.arch

typedef unsigned int u32;
typedef unsigned long long u64;

u64 do_op(u64 out0, u64 in0, u64 in1, u32 opcode, int size);

void
calc_loop(u64 *optr, u64 *iptr0, u64 *iptr1, u32 opcode, int size, int len)
{
        u64     o0, i0, i1, val, result;
        int     num, shift, pos;
        int     i, j;

        // size is 0,1,2,3 representing 8,16,32,64 bytes
        num = 8 >> size;                // 8,4,2,1
        shift = 8 << size;              // 8,16,32,64
        for(i = 0; i < len; i++) {
                o0 = optr[i];
                i0 = iptr0[i];
                i1 = iptr1[i];
                result = 0;
                pos = 0;
                for(j = 0; j < num; j++) {
                        val = do_op(o0, i0, i1, opcode, size);
                        result = result | (val << pos);
                        pos += shift;
                        if(shift < 64) {
                                o0 = o0 >> shift;
                                i0 = i0 >> shift;
                                i1 = i1 >> shift;
                        }
                }
                optr[i] = result;
        }
}

compiled for aarch64 on godbolt with recent trunk and
-O3 -fomit-frame-pointer (see https://godbolt.org/z/5bKPeGWrK )
does not set up the frame pointer, but it also does not use
it to avoid spill/restore pairs:

calc_loop:
        stp     x19, x20, [sp, -144]!
        mov     w6, 8
        asr     w19, w6, w4
        stp     x27, x28, [sp, 64]
        lsl     w27, w6, w4
        str     x30, [sp, 80]
        stp     x0, x1, [sp, 112]
        str     x2, [sp, 128]
        cmp     w5, 0
        ble     .L1
        sbfiz   x0, x5, 3, 32
        stp     x21, x22, [sp, 16]
        mov     w20, w4
        stp     x23, x24, [sp, 32]
        mov     w21, w3
        stp     x25, x26, [sp, 48]
        str     x0, [sp, 136]
        cmp     w27, 63
        ble     .L3
        mov     x25, 0
.L6:
        ldr     x0, [sp, 112]
        ldr     x23, [x0, x25]
        ldr     x0, [sp, 120]
        ldr     x0, [x0, x25]
        str     x0, [sp, 104]
        ldr     x0, [sp, 128]
        ldr     x24, [x0, x25]
        cbz     w19, .L10
        mov     w22, 0
        mov     w28, 0
        mov     x26, 0
.L5:
        ldr     x1, [sp, 104]
        mov     w4, w20
        mov     w3, w21
        mov     x2, x24
        mov     x0, x23
        add     w22, w22, 1
        bl      do_op
        lsl     x0, x0, x28
        add     w28, w28, w27
        orr     x26, x26, x0
        cmp     w19, w22
        bne     .L5
        ldr     x0, [sp, 112]
        str     x26, [x0, x25]
        add     x25, x25, 8
        ldr     x0, [sp, 136]
        cmp     x0, x25
        bne     .L6
.L17:
        ldp     x21, x22, [sp, 16]
        ldp     x23, x24, [sp, 32]
        ldp     x25, x26, [sp, 48]
.L1:
        ldp     x27, x28, [sp, 64]
        ldr     x30, [sp, 80]
        ldp     x19, x20, [sp], 144
        ret
.L3:
        str     xzr, [sp, 104]
        ldp     x0, x1, [sp, 104]
        ldr     x24, [x1, x0]
        ldr     x1, [sp, 120]
        ldr     x25, [x1, x0]
        ldr     x1, [sp, 128]
        ldr     x22, [x1, x0]
        cbz     w19, .L11
.L20:
        mov     w26, 0
        mov     w28, 0
        mov     x23, 0
.L8:
        mov     x2, x22
        mov     x1, x25
        mov     x0, x24
        mov     w4, w20
        mov     w3, w21
        add     w26, w26, 1
        bl      do_op
        lsr     x24, x24, x27
        lsl     x0, x0, x28
        add     w28, w28, w27
        orr     x23, x23, x0
        lsr     x25, x25, x27
        lsr     x22, x22, x27
        cmp     w19, w26
        bne     .L8
        ldp     x0, x1, [sp, 104]
        str     x23, [x1, x0]
        add     x0, x0, 8
        ldr     x1, [sp, 136]
        str     x0, [sp, 104]
        cmp     x1, x0
        beq     .L17
.L19:
        ldp     x0, x1, [sp, 104]
        ldr     x24, [x1, x0]
        ldr     x1, [sp, 120]
        ldr     x25, [x1, x0]
        ldr     x1, [sp, 128]
        ldr     x22, [x1, x0]
        cbnz    w19, .L20
.L11:
        ldp     x0, x1, [sp, 104]
        mov     x23, 0
        str     x23, [x1, x0]
        add     x0, x0, 8
        ldr     x1, [sp, 136]
        str     x0, [sp, 104]
        cmp     x1, x0
        bne     .L19
        b       .L17
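
To run the test case locally rather than only on godbolt, a small driver
along the following lines may help. It is only a sketch: the report leaves
do_op undefined, so the do_op below is a hypothetical stand-in (a lane-wise
add that ignores out0 and opcode), and run_lanes and check_driver are
illustrative helper names that do not appear in the original post.

```c
#include <assert.h>

typedef unsigned int u32;
typedef unsigned long long u64;

/* Hypothetical stand-in for the external do_op of the report:
   adds the lowest lane of in0 and in1, truncated to the lane width. */
u64 do_op(u64 out0, u64 in0, u64 in1, u32 opcode, int size)
{
    int bits = 8 << size;                       /* 8, 16, 32 or 64 */
    u64 mask = (bits < 64) ? ((1ULL << bits) - 1) : ~0ULL;
    (void)out0;                                 /* unused in this sketch */
    (void)opcode;                               /* unused in this sketch */
    return (in0 + in1) & mask;
}

/* calc_loop as given in the report. */
void calc_loop(u64 *optr, u64 *iptr0, u64 *iptr1, u32 opcode, int size, int len)
{
    u64 o0, i0, i1, val, result;
    int num, shift, pos;
    int i, j;

    num = 8 >> size;                            /* 8,4,2,1 */
    shift = 8 << size;                          /* 8,16,32,64 */
    for (i = 0; i < len; i++) {
        o0 = optr[i];
        i0 = iptr0[i];
        i1 = iptr1[i];
        result = 0;
        pos = 0;
        for (j = 0; j < num; j++) {
            val = do_op(o0, i0, i1, opcode, size);
            result = result | (val << pos);
            pos += shift;
            if (shift < 64) {
                o0 = o0 >> shift;
                i0 = i0 >> shift;
                i1 = i1 >> shift;
            }
        }
        optr[i] = result;
    }
}

/* Illustrative helper: run one element through calc_loop for a given size. */
u64 run_lanes(int size)
{
    u64 out[1] = { 0 };
    u64 a[1] = { 0x01020304050607ffULL };
    u64 b[1] = { 0x0101010101010101ULL };
    calc_loop(out, a, b, 0, size, 1);
    return out[0];
}

/* Illustrative self-check for the stand-in do_op. */
void check_driver(void)
{
    /* size 0: eight 8-bit lanes, the low lane wraps from 0xff to 0x00 */
    assert(run_lanes(0) == 0x0203040506070800ULL);
    /* size 3: one 64-bit lane, so the carry propagates into the next byte */
    assert(run_lanes(3) == 0x0203040506070900ULL);
}
```

Note that with do_op defined in the same translation unit the compiler may
inline it, which changes the generated code considerably. To look at the
register allocation discussed above, keeping do_op in a separate file (as in
the original test case) is probably the safer choice.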