From: "paulf at free dot fr" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/111797] New: Code generation of -march=znver2 -O3 includes frame pointer
Date: Fri, 13 Oct 2023 11:09:58 +0000
Message-ID: <bug-111797-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111797

            Bug ID: 111797
           Summary: Code generation of -march=znver2 -O3 includes frame
                    pointer
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: paulf at free dot fr
  Target Milestone: ---

I was a bit surprised recently when I (unintentionally) ran perf record on the
executable that I work on, using an -O3 build without -fno-omit-frame-pointer,
and I could still see the call stacks.

The function prolog that I see is

0000000000000000 <function>:
       0:       4c 8d 54 24 08          lea    0x8(%rsp),%r10
       5:       48 83 e4 e0             and    $0xffffffffffffffe0,%rsp
       9:       41 ff 72 f8             push   -0x8(%r10)
       d:       55                      push   %rbp
       e:       48 89 e5                mov    %rsp,%rbp
      11:       41 57                   push   %r15
      13:       41 56                   push   %r14
      15:       41 55                   push   %r13
      17:       41 54                   push   %r12
      19:       41 52                   push   %r10
      1b:       53                      push   %rbx
      1c:       49 89 ce                mov    %rcx,%r14
      1f:       48 81 ec 40 10 00 00    sub    $0x1040,%rsp

I asked on Stack Overflow and got pointed to this post:

https://stackoverflow.com/questions/45423338/whats-up-with-gcc-weird-stack-manipulation-when-it-wants-extra-stack-alignment

That problem seems to be fixed

https://godbolt.org/z/qc6fqb5hn

I can't post the source code as it is proprietary, and it doesn't seem to
reproduce with trivial examples (the function that I tried is 23kloc plus it
#includes other stuff).

I was able to reproduce the problem with the following steps (Valgrind chosen
because I'm one of the maintainers and I'm in the habit of building it).

git clone https://sourceware.org/git/valgrind.git march_zen2
cd march_zen2
./autogen.sh
./configure CFLAGS=-march=znver2
make -j 16
objdump -d --disassemble=mc_pre_clo_init .in_place/memcheck-amd64-linux | less

That shows

000000005800c220 <mc_pre_clo_init>:
    5800c220:   41 55                   push   %r13
    5800c222:   bf 8c 65 1d 58          mov    $0x581d658c,%edi
    5800c227:   4c 8d 6c 24 10          lea    0x10(%rsp),%r13
    5800c22c:   48 83 e4 e0             and    $0xffffffffffffffe0,%rsp
    5800c230:   41 ff 75 f8             push   -0x8(%r13)
    5800c234:   55                      push   %rbp
    5800c235:   48 89 e5                mov    %rsp,%rbp
    5800c238:   41 55                   push   %r13
    5800c23a:   48 83 ec 08             sub    $0x8,%rsp

which I believe illustrates the same problem.

mc_pre_clo_init looks like this


static void mc_pre_clo_init(void)
{
   VG_(details_name)            ("Memcheck");
   VG_(details_version)         (NULL);
   VG_(details_description)     ("a memory error detector");
   VG_(details_copyright_author)(
      "Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.");
   VG_(details_bug_reports_to)  (VG_BUGS_TO);

VG_ is a macro that implements a kind of C namespace. The functions are all
outputting the memcheck startup banner.
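
For readers unfamiliar with the idiom, here is a minimal sketch of a
namespace-style macro. It is an illustration only: NS_, NS_APPEND and the
demo_ prefix are hypothetical names, not Valgrind's actual definitions. The
macro pastes a fixed prefix onto every identifier, so NS_(details_name) becomes
the ordinary C symbol demo_details_name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-level macro: the extra level lets the prefix itself be
   a macro argument before token pasting takes place. */
#define NS_APPEND(prefix, name) prefix##name
#define NS_(name) NS_APPEND(demo_, name)

/* Expands to: static const char *demo_tool_name */
static const char *NS_(tool_name) = "unset";

/* Expands to: void demo_details_name(const char *name) */
void NS_(details_name)(const char *name)
{
    assert(name != NULL);
    NS_(tool_name) = name;
}

/* Expands to: const char *demo_get_name(void) */
const char *NS_(get_name)(void)
{
    return NS_(tool_name);
}
```

A call written as NS_(details_name) ("Memcheck") is therefore an ordinary call
to demo_details_name, which is why the disassembly shows plain prefixed symbols.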

I think I understand the need for a 32-byte-aligned stack and for shuffling
the return address. Is it really necessary to use the frame pointer as well?
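
As a point of comparison, here is a minimal sketch of a function that asks for
a 32-byte-aligned stack slot. It is an illustration only: sum8_aligned and
is_aligned_32 are hypothetical names, not taken from Valgrind or from the
proprietary code, and whether GCC realigns the stack through %rbp for this
exact function may depend on the GCC version and flags. The alignment request
is what makes the compiler emit an instruction such as
and $0xffffffffffffffe0,%rsp, because the x86-64 ABI only guarantees 16-byte
stack alignment at a call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, kept out of line so the buffer's address is really
   passed to another function instead of being optimised away. */
__attribute__((noinline))
static int is_aligned_32(const void *p)
{
    return ((uintptr_t)p % 32) == 0;
}

/* Copies 8 floats into a 32-byte-aligned scratch buffer on the stack and
   returns their sum. The over-aligned local is the part that requires
   more stack alignment than the ABI provides on entry. */
float sum8_aligned(const float *src)
{
    _Alignas(32) float scratch[8];

    memcpy(scratch, src, sizeof scratch);
    assert(is_aligned_32(scratch));

    float total = 0.0f;
    for (size_t i = 0; i < 8; i++)
        total += scratch[i];
    return total;
}
```

Compiling this with gcc -O3 -march=znver2 -S and reading the prologue of
sum8_aligned shows how the compiler sets up the aligned frame for a small
case.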
