From: Indu Bhagat <indu.bhagat@oracle.com>
To: binutils@sourceware.org
Subject: [PATCH,RFC 0/7] Definition and Implementation of CTF Frame format
Date: Fri,  6 May 2022 17:52:16 -0700
Message-ID: <20220507005223.3093035-1-indu.bhagat@oracle.com>

Hello folks,

This patch series is to invite comments, feedback and discussion on the CTF
Frame format.

At the GNU Tools Track at LPC 2021 
(https://linuxplumbersconf.org/event/11/contributions/1003/), it was proposed
to create a lightweight and compact unwind format intended to be kept in
binaries and to be used by online debugging tools like stack unwinders.  We are
calling this format "CTF Frame".  Some notes around the motivation and use case
of the CTF Frame format were previously also discussed here
https://sourceware.org/pipermail/binutils/2021-December/118880.html.

This patch series contains:
- The definition of the CTF Frame format.  This is the binary representation of
  what goes in the .ctf_frame section and is generated using the CFI assembler
  directives.
- A small, simple library, libctfframe, that provides encoders and decoders to
  write and read the CTF Frame binary data.
- Implementation in assembler and linker to create and merge .ctf_frame
  sections.
- Support for CTF Frame in readelf and objdump.
- A very simple example online unwinder that is based on CTF Frame.  For
  simplicity, it provides the same interface as the backtrace(3) glibc call.
- For completeness, build system changes for gdb/ and sim/.  These changes are
  necessary to accommodate the newly created libctfframe library.

CTF Frame format
----------------
CTF Frame format provides basic unwind information for programs and libraries
via a .ctf_frame section. The .ctf_frame section is an allocatable, loadable
data section in a segment of its own (new segment introduced as
PT_GNU_CTF_FRAME) in ELF linked binaries. The CTF Frame format specifies the
minimal necessary unwind information, i.e., information needed to recover only
the CFA and the return address (RA) for all insns of a program.  This
information is a subset of what .eh_frame can convey: .eh_frame specifies how
to resurrect all callee-saved registers, if need be.

CTF Frame format is independent of the CTF type format: a .ctf_frame section
can exist without a .ctf section, because type information of the sort provided
by CTF is not needed to unwind stack frames.  The term "CTF" in the name,
however, is a deliberate choice, as the CTF Frame format aligns well with the
core principles of the CTF debug format: compact and simple.  Further, in the
future, there will be support to gather the C types and the original values of
function arguments for generating more meaningful backtraces.  Some of this
information will come from the .ctf section, but the unwind information in the
.ctf_frame section will remain usable with or without a .ctf section.  In other
words, that future functionality will rely on additional section(s) that
establish the relationship between the unwind info in the .ctf_frame section and
the type info in the .ctf section.

A CTF Frame section consists of a CTF Frame header, a list of CTF FDEs (CTF
Frame Description Entries) and a list of CTF FREs (CTF Frame Row Entries). A
CTF FRE serves a set of consecutive PCs, all of which have the same unwind
information. Together with the associated CTF FDE, each CTF FRE forms a 
self-sufficient record to recover the CFA and RA of the set of PCs served by
the CTF FRE. For more details on the CTF Frame format, please see the
ctf-frame.h header file included in this series.  The CTF Frame format supports
x86_64 and aarch64 at this time.
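
To make the FDE/FRE relationship concrete, here is a minimal sketch in C of
how one row entry could be looked up and applied.  The struct layouts, field
names and the demo_* functions are hypothetical and chosen only for
illustration; the actual definitions are in ctf-frame.h and may well differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration only; not those of ctf-frame.h.  */
enum demo_base { DEMO_BASE_SP, DEMO_BASE_FP };

struct demo_fre
{
  uint32_t start_off;    /* First PC covered, as offset from function start.  */
  enum demo_base base;   /* Register the CFA is computed from.  */
  int32_t cfa_off;       /* CFA = base register + cfa_off.  */
  int32_t ra_off;        /* The RA is stored at CFA + ra_off.  */
};

struct demo_fde
{
  uint64_t start_pc;            /* Function start address.  */
  uint32_t size;                /* Function size in bytes.  */
  const struct demo_fre *fres;  /* Rows, sorted by start_off.  */
  size_t num_fres;
};

/* Return the row covering PC, or NULL if PC is outside the function.
   A row covers the PCs from its start_off up to the next row's start_off.  */
const struct demo_fre *
demo_find_fre (const struct demo_fde *fde, uint64_t pc)
{
  if (pc < fde->start_pc || pc - fde->start_pc >= fde->size)
    return NULL;

  uint32_t off = (uint32_t) (pc - fde->start_pc);
  const struct demo_fre *hit = NULL;
  for (size_t i = 0; i < fde->num_fres; i++)
    {
      if (fde->fres[i].start_off > off)
        break;
      hit = &fde->fres[i];
    }
  return hit;
}

/* From one row and the current SP/FP values, compute the CFA and the
   address of the stack slot holding the return address.  */
void
demo_apply_fre (const struct demo_fre *fre, uint64_t sp, uint64_t fp,
                uint64_t *cfa, uint64_t *ra_slot)
{
  uint64_t base = fre->base == DEMO_BASE_SP ? sp : fp;
  *cfa = base + (uint64_t) (int64_t) fre->cfa_off;
  *ra_slot = *cfa + (uint64_t) (int64_t) fre->ra_off;
}
```

In this sketch an unwinder would call demo_find_fre () with the current PC,
then demo_apply_fre () with the current register values, and read the RA from
the returned slot address to continue with the caller's frame.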

Compared to the size of the unwind information in .eh_frame (plus
.eh_frame_hdr), .ctf_frame sections are currently slightly larger on x86_64 and
noticeably smaller on aarch64.  Here is the ratio of the size of .ctf_frame to
the size of .eh_frame + .eh_frame_hdr on x86_64 and aarch64 for some randomly
chosen programs in binutils:

ratio = (.ctf_frame / (.eh_frame+.eh_frame_hdr))

---------------------------------------------------
program    |  [x86_64] ratio  |  [aarch64] ratio
---------------------------------------------------
addr2line  |       1.13       |    0.67
ar         |       1.03       |    0.70
as         |       1.00       |    0.73
c++filt    |       1.08       |    0.64
elfedit    |       1.03       |    0.66
gprof      |       1.06       |    0.71
ld         |       1.02       |    0.73
nm         |       1.07       |    0.69
objcopy    |       1.08       |    0.72
objdump    |       1.08       |    0.73
size       |       1.10       |    0.67
strings    |       1.09       |    0.67
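
The per-architecture averages of the table above can be checked with a few
lines of C.  This is only a sketch: the array and function names are made up
for the example, and it takes the plain arithmetic mean of the per-program
ratios, which is not the same as the ratio of the total section sizes.

```c
#include <stddef.h>

/* Ratios copied from the table above, in row order.  */
const double x86_64_ratio[] = {
  1.13, 1.03, 1.00, 1.08, 1.03, 1.06, 1.02, 1.07, 1.08, 1.08, 1.10, 1.09
};
const double aarch64_ratio[] = {
  0.67, 0.70, 0.73, 0.64, 0.66, 0.71, 0.73, 0.69, 0.72, 0.73, 0.67, 0.67
};

/* Arithmetic mean of the N values in V; 0.0 for an empty array.  */
double
mean_ratio (const double *v, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += v[i];
  return n ? sum / (double) n : 0.0;
}
```

With the twelve rows above, mean_ratio () gives about 1.06 for x86_64 and
about 0.69 for aarch64.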

On average, for x86_64, the .ctf_frame sections are about 6% larger than the
.eh_frame* sections; for aarch64, they are about 31% smaller.  There exist ways
to improve .ctf_frame sizes on x86_64, which have not been explored in depth
yet.  One known cause of the bloat in .ctf_frame sections is the unwind
information for PLT entries.  The .eh_frame unwind information for the PLT
section on x86_64 is particularly compact, as it can be encoded using a DWARF
expression which evaluates
"rsp + 8 + ((((rip & 15)>= 11)? 1 : 0)<< 3)" for each value of rip in the PLT
section at run-time.  The CTF Frame format does not support expressions at all.
This causes bloat because successive PCs in a PLT entry have different unwind
rules, so each such PC gets a CTF Frame row entry of its own in the CTF Frame
section.  This issue can be resolved by designing a more compact encoding of
the CTF Frame Row Entries that exploits the "regular pattern" in the unwind
information for the PLT entries.
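
To illustrate the "regular pattern", the expression quoted above can be
written as a small C function and evaluated for every PC of one PLT slot.
This is a sketch under the assumption that PLT slots are 16 bytes and 16-byte
aligned (which the "rip & 15" term suggests); the function names are
hypothetical, and how gas actually maps the result to row entries is not shown
here.

```c
#include <stdint.h>

/* Offset of the CFA from rsp, as given by the quoted expression:
   8 + ((((rip & 15) >= 11) ? 1 : 0) << 3).  */
unsigned
plt_cfa_offset (uint64_t rip)
{
  return 8u + ((((rip & 15) >= 11) ? 1u : 0u) << 3);
}

/* Number of runs of consecutive PCs with the same CFA offset within the
   16-byte slot starting at SLOT.  */
unsigned
plt_rule_runs (uint64_t slot)
{
  unsigned runs = 1;
  for (uint64_t off = 1; off < 16; off++)
    if (plt_cfa_offset (slot + off) != plt_cfa_offset (slot + off - 1))
      runs++;
  return runs;
}
```

The rule is rsp + 8 for offsets 0 to 10 of a slot and rsp + 16 for offsets 11
to 15, and it repeats identically in every slot.  A single expression captures
that for the whole PLT section, while a format without expressions has to
record the change again for each PLT entry.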

High-level implementation notes
-------------------------------
Creation and Linking:
The creation of the .ctf_frame section is the responsibility of the assembler,
which creates the section from the .cfi_* directives emitted by the compiler.
If a .ctf_frame section is to be created, the compiler must emit ".ctf_frame"
in the .cfi_sections directive.  Because the CTF Frame format does not support
complex expressions, functions whose .cfi_* directives specify complex
expressions are skipped.  In practice, DWARF expressions are not very common
for CFA and RA recovery.

At link-time, the linker merges the input .ctf_frame sections by combining
their contents and sorting the FDEs on the start PC.  This helps an unwinder
retrieve the relevant unwind information quickly.
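
The benefit of sorting on the start PC is that an unwinder can use a binary
search to find the FDE for a given PC.  A minimal sketch in C, assuming a
simplified, hypothetical entry type (demo_range) rather than the real CTF FDE
layout, and using qsort(3) in place of whatever the linker does internally:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified entry: just the PC range of one function.  */
struct demo_range
{
  uint64_t start_pc;
  uint64_t size;
};

/* qsort comparator ordering entries by ascending start PC.  */
int
demo_cmp_start (const void *a, const void *b)
{
  const struct demo_range *x = a, *y = b;
  return (x->start_pc > y->start_pc) - (x->start_pc < y->start_pc);
}

/* Sort the N entries in V by start PC.  */
void
demo_sort (struct demo_range *v, size_t n)
{
  qsort (v, n, sizeof *v, demo_cmp_start);
}

/* Binary search for the entry containing PC; NULL if there is none.
   V must be sorted by start_pc and its ranges must not overlap.  */
const struct demo_range *
demo_lookup (const struct demo_range *v, size_t n, uint64_t pc)
{
  size_t lo = 0, hi = n;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (v[mid].start_pc <= pc)
        lo = mid + 1;
      else
        hi = mid;
    }
  /* LO is now the number of entries whose start_pc is <= PC.  */
  if (lo == 0)
    return NULL;
  const struct demo_range *c = &v[lo - 1];
  return pc - c->start_pc < c->size ? c : NULL;
}
```

The lookup takes O(log n) steps, where a linear scan over an unsorted list
would take O(n).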

The libctfframe library:
The linker uses a library called libctfframe which provides the basic
functionality needed to decode a .ctf_frame section, and also to encode and
eventually write out the .ctf_frame in its binary format.

Unwinder:
A basic unwinder based on CTF Frame format is relatively simple to write.  We
are providing one as an example and proof of concept.
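
For reference, the interface being mirrored is that of glibc's backtrace(3).
The sketch below calls the glibc functions themselves, since the name of the
CTF Frame based entry point is not given in this message; demo_print_backtrace
and DEMO_MAX_FRAMES are names made up for the example.

```c
#include <execinfo.h>
#include <unistd.h>

#define DEMO_MAX_FRAMES 64

/* Capture the return addresses of the current call chain with glibc's
   backtrace(3), print them to stderr, and return the number captured.  */
int
demo_print_backtrace (void)
{
  void *frames[DEMO_MAX_FRAMES];
  int n = backtrace (frames, DEMO_MAX_FRAMES);
  backtrace_symbols_fd (frames, n, STDERR_FILENO);
  return n;
}
```

An unwinder with the same interface would fill a caller-supplied buffer of
return addresses in the same way and return how many it stored.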

Testing
-------
Testing on x86_64 and aarch64 looks promising so far: basic unwinding is
working.  This patch series is work in progress.  There remain open issues to
be resolved, crude code stubs to be revisited, and testsuites to be added.
These will be taken care of in subsequent iterations of the series.
 
Please comment and provide feedback; it will help shape the format.  Here are a
few of the aspects that particularly need discussion:

1. What is a good place for an unwinder based on the CTF Frame format?
Currently, to facilitate discussion, it is presented in a library of its own,
libctfbacktrace, which in turn uses the libctfframe library to decode the
.ctf_frame section for unwinding.  We brainstormed a bit about possible
candidates: libbacktrace, libgcc or libunwind.  Are there any recommendations?
2. Are there some ideas for smartly dealing with the issue of bloat caused by
the CTF Frame unwind information for the PLT entries?
 
Thanks,

Indu Bhagat (5):
  ctf-frame.h: Add CTF Frame format definition
  gas: generate .ctf_frame
  bfd: linker: merge .ctf_frame sections
  readelf/objdump: support for CTF Frame section
  gdb: sim: buildsystem changes to accommodate libctfframe

Weimin Pan (2):
  libctfframe: add the CTF Frame library
  unwinder: generate backtrace using CTF Frame format

 Makefile.def                       |     5 +
 Makefile.in                        |  1289 ++-
 bfd/Makefile.am                    |     6 +-
 bfd/Makefile.in                    |     7 +-
 bfd/bfd-in2.h                      |     1 +
 bfd/configure                      |     2 +-
 bfd/configure.ac                   |     2 +-
 bfd/elf-bfd.h                      |    55 +
 bfd/elf-ctf-frame.c                |   490 +
 bfd/elf.c                          |    31 +
 bfd/elf64-x86-64.c                 |    97 +-
 bfd/elflink.c                      |    52 +
 bfd/elfxx-x86.c                    |   303 +-
 bfd/elfxx-x86.h                    |    46 +
 bfd/section.c                      |     1 +
 binutils/Makefile.am               |    10 +-
 binutils/Makefile.in               |    10 +-
 binutils/doc/binutils.texi         |     4 +
 binutils/doc/ctfframe.options.texi |    10 +
 binutils/objdump.c                 |    74 +
 binutils/readelf.c                 |    44 +
 configure                          |     2 +-
 configure.ac                       |     2 +-
 gas/Makefile.am                    |     3 +
 gas/Makefile.in                    |    22 +-
 gas/as.h                           |    10 +-
 gas/config/tc-aarch64.c            |    42 +
 gas/config/tc-aarch64.h            |    29 +
 gas/config/tc-i386.c               |    46 +
 gas/config/tc-i386.h               |    26 +
 gas/config/tc-xtensa.c             |     1 +
 gas/ctffreopt.c                    |   158 +
 gas/dw2gencfi.c                    |    30 +-
 gas/dw2gencfi.h                    |     1 +
 gas/gen-ctf-frame.c                |  1188 +++
 gas/gen-ctf-frame.h                |   142 +
 gas/write.c                        |    13 +
 gdb/Makefile.in                    |     8 +-
 gdb/acinclude.m4                   |     4 +-
 gdb/configure                      |    35 +-
 gdb/configure.ac                   |    11 +
 include/ctf-backtrace-api.h        |    57 +
 include/ctf-frame-api.h            |   213 +
 include/ctf-frame.h                |   257 +
 include/elf/common.h               |     1 +
 include/elf/internal.h             |     1 +
 ld/ld.texi                         |     4 +-
 ld/scripttempl/elf.sc              |     2 +
 libctfframe/Makefile.am            |    41 +
 libctfframe/Makefile.in            |   966 ++
 libctfframe/aclocal.m4             |  1241 +++
 libctfframe/config.h.in            |   144 +
 libctfframe/configure              | 15118 +++++++++++++++++++++++++++
 libctfframe/configure.ac           |    75 +
 libctfframe/ctf-backtrace-err.c    |    46 +
 libctfframe/ctf-backtrace.c        |   617 ++
 libctfframe/ctf-frame-dump.c       |   162 +
 libctfframe/ctf-frame-error.c      |    49 +
 libctfframe/ctf-frame-impl.h       |    55 +
 libctfframe/ctf-frame.c            |  1515 +++
 libctfframe/ttest.c                |    78 +
 sim/common/Make-common.in          |     7 +-
 62 files changed, 24914 insertions(+), 47 deletions(-)
 create mode 100644 bfd/elf-ctf-frame.c
 create mode 100644 binutils/doc/ctfframe.options.texi
 create mode 100644 gas/ctffreopt.c
 create mode 100644 gas/gen-ctf-frame.c
 create mode 100644 gas/gen-ctf-frame.h
 create mode 100644 include/ctf-backtrace-api.h
 create mode 100644 include/ctf-frame-api.h
 create mode 100644 include/ctf-frame.h
 create mode 100644 libctfframe/Makefile.am
 create mode 100644 libctfframe/Makefile.in
 create mode 100644 libctfframe/aclocal.m4
 create mode 100644 libctfframe/config.h.in
 create mode 100755 libctfframe/configure
 create mode 100644 libctfframe/configure.ac
 create mode 100644 libctfframe/ctf-backtrace-err.c
 create mode 100644 libctfframe/ctf-backtrace.c
 create mode 100644 libctfframe/ctf-frame-dump.c
 create mode 100644 libctfframe/ctf-frame-error.c
 create mode 100644 libctfframe/ctf-frame-impl.h
 create mode 100644 libctfframe/ctf-frame.c
 create mode 100644 libctfframe/ttest.c

-- 
2.31.1

