From mboxrd@z Thu Jan  1 00:00:00 1970
From: Sebastian Huber <sebastian.huber@embedded-brains.de>
To: gcc-patches@gcc.gnu.org
Subject: [gcov v2 14/14] gcov: Add section for freestanding environments
Date: Mon, 25 Apr 2022 09:09:29 +0200
Message-Id: <20220425070929.7466-15-sebastian.huber@embedded-brains.de>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220425070929.7466-1-sebastian.huber@embedded-brains.de>
References: <20220425070929.7466-1-sebastian.huber@embedded-brains.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

gcc/

	* doc/gcov.texi (Profiling and Test Coverage in Freestanding
	Environments): New section.
---
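
Note (not part of the patch): the tutorial's encode() helper in app.h writes
each byte as two characters in the range 'a' to 'p', low nibble first.  The
following stand-alone sketch copies encode() from the tutorial and adds two
hypothetical helpers, decode() and check_round_trip(), which are names chosen
here for illustration only.  It is meant as a quick sanity check that the
encoding survives a round trip for all 256 byte values, under the assumption
that the decoder uses the same arithmetic as application() in app.c.

```c
#include <assert.h>

static const unsigned char a = 'a';

/* Copied from the tutorial's app.h: low nibble first, then high nibble,
   each mapped to one of the 16 characters 'a' to 'p'.  */
static inline unsigned char *
encode (unsigned char c, unsigned char buf[2])
{
  buf[0] = c % 16 + a;
  buf[1] = (c / 16) % 16 + a;
  return buf;
}

/* Hypothetical helper, not in the patch: inverse of encode(), using the
   same arithmetic as the fputc() call in application().  */
static inline unsigned char
decode (const unsigned char buf[2])
{
  return (unsigned char) ((buf[0] - a) + 16 * (buf[1] - a));
}

/* Hypothetical self-check: asserts that every byte value is encoded with
   characters in 'a'..'p' and survives an encode/decode round trip.  */
static void
check_round_trip (void)
{
  unsigned char buf[2];

  for (unsigned v = 0; v < 256; ++v)
    {
      encode ((unsigned char) v, buf);
      assert (buf[0] >= 'a' && buf[0] <= 'p');
      assert (buf[1] >= 'a' && buf[1] <= 'p');
      assert (decode (buf) == v);
    }
}
```

Calling check_round_trip() from a small test main() is enough to exercise it.
For example, the byte 0x3c is encoded as the two characters "md".
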
 gcc/doc/gcov.texi | 375 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 375 insertions(+)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index fc39da0f02d..751a11314f3 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -41,6 +41,8 @@ test code coverage in your programs.
 * Gcov and Optimization::       Using gcov with GCC optimization.
 * Gcov Data Files::             The files used by gcov.
 * Cross-profiling::             Data file relocation.
+* Freestanding Environments::   How to use profiling and test
+                                coverage in freestanding environments.
 @end menu
=20
 @node Gcov Intro
@@ -971,3 +973,376 @@ setting will name the data file @file{/target/run/build/foo.gcda}.
 You must move the data files to the expected directory tree in order to
 use them for profile directed optimizations (@option{-fprofile-use}), or to
 use the @command{gcov} tool.
+
+@node Freestanding Environments
+@section Profiling and Test Coverage in Freestanding Environments
+
+If your application runs in a hosted environment such as GNU/Linux, then this
+section is likely not relevant to you.  This section is intended for
+application developers targeting freestanding environments (for example
+embedded systems) with limited resources, in particular systems or test cases
+which do not support constructors/destructors or the C library file I/O.  In
+this section, the @dfn{target system} runs your application instrumented for
+profiling or test coverage.  You develop and analyze your application on the
+@dfn{host system}.  We now give an overview of how profiling and test coverage
+can be obtained in this scenario, followed by a tutorial which can be exercised
+on the host system.  Finally, some system initialization caveats are listed.
+
+@subsection Overview
+
+For an application instrumented for profiling or test coverage, the compiler
+generates some global data structures which are updated by instrumentation code
+while the application runs.  These data structures are called the @dfn{gcov
+information}.  Normally, when the application exits, the gcov information is
+stored to @file{.gcda} files.  There is one file per translation unit
+instrumented for profiling or test coverage.  The function
+@code{__gcov_exit()}, which stores the gcov information to a file, is called by
+a global destructor function for each translation unit instrumented for
+profiling or test coverage.  It runs at process exit.  In a global constructor
+function, the @code{__gcov_init()} function is called to register the gcov
+information of a translation unit in a global list.  In some situations, this
+procedure does not work.  Firstly, if you want to profile the global
+constructor or exit processing of an operating system, the compiler generated
+functions may conflict with the test objectives.  Secondly, you may want to
+test early parts of the system initialization or abnormal program behaviour
+which do not allow a global constructor or exit processing.  Thirdly, you need
+a filesystem to store the files.
+
+The @option{-fprofile-info-section} GCC option enables you to use profiling and
+test coverage in freestanding environments.  This option disables the use of
+global constructors and destructors for the gcov information.  Instead, a
+pointer to the gcov information is stored in a special linker input section for
+each translation unit which is compiled with this option.  By default, the
+section name is @code{.gcov_info}.  The gcov information is statically
+initialized.  The pointers to the gcov information from all translation units
+of an executable can be collected by the linker in a contiguous memory block.
+For the GNU linker, the following linker script output section definition can
+be used to achieve this:
+
+@smallexample
+  .gcov_info      :
+  @{
+    PROVIDE (__gcov_info_start = .);
+    KEEP (*(.gcov_info))
+    PROVIDE (__gcov_info_end = .);
+  @}
+@end smallexample
+
+The linker will provide two global symbols, @code{__gcov_info_start} and
+@code{__gcov_info_end}, which define the start and end of the array of pointers
+to gcov information blocks, respectively.  The @code{KEEP ()} directive is
+required to prevent garbage collection of the pointers, since they are not
+directly referenced by anything in the executable.  The section may be placed
+in a read-only memory area.
+
+In order to transfer the profiling and test coverage data from the target to
+the host system, the application has to provide a function to produce a
+reliable in-order byte stream from the target to the host.  The byte stream may
+be compressed and encoded using error detection and correction codes to meet
+application-specific requirements.  The GCC-provided @file{libgcov} target
+library provides two functions, @code{__gcov_info_to_gcda()} and
+@code{__gcov_filename_to_gcfn()}, to generate a byte stream from a gcov
+information block.  The functions are declared in the header @file{gcov.h}.
+The byte stream can be deserialized by the @command{merge-stream} subcommand of
+the @command{gcov-tool} to create or update @file{.gcda} files in the host
+filesystem for the instrumented application.
+
+@subsection Tutorial
+
+This tutorial should be exercised on the host system.  We will build a program
+instrumented for test coverage.  The program runs an application and dumps the
+gcov information to @file{stderr} encoded as a printable character stream.  The
+application simply decodes such character streams from @file{stdin} and writes
+the decoded character stream to @file{stdout} (warning: this is binary data).
+The decoded character stream is consumed by the @command{merge-stream}
+subcommand of the @command{gcov-tool} to create or update the @file{.gcda}
+files.
+
+To get started, create an empty directory.  Change into the new directory.
+Create a header file @file{app.h} with the following content:
+
+@smallexample
+static const unsigned char a = 'a';
+
+static inline unsigned char *
+encode (unsigned char c, unsigned char buf[2])
+@{
+  buf[0] = c % 16 + a;
+  buf[1] = (c / 16) % 16 + a;
+  return buf;
+@}
+
+extern void application (void);
+@end smallexample
+
+Create a source file @file{app.c} with the following content:
+
+@smallexample
+#include "app.h"
+
+#include <stdio.h>
+
+/* The application reads a character stream encoded by encode() from stdin,
+   decodes it, and writes the decoded characters to stdout.  Characters other
+   than the 16 characters 'a' to 'p' are ignored.  */
+
+static int can_decode (unsigned char c)
+@{
+  return (unsigned char)(c - a) < 16;
+@}
+
+void
+application (void)
+@{
+  int first = 1;
+  int i;
+  unsigned char c;
+
+  while ((i = fgetc (stdin)) != EOF)
+    @{
+      unsigned char x = (unsigned char)i;
+
+      if (can_decode (x))
+        @{
+          if (first)
+            c = x - a;
+          else
+            fputc (c + 16 * (x - a), stdout);
+          first = !first;
+        @}
+      else
+        first = 1;
+    @}
+@}
+@end smallexample
+
+Create a source file @file{main.c} with the following content:
+
+@smallexample
+#include "app.h"
+
+#include <gcov.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+/* The start and end symbols are provided by the linker script.  We use the
+   array notation to avoid issues with a potential small-data area.  */
+
+extern const struct gcov_info *const __gcov_info_start[];
+extern const struct gcov_info *const __gcov_info_end[];
+
+/* This function shall produce a reliable in-order byte stream to transfer the
+   gcov information from the target to the host system.  */
+
+static void
+dump (const void *d, unsigned n, void *arg)
+@{
+  (void)arg;
+  const unsigned char *c = d;
+  unsigned char buf[2];
+
+  for (unsigned i = 0; i < n; ++i)
+    fwrite (encode (c[i], buf), sizeof (buf), 1, stderr);
+@}
+
+/* The filename is serialized to a gcfn data stream by the
+   __gcov_filename_to_gcfn() function.  The gcfn data is used by the
+   "merge-stream" subcommand of the "gcov-tool" to figure out the filename
+   associated with the gcov information.  */
+
+static void
+filename (const char *f, void *arg)
+@{
+  __gcov_filename_to_gcfn (f, dump, arg);
+@}
+
+/* The __gcov_info_to_gcda() function may have to allocate memory under
+   certain conditions.  Simply try out whether it is needed for your
+   application or not.  */
+
+static void *
+allocate (unsigned length, void *arg)
+@{
+  (void)arg;
+  return malloc (length);
+@}
+
+/* Dump the gcov information of all translation units.  */
+
+static void
+dump_gcov_info (void)
+@{
+  const struct gcov_info *const *info = __gcov_info_start;
+  const struct gcov_info *const *end = __gcov_info_end;
+
+  /* Obfuscate variable to prevent compiler optimizations.  */
+  __asm__ ("" : "+r" (info));
+
+  while (info != end)
+  @{
+    void *arg = NULL;
+    __gcov_info_to_gcda (*info, filename, dump, allocate, arg);
+    fputc ('\n', stderr);
+    ++info;
+  @}
+@}
+
+/* The main() function just runs the application and then dumps the gcov
+   information to stderr.  */
+
+int
+main (void)
+@{
+  application ();
+  dump_gcov_info ();
+  return 0;
+@}
+@end smallexample
+
+If we compile @file{app.c} with test coverage and no extra profiling options,
+then a global constructor (@code{_sub_I_00100_0} here, it may have a different
+name in your environment) and destructor (@code{_sub_D_00100_1}) are used to
+register and dump the gcov information.  We also see undefined references to
+@code{__gcov_init} and @code{__gcov_exit}:
+
+@smallexample
+$ gcc -ftest-coverage -fprofile-arcs -c app.c
+$ nm app.o
+0000000000000000 r a
+0000000000000030 T application
+0000000000000000 t can_decode
+                 U fgetc
+                 U fputc
+0000000000000000 b __gcov0.application
+0000000000000038 b __gcov0.can_decode
+0000000000000000 d __gcov_.application
+00000000000000c0 d __gcov_.can_decode
+                 U __gcov_exit
+                 U __gcov_init
+                 U __gcov_merge_add
+                 U stdin
+                 U stdout
+0000000000000161 t _sub_D_00100_1
+0000000000000151 t _sub_I_00100_0
+@end smallexample
+
+Compile @file{app.c} and @file{main.c} with test coverage and
+@option{-fprofile-info-section}.  Now, a read-only pointer-sized object is
+present in the @code{.gcov_info} section and there are no undefined references
+to @code{__gcov_init} and @code{__gcov_exit}:
+
+@smallexample
+$ gcc -ftest-coverage -fprofile-arcs -fprofile-info-section -c main.c
+$ gcc -ftest-coverage -fprofile-arcs -fprofile-info-section -c app.c
+$ objdump -h app.o
+
+app.o:     file format elf64-x86-64
+
+Sections:
+Idx Name          Size      VMA               LMA               File off  Algn
+  0 .text         00000151  0000000000000000  0000000000000000  00000040  2**0
+                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
+  1 .data         00000100  0000000000000000  0000000000000000  000001a0  2**5
+                  CONTENTS, ALLOC, LOAD, RELOC, DATA
+  2 .bss          00000040  0000000000000000  0000000000000000  000002a0  2**5
+                  ALLOC
+  3 .rodata       0000003c  0000000000000000  0000000000000000  000002a0  2**3
+                  CONTENTS, ALLOC, LOAD, READONLY, DATA
+  4 .gcov_info    00000008  0000000000000000  0000000000000000  000002e0  2**3
+                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
+  5 .comment      0000004e  0000000000000000  0000000000000000  000002e8  2**0
+                  CONTENTS, READONLY
+  6 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00000336  2**0
+                  CONTENTS, READONLY
+  7 .eh_frame     00000058  0000000000000000  0000000000000000  00000338  2**3
+                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
+@end smallexample
+
+We have to customize the program link process so that all the @code{.gcov_info}
+linker input sections are placed in a contiguous memory block with a begin and
+end symbol.  Firstly, get the default linker script using the following
+commands (we assume a GNU linker):
+
+@smallexample
+$ ld --verbose | sed '1,/^===/d' | sed '/^===/d' > linkcmds
+@end smallexample
+
+Secondly, open the file @file{linkcmds} with a text editor and place the linker
+output section definition from the overview after the @code{.rodata} section
+definition.  Link the program executable using the customized linker script:
+
+@smallexample
+$ gcc main.o app.o -T linkcmds -lgcov -Wl,-Map,app.map
+@end smallexample
+
+In the linker map file @file{app.map}, we see that the linker placed the
+read-only pointer-sized objects of our object files @file{main.o} and
+@file{app.o} into a contiguous memory block and provided the symbols
+@code{__gcov_info_start} and @code{__gcov_info_end}:
+
+@smallexample
+$ grep -C 1 "\.gcov_info" app.map
+
+.gcov_info      0x0000000000403ac0       0x10
+                0x0000000000403ac0                PROVIDE (__gcov_info_start = .)
+ *(.gcov_info)
+ .gcov_info     0x0000000000403ac0        0x8 main.o
+ .gcov_info     0x0000000000403ac8        0x8 app.o
+                0x0000000000403ad0                PROVIDE (__gcov_info_end = .)
+@end smallexample
+
+Make sure no @file{.gcda} files are present.  Run the program with nothing to
+decode and dump @file{stderr} to the file @file{gcda-0.txt} (first run).  Run
+the program to decode @file{gcda-0.txt} and send it to the @command{gcov-tool}
+using the @command{merge-stream} subcommand to create the @file{.gcda} files
+(second run).  Run @command{gcov} to produce a report for @file{app.c}.  We see
+that the first run with nothing to decode resulted in a partially covered
+application:
+
+@smallexample
+$ rm -f app.gcda main.gcda
+$ echo "" | ./a.out 2>gcda-0.txt
+$ ./a.out <gcda-0.txt 2>gcda-1.txt | gcov-tool merge-stream
+$ gcov -bc app.c
+File 'app.c'
+Lines executed:69.23% of 13
+Branches executed:66.67% of 6
+Taken at least once:50.00% of 6
+Calls executed:66.67% of 3
+Creating 'app.c.gcov'
+
+Lines executed:69.23% of 13
+@end smallexample
+
+Run the program to decode @file{gcda-1.txt} and send it to the
+@command{gcov-tool} using the @command{merge-stream} subcommand to update the
+@file{.gcda} files.  Run @command{gcov} to produce a report for @file{app.c}.
+Since the second run decoded the gcov information of the first run, we now have
+a fully covered application:
+
+@smallexample
+$ ./a.out <gcda-1.txt 2>gcda-2.txt | gcov-tool merge-stream
+$ gcov -bc app.c
+File 'app.c'
+Lines executed:100.00% of 13
+Branches executed:100.00% of 6
+Taken at least once:100.00% of 6
+Calls executed:100.00% of 3
+Creating 'app.c.gcov'
+
+Lines executed:100.00% of 13
+@end smallexample
+
+@subsection System Initialization Caveats
+
+The gcov information of a translation unit consists of several global data
+structures.  For example, the instrumented code may update program flow graph
+edge counters in a zero-initialized data structure.  It is safe to run
+instrumented code before the zero-initialized data is cleared to zero.
+However, the coverage information obtained before the zero-initialized data is
+cleared to zero is unusable.  Dumping the gcov information using
+@code{__gcov_info_to_gcda()} before the zero-initialized data is cleared to
+zero or the initialized data is loaded is undefined behaviour.  Clearing the
+zero-initialized data to zero through a function instrumented for profiling or
+test coverage is undefined behaviour, since it may produce inconsistent program
+flow graph edge counters, for example.
-- 
2.34.1