public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v4 0/2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
@ 2015-03-19 23:22 Sergio Durigan Junior
  2015-03-19 23:22 ` [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter Sergio Durigan Junior
  2015-03-19 23:22 ` [PATCH 2/2] Documentation and testcase Sergio Durigan Junior
  0 siblings, 2 replies; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-19 23:22 UTC (permalink / raw)
  To: GDB Patches; +Cc: Pedro Alves, Sergio Durigan Junior

This version of the patches incorporates changes suggested by Pedro.

I simplified the patch a lot, and withdrawed some parts of it.  For
example, I am not including a new 'enum memory_mapping_state', nor am I
changing the current behavior/implementation of gcore_create_callback.
IOW, I focused my changes on the Linux portion of the patch, and on the
problem that I am trying to solve: make GDB take the value of
/proc/PID/coredump_filter into consideration when dumping a corefile.

As I explained in
<https://sourceware.org/ml/gdb-patches/2015-03/msg00607.html>, I still
think gcore_create_callback and its callers should be updated, but I
am not proposing this here anymore.  Hopefully, this will make the patch
easier to be approved.

Comments?

Sergio Durigan Junior (2):
  Implement support for checking /proc/PID/coredump_filter
  Documentation and testcase

 gdb/doc/gdb.texinfo                        |  33 +++
 gdb/linux-tdep.c                           | 408 +++++++++++++++++++++++++++--
 gdb/testsuite/gdb.base/coredump-filter.c   |  61 +++++
 gdb/testsuite/gdb.base/coredump-filter.exp | 128 +++++++++
 4 files changed, 602 insertions(+), 28 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/coredump-filter.c
 create mode 100644 gdb/testsuite/gdb.base/coredump-filter.exp

-- 
1.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/2] Documentation and testcase
  2015-03-19 23:22 [PATCH v4 0/2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Sergio Durigan Junior
  2015-03-19 23:22 ` [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter Sergio Durigan Junior
@ 2015-03-19 23:22 ` Sergio Durigan Junior
  2015-03-20 19:46   ` Pedro Alves
  1 sibling, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-19 23:22 UTC (permalink / raw)
  To: GDB Patches; +Cc: Pedro Alves, Sergio Durigan Junior

This patch extends the documentation to include the new feature, and
implements a new testcase for it.

Changes from v3:

  - Included 'bit 5' when mentioning the default value of
    /proc/PID/coredump_filter.

gdb/doc/ChangeLog:
2015-03-19  Sergio Durigan Junior  <sergiodj@redhat.com>

	PR corefiles/16092
	* gdb.texinfo (gcore): Mention new command 'set
	use-coredump-filter'.
	(set use-coredump-filter): Document new command.

gdb/testsuite/ChangeLog:
2015-03-19  Sergio Durigan Junior  <sergiodj@redhat.com>

	PR corefiles/16092
	* gdb.base/coredump-filter.c: New file.
	* gdb.base/coredump-filter.exp: Likewise.
---
 gdb/doc/gdb.texinfo                        |  33 ++++++++
 gdb/testsuite/gdb.base/coredump-filter.c   |  61 ++++++++++++++
 gdb/testsuite/gdb.base/coredump-filter.exp | 128 +++++++++++++++++++++++++++++
 3 files changed, 222 insertions(+)
 create mode 100644 gdb/testsuite/gdb.base/coredump-filter.c
 create mode 100644 gdb/testsuite/gdb.base/coredump-filter.exp

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 552da31..5382e91 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -10952,6 +10952,39 @@ specified, the file name defaults to @file{core.@var{pid}}, where
 
 Note that this command is implemented only for some systems (as of
 this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
+
+On @sc{gnu}/Linux, this command can take into account the value of the
+file @file{/proc/@var{pid}/coredump_filter} when generating the core
+dump (@pxref{set use-coredump-filter}).
+
+@kindex set use-coredump-filter
+@anchor{set use-coredump-filter}
+@item set use-coredump-filter on
+@itemx set use-coredump-filter off
+Enable or disable the use of the file
+@file{/proc/@var{pid}/coredump_filter} when generating core dump
+files.  This file is used by the Linux kernel to decide what types of
+memory mappings will be dumped or ignored when generating a core dump
+file.  @var{pid} is the process ID of a currently running process.
+
+
+To make use of this feature, you have to write in the
+@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
+which is a bit mask representing the memory mapping types.  If a bit
+is set in the bit mask, then the memory mappings of the corresponding
+types will be dumped; otherwise, they will be ignored.  For more
+information about the bits that can be set in the
+@file{/proc/@var{pid}/coredump_filter} file, please refer to the
+manpage of @code{core(5)}.
+
+By default, this option is @code{on}.  If this option is turned
+@code{off}, @value{GDBN} does not read the @file{coredump_filter} file
+and instead uses the same default value as the Linux kernel in order
+to decide which pages will be dumped in the core dump file.  This
+value is currently @code{0x33}, which means that bits @code{0}
+(anonymous private mappings), @code{1} (anonymous shared mappings),
+@code{4} (ELF headers) and @code{5} (private huge pages) are active.
+This will cause these memory mappings to be dumped automatically.
 @end table
 
 @node Character Sets
diff --git a/gdb/testsuite/gdb.base/coredump-filter.c b/gdb/testsuite/gdb.base/coredump-filter.c
new file mode 100644
index 0000000..192c469
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.c
@@ -0,0 +1,61 @@
+/* Copyright 2015 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <assert.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+
+static void *
+do_mmap (void *addr, size_t size, int prot, int flags, int fd, off_t offset)
+{
+  void *ret = mmap (addr, size, prot, flags, fd, offset);
+
+  assert (ret != NULL);
+  return ret;
+}
+
+int
+main (int argc, char *argv[])
+{
+  const size_t size = 10;
+  const int default_prot = PROT_READ | PROT_WRITE;
+  char *private_anon, *shared_anon;
+  char *dont_dump;
+  int i;
+
+  private_anon = do_mmap (NULL, size, default_prot,
+			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (private_anon, 0x11, size);
+
+  shared_anon = do_mmap (NULL, size, default_prot,
+			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+  memset (shared_anon, 0x22, size);
+
+  dont_dump = do_mmap (NULL, size, default_prot,
+		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (dont_dump, 0x55, size);
+  i = madvise (dont_dump, size, MADV_DONTDUMP);
+  assert_perror (errno);
+  assert (i == 0);
+
+  return 0; /* break-here */
+}
diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
new file mode 100644
index 0000000..b7eb74d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.exp
@@ -0,0 +1,128 @@
+# Copyright 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile debug] } {
+    untested "could not compile test program"
+    return -1
+}
+
+if { ![runto_main] } {
+    untested "could not run to main"
+    return -1
+}
+
+gdb_breakpoint [gdb_get_line_number "break-here"]
+gdb_continue_to_breakpoint "break-here" ".* break-here .*"
+
+proc do_save_core { filter_flag core ipid } {
+    verbose -log "writing $filter_flag to /proc/$ipid/coredump_filter"
+
+    remote_exec target "sh -c \"echo $filter_flag > /proc/$ipid/coredump_filter\""
+
+    # Generate a corefile
+    gdb_gcore_cmd "$core" "save corefile"
+}
+
+proc do_load_and_test_core { core var working_var working_value } {
+    global hex decimal addr
+
+    set core_loaded [gdb_core_cmd "$core" "load core"]
+    if { $core_loaded == -1 } {
+	fail "loading $core"
+	return
+    }
+
+    # The variables are 'char', and using it here would be OK because
+    # GDB actually reads the contents of the memory (i.e., it
+    # dereferences the pointer).  However, to make it clear that we
+    # are interested not in the pointer itself, but in the memory it
+    # points to, we are using '*(unsigned int *)'.
+    gdb_test "print *(unsigned int *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
+	"printing $var when core is loaded (should not work)"
+    gdb_test "print/x *(unsigned int *) $addr($working_var)" " = $working_value.*" \
+	"print/x *$working_var ( = $working_value)"
+}
+
+set non_private_anon_core [standard_output_file non-private-anon.gcore]
+set non_shared_anon_core [standard_output_file non-shared-anon.gcore]
+set dont_dump_core [standard_output_file dont-dump.gcore]
+
+# We will generate a few corefiles.
+#
+# This list is composed by sub-lists, and their elements are (in
+# order):
+#
+# - name of the test
+# - hexadecimal value to be put in the /proc/PID/coredump_filter file
+# - name of the variable that contains the name of the corefile to be
+#   generated (including the initial $).
+# - name of the variable in the C source code that points to the
+#   memory mapping that will NOT be present in the corefile.
+# - name of a variable in the C source code that points to a memory
+#   mapping that WILL be present in the corefile
+# - corresponding value expected for the above variable
+
+set all_corefiles { { "non-Private-Anonymous" "0x7e" \
+			  $non_private_anon_core \
+			  "private_anon" \
+			  "shared_anon" "0x22" }
+    { "non-Shared-Anonymous" "0x7d" \
+	  $non_shared_anon_core "shared_anon" \
+	  "private_anon" "0x11" }
+    { "DoNotDump" "0x33" \
+	  $dont_dump_core "dont_dump" \
+	  "shared_anon" "0x22" } }
+
+set core_supported [gdb_gcore_cmd "$non_private_anon_core" "save a corefile"]
+if { !$core_supported } {
+    untested "corefile generation is not supported"
+    return -1
+}
+
+# Getting the inferior's PID
+set infpid ""
+gdb_test_multiple "info inferiors" "getting inferior pid" {
+    -re "process \($decimal\).*\r\n$gdb_prompt $" {
+	set infpid $expect_out(1,string)
+    }
+}
+
+foreach item $all_corefiles {
+    foreach name [list [lindex $item 3] [lindex $item 4]] {
+	set test "print/x $name"
+	gdb_test_multiple $test $test {
+	    -re " = \($hex\)\r\n$gdb_prompt $" {
+		set addr($name) $expect_out(1,string)
+	    }
+	}
+    }
+}
+
+foreach item $all_corefiles {
+    with_test_prefix "saving corefile for [lindex $item 0]" {
+	do_save_core [lindex $item 1] [subst [lindex $item 2]] $infpid
+    }
+}
+
+clean_restart $testfile
+
+foreach item $all_corefiles {
+    with_test_prefix "loading and testing corefile for [lindex $item 0]" {
+	do_load_and_test_core [subst [lindex $item 2]] [lindex $item 3] \
+	    [lindex $item 4] [lindex $item 5]
+    }
+}
-- 
1.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter
  2015-03-19 23:22 [PATCH v4 0/2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Sergio Durigan Junior
@ 2015-03-19 23:22 ` Sergio Durigan Junior
  2015-03-20 19:12   ` Pedro Alves
  2015-03-19 23:22 ` [PATCH 2/2] Documentation and testcase Sergio Durigan Junior
  1 sibling, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-19 23:22 UTC (permalink / raw)
  To: GDB Patches; +Cc: Pedro Alves, Sergio Durigan Junior

This is the core patch.  It implements the support for honoring the
/proc/PID/coredump_filter file.

Changes from v3:

  - Removed the new 'enum memory_mapping_state'; now
    linux_find_memory_regions_full is either calling
    gcore_create_callback when the mapping should be dumped (with
    'modified = 1'), or not calling the function at all.

  - Reviewed comments.  Added new variable 'should_dump_p', and
    removed 'modified' from linux_find_memory_regions_full.

gdb/ChangeLog:
2015-03-19  Sergio Durigan Junior  <sergiodj@redhat.com>
	    Jan Kratochvil  <jan.kratochvil@redhat.com>
	    Oleg Nesterov  <oleg@redhat.com>

	PR corefiles/16092
	* linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
	New enum identifying the various options of the coredump_filter
	file.
	(struct smaps_vmflags): New struct.
	(use_coredump_filter): New variable.
	(decode_vmflags): New function.
	(mapping_is_anonymous_p): Likewise.
	(dump_mapping_p): Likewise.
	(linux_find_memory_regions_full): New variables
	'coredumpfilter_name', 'coredumpfilterdata', 'pid', 'filterflags',
	'should_dump_p'.  Removed variable 'modified'.  Read
	/proc/<PID>/smaps file; improve parsing of its information.
	Implement memory mapping filtering based on its contents.
	(show_use_coredump_filter): New function.
	(_initialize_linux_tdep): New command 'set use-coredump-filter'.
---
 gdb/linux-tdep.c | 408 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 380 insertions(+), 28 deletions(-)

diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index ea0d4cd..3831580 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -35,9 +35,61 @@
 #include "observer.h"
 #include "objfiles.h"
 #include "infcall.h"
+#include "gdbcmd.h"
+#include "gdb_regex.h"
 
 #include <ctype.h>
 
+/* This enum represents the values that the user can choose when
+   informing the Linux kernel about which memory mappings will be
+   dumped in a corefile.  They are described in the file
+   Documentation/filesystems/proc.txt, inside the Linux kernel
+   tree.  */
+
+enum
+  {
+    COREFILTER_ANON_PRIVATE = 1 << 0,
+    COREFILTER_ANON_SHARED = 1 << 1,
+    COREFILTER_MAPPED_PRIVATE = 1 << 2,
+    COREFILTER_MAPPED_SHARED = 1 << 3,
+    COREFILTER_ELF_HEADERS = 1 << 4,
+    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
+    COREFILTER_HUGETLB_SHARED = 1 << 6,
+  };
+
+/* This struct is used to map flags found in the "VmFlags:" field (in
+   the /proc/<PID>/smaps file).  */
+
+struct smaps_vmflags
+  {
+    /* Zero if this structure has not been initialized yet.  It
+       probably means that the Linux kernel being used does not emit
+       the "VmFlags:" field on "/proc/PID/smaps".  */
+
+    unsigned int initialized_p : 1;
+
+    /* Memory mapped I/O area (VM_IO, "io").  */
+
+    unsigned int io_page : 1;
+
+    /* Area uses huge TLB pages (VM_HUGETLB, "ht").  */
+
+    unsigned int uses_huge_tlb : 1;
+
+    /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd").  */
+
+    unsigned int exclude_coredump : 1;
+
+    /* Is this a MAP_SHARED mapping (VM_SHARED, "sh").  */
+
+    unsigned int shared_mapping : 1;
+  };
+
+/* Whether to take the /proc/PID/coredump_filter into account when
+   generating a corefile.  */
+
+static int use_coredump_filter = 1;
+
 /* This enum represents the signals' numbers on a generic architecture
    running the Linux kernel.  The definition of "generic" comes from
    the file <include/uapi/asm-generic/signal.h>, from the Linux kernel
@@ -381,6 +433,220 @@ read_mapping (const char *line,
   *filename = p;
 }
 
+/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
+
+   This function was based on the documentation found on
+   <Documentation/filesystems/proc.txt>, on the Linux kernel.
+
+   Linux kernels before commit
+   834f82e2aa9a8ede94b17b656329f850c1471514 do not have this field on
+   smaps.  */
+
+static void
+decode_vmflags (char *p, struct smaps_vmflags *v)
+{
+  char *saveptr;
+  const char *s;
+
+  v->initialized_p = 1;
+  p = skip_to_space (p);
+  p = skip_spaces (p);
+
+  for (s = strtok_r (p, " ", &saveptr);
+       s != NULL;
+       s = strtok_r (NULL, " ", &saveptr))
+    {
+      if (strcmp (s, "io") == 0)
+	v->io_page = 1;
+      else if (strcmp (s, "ht") == 0)
+	v->uses_huge_tlb = 1;
+      else if (strcmp (s, "dd") == 0)
+	v->exclude_coredump = 1;
+      else if (strcmp (s, "sh") == 0)
+	v->shared_mapping = 1;
+    }
+}
+
+/* Return 1 if the memory mapping is anonymous, 0 otherwise.
+
+   FILENAME is the name of the file present in the first line of the
+   memory mapping, in the "/proc/PID/smaps" output.  For example, if
+   the first line is:
+
+   7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770   /path/to/file
+
+   Then FILENAME will be "/path/to/file".  */
+
+static int
+mapping_is_anonymous_p (const char *filename)
+{
+  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
+  static int init_regex_p = 0;
+
+  if (!init_regex_p)
+    {
+      struct cleanup *c = make_cleanup (null_cleanup, NULL);
+
+      /* Let's be pessimistic and assume there will be an error while
+	 compiling the regex'es.  */
+      init_regex_p = -1;
+
+      /* DEV_ZERO_REGEX matches "/dev/zero" filenames (with or
+	 without the "(deleted)" string in the end).  We know for
+	 sure, based on the Linux kernel code, that memory mappings
+	 whose associated filename is "/dev/zero" are guaranteed to be
+	 MAP_ANONYMOUS.  */
+      compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match /dev/zero "
+			     "filename"));
+      /* SHMEM_FILE_REGEX matches "/SYSV%08x" filenames (with or
+	 without the "(deleted)" string in the end).  These filenames
+	 refer to shared memory (shmem), and memory mappings
+	 associated with them are MAP_ANONYMOUS as well.  */
+      compile_rx_or_error (&shmem_file_regex,
+			   "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match shmem "
+			     "filenames"));
+      /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the
+	 Linux kernel's 'n_link == 0' code, which is responsible to
+	 decide if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS'
+	 mapping.  In other words, if FILE_DELETED_REGEX matches, it
+	 does not necessarily mean that we are dealing with an
+	 anonymous shared mapping.  However, there is no easy way to
+	 detect this currently, so this is the best approximation we
+	 have.
+
+	 As a result, GDB will dump readonly pages of deleted
+	 executables when using the default value of coredump_filter
+	 (0x33), while the Linux kernel will not dump those pages.
+	 But we can live with that.  */
+      compile_rx_or_error (&file_deleted_regex, " (deleted)$",
+			   _("Could not compile regex to match "
+			     "'<file> (deleted)'"));
+      /* We will never release these regexes, so just discard the
+	 cleanups.  */
+      discard_cleanups (c);
+
+      /* If we reached this point, then everything succeeded.  */
+      init_regex_p = 1;
+    }
+
+  if (init_regex_p == -1)
+    {
+      /* There was an error while compiling the regex'es above.  In
+	 order to try to give some reliable information to the caller,
+	 we just try to find the string "(deleted)" in the filename.
+	 If we managed to find it, then we assume the mapping is
+	 anonymous.  */
+      return strstr (filename, "(deleted)") != NULL;
+    }
+
+  if (*filename == '\0'
+      || regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
+      || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
+      || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0)
+    return 1;
+
+  return 0;
+}
+
+/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
+   MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or
+   greater than 0 if it should.
+
+   In a nutshell, this is the logic that we follow in order to decide
+   if a mapping should be dumped or not.
+
+   - If the mapping is associated to a file whose name ends with "
+     (deleted)", or if the file is "/dev/zero", or if it is
+     "/SYSV%08x" (shared memory), or if there is no file associated
+     with it, or if the AnonHugePages: or the Anonymous: fields in the
+     /proc/PID/smaps have contents, then GDB considers this mapping to
+     be anonymous.  Otherwise, GDB considers this mapping to be a
+     file-backed mapping (because there will be a file associated with
+     it).
+ 
+     It is worth mentioning that, from all those checks described
+     above, the most fragile is the one to see if the file name ends
+     with " (deleted)".  This does not necessarily mean that the
+     mapping is anonymous, because the deleted file associated with
+     the mapping may have been a hard link to another file, for
+     example.  The Linux kernel checks to see if "i_nlink == 0", but
+     GDB cannot easily do this check.  Therefore, we made a compromise
+     here, and we assume that if the file name ends with " (deleted)",
+     then the mapping is indeed anonymous.  FWIW, this is something
+     the Linux kernel could do better: expose this information in a
+     more direct way.
+ 
+   - If we see the flag "sh" in the "VmFlags:" field (in
+     /proc/PID/smaps), then certainly the memory mapping is shared
+     (VM_SHARED).  If we have access to the VmFlags, and we don't see
+     the "sh" there, then certainly the mapping is private.  However,
+     Linux kernels before commit
+     834f82e2aa9a8ede94b17b656329f850c1471514 do not have the
+     "VmFlags:" field; in that case, we use another heuristic: if we
+     see 'p' in the permission flags, then we assume that the mapping
+     is private, even though the presence of the 's' flag there would
+     mean VM_MAYSHARE, which means the mapping could still be private.
+     This should work OK enough, however.  */
+
+static int
+dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
+		int maybe_private_p, int mapping_anon_p, const char *filename)
+{
+  /* Initially, we trust in what we received from outside.  This value
+     may not be very precise (i.e., it was probably gathered from the
+     permission line in the /proc/PID/smaps list, which actually
+     refers to VM_MAYSHARE, and not VM_SHARED), but it is what we have
+     for now.  */
+  int private_p = maybe_private_p;
+
+  /* We always dump vDSO and vsyscall mappings.  */
+  if (strcmp ("[vdso]", filename) == 0
+      || strcmp ("[vsyscall]", filename) == 0)
+    return 1;
+
+  if (v->initialized_p)
+    {
+      /* We never dump I/O mappings.  */
+      if (v->io_page)
+	return 0;
+
+      /* Check if we should exclude this mapping.  */
+      if (v->exclude_coredump)
+	return 0;
+
+      /* Update our notion of whether this mapping is shared or
+	 private based on a trustworthy value.  */
+      private_p = !v->shared_mapping;
+
+      /* HugeTLB checking.  */
+      if (v->uses_huge_tlb)
+	{
+	  if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE))
+	      || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED)))
+	    return 1;
+
+	  return 0;
+	}
+    }
+
+  if (private_p)
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_PRIVATE) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_PRIVATE) != 0;
+    }
+  else
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_SHARED) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_SHARED) != 0;
+    }
+}
+
 /* Implement the "info proc" command.  */
 
 static void
@@ -819,48 +1085,86 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 				void *obfd)
 {
   char mapsfilename[100];
-  char *data;
+  char coredumpfilter_name[100];
+  char *data, *coredumpfilterdata;
+  pid_t pid;
+  /* Default dump behavior of coredump_filter (0x33), according to
+     Documentation/filesystems/proc.txt from the Linux kernel
+     tree.  */
+  unsigned int filterflags = (COREFILTER_ANON_PRIVATE
+			      | COREFILTER_ANON_SHARED
+			      | COREFILTER_ELF_HEADERS
+			      | COREFILTER_HUGETLB_PRIVATE);
 
   /* We need to know the real target PID to access /proc.  */
   if (current_inferior ()->fake_pid_p)
     return 1;
 
-  xsnprintf (mapsfilename, sizeof mapsfilename,
-	     "/proc/%d/smaps", current_inferior ()->pid);
+  pid = current_inferior ()->pid;
+
+  if (use_coredump_filter)
+    {
+      xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name),
+		 "/proc/%d/coredump_filter", pid);
+      coredumpfilterdata = target_fileio_read_stralloc (coredumpfilter_name);
+      if (coredumpfilterdata != NULL)
+	{
+	  sscanf (coredumpfilterdata, "%x", &filterflags);
+	  xfree (coredumpfilterdata);
+	}
+    }
+
+  xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid);
   data = target_fileio_read_stralloc (mapsfilename);
   if (data == NULL)
     {
       /* Older Linux kernels did not support /proc/PID/smaps.  */
-      xsnprintf (mapsfilename, sizeof mapsfilename,
-		 "/proc/%d/maps", current_inferior ()->pid);
+      xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid);
       data = target_fileio_read_stralloc (mapsfilename);
     }
-  if (data)
+
+  if (data != NULL)
     {
       struct cleanup *cleanup = make_cleanup (xfree, data);
-      char *line;
+      char *line, *t;
 
-      line = strtok (data, "\n");
-      while (line)
+      line = strtok_r (data, "\n", &t);
+      while (line != NULL)
 	{
 	  ULONGEST addr, endaddr, offset, inode;
 	  const char *permissions, *device, *filename;
+	  struct smaps_vmflags v;
 	  size_t permissions_len, device_len;
-	  int read, write, exec;
-	  int modified = 0, has_anonymous = 0;
+	  int read, write, exec, private;
+	  int has_anonymous = 0;
+	  int should_dump_p = 0;
+	  int mapping_anon_p;
 
+	  memset (&v, 0, sizeof (v));
 	  read_mapping (line, &addr, &endaddr, &permissions, &permissions_len,
 			&offset, &device, &device_len, &inode, &filename);
+	  mapping_anon_p = mapping_is_anonymous_p (filename);
 
 	  /* Decode permissions.  */
 	  read = (memchr (permissions, 'r', permissions_len) != 0);
 	  write = (memchr (permissions, 'w', permissions_len) != 0);
 	  exec = (memchr (permissions, 'x', permissions_len) != 0);
-
-	  /* Try to detect if region was modified by parsing smaps counters.  */
-	  for (line = strtok (NULL, "\n");
-	       line && line[0] >= 'A' && line[0] <= 'Z';
-	       line = strtok (NULL, "\n"))
+	  /* 'private' here actually means VM_MAYSHARE, and not
+	     VM_SHARED.  In order to know if a mapping is really
+	     private or not, we must check the flag "sh" in the
+	     VmFlags field.  This is done by decode_vmflags.  However,
+	     if we are using a Linux kernel released before the commit
+	     834f82e2aa9a8ede94b17b656329f850c1471514, we will not
+	     have the VmFlags there.  In this case, there is really no
+	     way to know if we are dealing with VM_SHARED, so we just
+	     assume that VM_MAYSHARE is enough.  */
+	  private = memchr (permissions, 'p', permissions_len) != 0;
+
+	  /* Try to detect if region should be dumped by parsing smaps
+	     counters.  */
+	  for (line = strtok_r (NULL, "\n", &t);
+	       line != NULL && line[0] >= 'A' && line[0] <= 'Z';
+	       line = strtok_r (NULL, "\n", &t))
 	    {
 	      char keyword[64 + 1];
 
@@ -869,11 +1173,17 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 		  warning (_("Error parsing {s,}maps file '%s'"), mapsfilename);
 		  break;
 		}
+
 	      if (strcmp (keyword, "Anonymous:") == 0)
-		has_anonymous = 1;
-	      if (strcmp (keyword, "Shared_Dirty:") == 0
-		  || strcmp (keyword, "Private_Dirty:") == 0
-		  || strcmp (keyword, "Swap:") == 0
+		{
+		  /* Older Linux kernels did not support the
+		     "Anonymous:" counter.  Check it here.  */
+		  has_anonymous = 1;
+		}
+	      else if (strcmp (keyword, "VmFlags:") == 0)
+		decode_vmflags (line, &v);
+
+	      if (strcmp (keyword, "AnonHugePages:") == 0
 		  || strcmp (keyword, "Anonymous:") == 0)
 		{
 		  unsigned long number;
@@ -884,19 +1194,38 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 			       mapsfilename);
 		      break;
 		    }
-		  if (number != 0)
-		    modified = 1;
+		  if (number > 0)
+		    {
+		      /* Even if we are dealing with a file-backed
+			 mapping, if it contains anonymous pages we
+			 consider it to be an anonymous mapping,
+			 because this is what the Linux kernel does:
+
+			 // Dump segments that have been written to.
+			 if (vma->anon_vma && FILTER(ANON_PRIVATE))
+			 	goto whole;
+		      */
+		      mapping_anon_p = 1;
+		    }
 		}
 	    }
 
-	  /* Older Linux kernels did not support the "Anonymous:" counter.
-	     If it is missing, we can't be sure - dump all the pages.  */
-	  if (!has_anonymous)
-	    modified = 1;
+	  if (has_anonymous)
+	    should_dump_p = dump_mapping_p (filterflags, &v, private,
+					    mapping_anon_p, filename);
+	  else
+	    {
+	      /* Older Linux kernels did not support the "Anonymous:" counter.
+		 If it is missing, we can't be sure - dump all the pages.  */
+	      should_dump_p = 1;
+	    }
 
 	  /* Invoke the callback function to create the corefile segment.  */
-	  func (addr, endaddr - addr, offset, inode,
-		read, write, exec, modified, filename, obfd);
+	  if (should_dump_p)
+	    func (addr, endaddr - addr, offset, inode,
+		  read, write, exec, 1, /* MODIFIED is true because we
+					   want to dump the mapping.  */
+		  filename, obfd);
 	}
 
       do_cleanups (cleanup);
@@ -1972,6 +2301,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot)
   return retval;
 }
 
+/* Display whether the gcore command is using the
+   /proc/PID/coredump_filter file.  */
+
+static void
+show_use_coredump_filter (struct ui_file *file, int from_tty,
+			  struct cmd_list_element *c, const char *value)
+{
+  fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to generate"
+			    " corefiles is %s.\n"), value);
+}
+
 /* To be called from the various GDB_OSABI_LINUX handlers for the
    various GNU/Linux architectures and machine types.  */
 
@@ -2008,4 +2348,16 @@ _initialize_linux_tdep (void)
   /* Observers used to invalidate the cache when needed.  */
   observer_attach_inferior_exit (invalidate_linux_cache_inf);
   observer_attach_inferior_appeared (invalidate_linux_cache_inf);
+
+  add_setshow_boolean_cmd ("use-coredump-filter", class_files,
+			   &use_coredump_filter, _("\
+Set whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Show whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Use this command to set whether gcore should consider the contents\n\
+of /proc/PID/coredump_filter when generating the corefile.  For more information\n\
+about this file, refer to the manpage of core(5)."),
+			   NULL, show_use_coredump_filter,
+			   &setlist, &showlist);
 }
-- 
1.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter
  2015-03-19 23:22 ` [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter Sergio Durigan Junior
@ 2015-03-20 19:12   ` Pedro Alves
  2015-03-20 20:02     ` Sergio Durigan Junior
  0 siblings, 1 reply; 20+ messages in thread
From: Pedro Alves @ 2015-03-20 19:12 UTC (permalink / raw)
  To: Sergio Durigan Junior, GDB Patches

Thanks, this looks good to me now, just a few minor nits.

On 03/19/2015 11:22 PM, Sergio Durigan Junior wrote:

> +/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
> +
> +   This function was based on the documentation found on
> +   <Documentation/filesystems/proc.txt>, on the Linux kernel.
> +
> +   Linux kernels before commit
> +   834f82e2aa9a8ede94b17b656329f850c1471514 do not have this field on
> +   smaps.  */

Thanks for the version info.  Ideally, what I'd like with these version notes
is at any point in the time in the future being able to look at the code
and easily identify whether we're working around an old enough kernel that
we potentially can stop worrying about backwards compatibility with
such kernel.  If easy, could you also add the kernel version, like
e.g., "$hash (x.y.z)"?

> +  if (init_regex_p == -1)
> +    {
> +      /* There was an error while compiling the regex'es above.  In
> +	 order to try to give some reliable information to the caller,
> +	 we just try to find the string "(deleted)" in the filename.
> +	 If we managed to find it, then we assume the mapping is
> +	 anonymous.  */
> +      return strstr (filename, "(deleted)") != NULL;

Might as well add the space before parens, and match tail
end only, like e.g.,:

   const char deleted[] = " (deleted)";
   size_t del_len = strlen (deleted);
   size_t filename_len = strlen (filename);

   return (filename_len >= del_len
           && strcmp (filename + filename_len - del_len, deleted) == 0);

> +
> +/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
> +   MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or
> +   greater than 0 if it should.
> +
> +   In a nutshell, this is the logic that we follow in order to decide
> +   if a mapping should be dumped or not.
> +
> +   - If the mapping is associated to a file whose name ends with "
> +     (deleted)"

Nit: it'd be good to avoid breaking " (deleted)" over two lines.

        > , or if the file is "/dev/zero", or if it is
> +     "/SYSV%08x" (shared memory), or if there is no file associated
> +     with it, or if the AnonHugePages: or the Anonymous: fields in the
> +     /proc/PID/smaps have contents, then GDB considers this mapping to
> +     be anonymous.  Otherwise, GDB considers this mapping to be a
> +     file-backed mapping (because there will be a file associated with
> +     it).
> + 
> +     It is worth mentioning that, from all those checks described
> +     above, the most fragile is the one to see if the file name ends
> +     with " (deleted)".  This does not necessarily mean that the
> +     mapping is anonymous, because the deleted file associated with
> +     the mapping may have been a hard link to another file, for
> +     example.  The Linux kernel checks to see if "i_nlink == 0", but
> +     GDB cannot easily do this check.  Therefore, we made a compromise

Cannot do it easily, or cannot do it at all?

> +     here, and we assume that if the file name ends with " (deleted)",
> +     then the mapping is indeed anonymous.  FWIW, this is something
> +     the Linux kernel could do better: expose this information in a
> +     more direct way.
> + 
> +   - If we see the flag "sh" in the "VmFlags:" field (in
> +     /proc/PID/smaps), then certainly the memory mapping is shared
> +     (VM_SHARED).  If we have access to the VmFlags, and we don't see
> +     the "sh" there, then certainly the mapping is private.  However,
> +     Linux kernels before commit
> +     834f82e2aa9a8ede94b17b656329f850c1471514 do not have the
> +     "VmFlags:" field; in that case, we use another heuristic: if we
> +     see 'p' in the permission flags, then we assume that the mapping
> +     is private, even though the presence of the 's' flag there would
> +     mean VM_MAYSHARE, which means the mapping could still be private.
> +     This should work OK enough, however.  */
> +
> +static int
> +dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
> +		int maybe_private_p, int mapping_anon_p, const char *filename)
> +{
> +  /* Initially, we trust in what we received from outside.  This value

outside == the caller?  or the kernel?  The former, right?

> +     may not be very precise (i.e., it was probably gathered from the
> +     permission line in the /proc/PID/smaps list, which actually
> +     refers to VM_MAYSHARE, and not VM_SHARED), but it is what we have
> +     for now.  */

"for now" refers to the flag being updated further down, or future
kernel versions?

> +  int private_p = maybe_private_p;
> +
> +  /* We always dump vDSO and vsyscall mappings.  */

Suggest:

   /* We always dump vDSO and vsyscall mappings, because it's likely that
      there'll be no file to read the contents from at core load time.
      The kernel does the same.  */

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-19 23:22 ` [PATCH 2/2] Documentation and testcase Sergio Durigan Junior
@ 2015-03-20 19:46   ` Pedro Alves
  2015-03-20 21:03     ` Sergio Durigan Junior
  0 siblings, 1 reply; 20+ messages in thread
From: Pedro Alves @ 2015-03-20 19:46 UTC (permalink / raw)
  To: Sergio Durigan Junior, GDB Patches

On 03/19/2015 11:22 PM, Sergio Durigan Junior wrote:

> ---
>  gdb/doc/gdb.texinfo                        |  33 ++++++++
>  gdb/testsuite/gdb.base/coredump-filter.c   |  61 ++++++++++++++
>  gdb/testsuite/gdb.base/coredump-filter.exp | 128 +++++++++++++++++++++++++++++
>  3 files changed, 222 insertions(+)
>  create mode 100644 gdb/testsuite/gdb.base/coredump-filter.c
>  create mode 100644 gdb/testsuite/gdb.base/coredump-filter.exp
> 
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 552da31..5382e91 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -10952,6 +10952,39 @@ specified, the file name defaults to @file{core.@var{pid}}, where
>  
>  Note that this command is implemented only for some systems (as of
>  this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
> +
> +On @sc{gnu}/Linux, this command can take into account the value of the
> +file @file{/proc/@var{pid}/coredump_filter} when generating the core
> +dump (@pxref{set use-coredump-filter}).
> +
> +@kindex set use-coredump-filter
> +@anchor{set use-coredump-filter}
> +@item set use-coredump-filter on
> +@itemx set use-coredump-filter off
> +Enable or disable the use of the file
> +@file{/proc/@var{pid}/coredump_filter} when generating core dump
> +files.  This file is used by the Linux kernel to decide what types of
> +memory mappings will be dumped or ignored when generating a core dump
> +file.  @var{pid} is the process ID of a currently running process.
> +
> +
> +To make use of this feature, you have to write in the
> +@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
> +which is a bit mask representing the memory mapping types.  If a bit
> +is set in the bit mask, then the memory mappings of the corresponding
> +types will be dumped; otherwise, they will be ignored.  For more
> +information about the bits that can be set in the
> +@file{/proc/@var{pid}/coredump_filter} file, please refer to the
> +manpage of @code{core(5)}.

Might be good to mention that the settings are inherited by child
processes.  Reading this, I thought "wow, do I really need to
set every time I'm debugging a new pid/process?"

> +    # The variables are 'char', and using it here would be OK because
> +    # GDB actually reads the contents of the memory (i.e., it
> +    # dereferences the pointer).  However, to make it clear that we
> +    # are interested not in the pointer itself, but in the memory it
> +    # points to, we are using '*(unsigned int *)'.
> +    gdb_test "print *(unsigned int *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
> +	"printing $var when core is loaded (should not work)"
> +    gdb_test "print/x *(unsigned int *) $addr($working_var)" " = $working_value.*" \
> +	"print/x *$working_var ( = $working_value)"

This comment still gave me pause.  The variables are
'char *' not 'char':

  char *private_anon, *shared_anon;
  char *dont_dump;

so I guess you're referring to the issue that plain "print" would
assume they are strings and thus deference the pointer, right?

I honestly think that all that just distracts from what
we're doing.  Why not just:

   # Access the memory the addresses point to.
   gdb_test "print *(char *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \

I would never ever think to do:

   gdb_test "print (char *) $addr($var)"

to test the contents of what addr points to.  IOW, reading

   # Access the memory the addresses point to.
   gdb_test "print *(char *) $addr($var)" ...

I'd never really wonder why the leftmost '*' is in there.  It's super
obvious.

Maybe even throw in an /x for super clarity:

   gdb_test "print /x *(char *) $addr($var)" ...


> +set all_corefiles { { "non-Private-Anonymous" "0x7e" \
> +			  $non_private_anon_core \
> +			  "private_anon" \
> +			  "shared_anon" "0x22" }
> +    { "non-Shared-Anonymous" "0x7d" \
> +	  $non_shared_anon_core "shared_anon" \
> +	  "private_anon" "0x11" }
> +    { "DoNotDump" "0x33" \
> +	  $dont_dump_core "dont_dump" \
> +	  "shared_anon" "0x22" } }

Does this cover the case of making sure we don't dump file-based
regions?  That's important.

If not (I assume not), we could test that by loading the core
into gdb, but _not_ the program, and then disassembling a function's
address.  It should fail.  Then load the program and disassemble
again.  It should work now.  Or something along those lines.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter
  2015-03-20 19:12   ` Pedro Alves
@ 2015-03-20 20:02     ` Sergio Durigan Junior
  2015-03-20 22:02       ` Pedro Alves
  0 siblings, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-20 20:02 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Friday, March 20 2015, Pedro Alves wrote:

> Thanks, this looks good to me now, just a few minor nits.

Thanks for the review again!

> On 03/19/2015 11:22 PM, Sergio Durigan Junior wrote:
>
>> +/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
>> +
>> +   This function was based on the documentation found on
>> +   <Documentation/filesystems/proc.txt>, on the Linux kernel.
>> +
>> +   Linux kernels before commit
>> +   834f82e2aa9a8ede94b17b656329f850c1471514 do not have this field on
>> +   smaps.  */
>
> Thanks for the version info.  Ideally, what I'd like with these version notes
> is at any point in the time in the future being able to look at the code
> and easily identify whether we're working around an old enough kernel that
> we potentially can stop worrying about backwards compatibility with
> such kernel.  If easy, could you also add the kernel version, like
> e.g., "$hash (x.y.z)"?

Yeah.  'git tag --contains 834f82e2aa9a8ede94b17b656329f850c1471514'
shows me that the first release to contain this commit was 3.10.  So I'm
putting it there.

>> +  if (init_regex_p == -1)
>> +    {
>> +      /* There was an error while compiling the regex'es above.  In
>> +	 order to try to give some reliable information to the caller,
>> +	 we just try to find the string "(deleted)" in the filename.
>> +	 If we managed to find it, then we assume the mapping is
>> +	 anonymous.  */
>> +      return strstr (filename, "(deleted)") != NULL;
>
> Might as well add the space before parens, and match tail
> end only, like e.g.,:
>
>    const char deleted[] = " (deleted)";
>    size_t del_len = strlen (deleted);
>    size_t filename_len = strlen (filename);
>
>    return (filename_len >= del_len
>            && strcmp (filename + filename_len - del_len, deleted) == 0);

Done.

>> +
>> +/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
>> +   MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or
>> +   greater than 0 if it should.
>> +
>> +   In a nutshell, this is the logic that we follow in order to decide
>> +   if a mapping should be dumped or not.
>> +
>> +   - If the mapping is associated to a file whose name ends with "
>> +     (deleted)"
>
> Nit: it'd be good to avoid breaking " (deleted)" over two lines.

Fixed.

>         > , or if the file is "/dev/zero", or if it is
>> +     "/SYSV%08x" (shared memory), or if there is no file associated
>> +     with it, or if the AnonHugePages: or the Anonymous: fields in the
>> +     /proc/PID/smaps have contents, then GDB considers this mapping to
>> +     be anonymous.  Otherwise, GDB considers this mapping to be a
>> +     file-backed mapping (because there will be a file associated with
>> +     it).
>> + 
>> +     It is worth mentioning that, from all those checks described
>> +     above, the most fragile is the one to see if the file name ends
>> +     with " (deleted)".  This does not necessarily mean that the
>> +     mapping is anonymous, because the deleted file associated with
>> +     the mapping may have been a hard link to another file, for
>> +     example.  The Linux kernel checks to see if "i_nlink == 0", but
>> +     GDB cannot easily do this check.  Therefore, we made a compromise
>
> Cannot do it easily, or cannot do it at all?

It cannot do it easily.  If GDB is run by root, then it could (in
theory) inspect the contents of /proc/PID/map_files and determine
whether the " (deleted)" file was a hard link or not.

>> +     here, and we assume that if the file name ends with " (deleted)",
>> +     then the mapping is indeed anonymous.  FWIW, this is something
>> +     the Linux kernel could do better: expose this information in a
>> +     more direct way.
>> + 
>> +   - If we see the flag "sh" in the "VmFlags:" field (in
>> +     /proc/PID/smaps), then certainly the memory mapping is shared
>> +     (VM_SHARED).  If we have access to the VmFlags, and we don't see
>> +     the "sh" there, then certainly the mapping is private.  However,
>> +     Linux kernels before commit
>> +     834f82e2aa9a8ede94b17b656329f850c1471514 do not have the
>> +     "VmFlags:" field; in that case, we use another heuristic: if we
>> +     see 'p' in the permission flags, then we assume that the mapping
>> +     is private, even though the presence of the 's' flag there would
>> +     mean VM_MAYSHARE, which means the mapping could still be private.
>> +     This should work OK enough, however.  */
>> +
>> +static int
>> +dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
>> +		int maybe_private_p, int mapping_anon_p, const char *filename)
>> +{
>> +  /* Initially, we trust in what we received from outside.  This value
>
> outside == the caller?  or the kernel?  The former, right?

Yes, the former.  I improved the comment.

>> +     may not be very precise (i.e., it was probably gathered from the
>> +     permission line in the /proc/PID/smaps list, which actually
>> +     refers to VM_MAYSHARE, and not VM_SHARED), but it is what we have
>> +     for now.  */
>
> "for now" refers to the flag being updated further down, or future
> kernel versions?

The former.  I improved the comment.

>> +  int private_p = maybe_private_p;
>> +
>> +  /* We always dump vDSO and vsyscall mappings.  */
>
> Suggest:
>
>    /* We always dump vDSO and vsyscall mappings, because it's likely that
>       there'll be no file to read the contents from at core load time.
>       The kernel does the same.  */

Right, improved the comment with you suggestion.

> Thanks,
> Pedro Alves

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

gdb/ChangeLog:
2015-03-20  Sergio Durigan Junior  <sergiodj@redhat.com>
	    Jan Kratochvil  <jan.kratochvil@redhat.com>
	    Oleg Nesterov  <oleg@redhat.com>

	PR corefiles/16092
	* linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
	New enum identifying the various options of the coredump_filter
	file.
	(struct smaps_vmflags): New struct.
	(use_coredump_filter): New variable.
	(decode_vmflags): New function.
	(mapping_is_anonymous_p): Likewise.
	(dump_mapping_p): Likewise.
	(linux_find_memory_regions_full): New variables
	'coredumpfilter_name', 'coredumpfilterdata', 'pid', 'filterflags',
	'should_dump_p'.  Removed variable 'modified'.  Read
	/proc/<PID>/smaps file; improve parsing of its information.
	Implement memory mapping filtering based on its contents.
	(show_use_coredump_filter): New function.
	(_initialize_linux_tdep): New command 'set use-coredump-filter'.

diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index ea0d4cd..908c42c 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -35,9 +35,61 @@
 #include "observer.h"
 #include "objfiles.h"
 #include "infcall.h"
+#include "gdbcmd.h"
+#include "gdb_regex.h"
 
 #include <ctype.h>
 
+/* This enum represents the values that the user can choose when
+   informing the Linux kernel about which memory mappings will be
+   dumped in a corefile.  They are described in the file
+   Documentation/filesystems/proc.txt, inside the Linux kernel
+   tree.  */
+
+enum
+  {
+    COREFILTER_ANON_PRIVATE = 1 << 0,
+    COREFILTER_ANON_SHARED = 1 << 1,
+    COREFILTER_MAPPED_PRIVATE = 1 << 2,
+    COREFILTER_MAPPED_SHARED = 1 << 3,
+    COREFILTER_ELF_HEADERS = 1 << 4,
+    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
+    COREFILTER_HUGETLB_SHARED = 1 << 6,
+  };
+
+/* This struct is used to map flags found in the "VmFlags:" field (in
+   the /proc/<PID>/smaps file).  */
+
+struct smaps_vmflags
+  {
+    /* Zero if this structure has not been initialized yet.  It
+       probably means that the Linux kernel being used does not emit
+       the "VmFlags:" field on "/proc/PID/smaps".  */
+
+    unsigned int initialized_p : 1;
+
+    /* Memory mapped I/O area (VM_IO, "io").  */
+
+    unsigned int io_page : 1;
+
+    /* Area uses huge TLB pages (VM_HUGETLB, "ht").  */
+
+    unsigned int uses_huge_tlb : 1;
+
+    /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd").  */
+
+    unsigned int exclude_coredump : 1;
+
+    /* Is this a MAP_SHARED mapping (VM_SHARED, "sh").  */
+
+    unsigned int shared_mapping : 1;
+  };
+
+/* Whether to take the /proc/PID/coredump_filter into account when
+   generating a corefile.  */
+
+static int use_coredump_filter = 1;
+
 /* This enum represents the signals' numbers on a generic architecture
    running the Linux kernel.  The definition of "generic" comes from
    the file <include/uapi/asm-generic/signal.h>, from the Linux kernel
@@ -381,6 +433,229 @@ read_mapping (const char *line,
   *filename = p;
 }
 
+/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
+
+   This function was based on the documentation found on
+   <Documentation/filesystems/proc.txt>, on the Linux kernel.
+
+   Linux kernels before commit
+   834f82e2aa9a8ede94b17b656329f850c1471514 (3.10) do not have this
+   field on smaps.  */
+
+static void
+decode_vmflags (char *p, struct smaps_vmflags *v)
+{
+  char *saveptr;
+  const char *s;
+
+  v->initialized_p = 1;
+  p = skip_to_space (p);
+  p = skip_spaces (p);
+
+  for (s = strtok_r (p, " ", &saveptr);
+       s != NULL;
+       s = strtok_r (NULL, " ", &saveptr))
+    {
+      if (strcmp (s, "io") == 0)
+	v->io_page = 1;
+      else if (strcmp (s, "ht") == 0)
+	v->uses_huge_tlb = 1;
+      else if (strcmp (s, "dd") == 0)
+	v->exclude_coredump = 1;
+      else if (strcmp (s, "sh") == 0)
+	v->shared_mapping = 1;
+    }
+}
+
+/* Return 1 if the memory mapping is anonymous, 0 otherwise.
+
+   FILENAME is the name of the file present in the first line of the
+   memory mapping, in the "/proc/PID/smaps" output.  For example, if
+   the first line is:
+
+   7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770   /path/to/file
+
+   Then FILENAME will be "/path/to/file".  */
+
+static int
+mapping_is_anonymous_p (const char *filename)
+{
+  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
+  static int init_regex_p = 0;
+
+  if (!init_regex_p)
+    {
+      struct cleanup *c = make_cleanup (null_cleanup, NULL);
+
+      /* Let's be pessimistic and assume there will be an error while
+	 compiling the regex'es.  */
+      init_regex_p = -1;
+
+      /* DEV_ZERO_REGEX matches "/dev/zero" filenames (with or
+	 without the "(deleted)" string in the end).  We know for
+	 sure, based on the Linux kernel code, that memory mappings
+	 whose associated filename is "/dev/zero" are guaranteed to be
+	 MAP_ANONYMOUS.  */
+      compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match /dev/zero "
+			     "filename"));
+      /* SHMEM_FILE_REGEX matches "/SYSV%08x" filenames (with or
+	 without the "(deleted)" string in the end).  These filenames
+	 refer to shared memory (shmem), and memory mappings
+	 associated with them are MAP_ANONYMOUS as well.  */
+      compile_rx_or_error (&shmem_file_regex,
+			   "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match shmem "
+			     "filenames"));
+      /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the
+	 Linux kernel's 'n_link == 0' code, which is responsible to
+	 decide if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS'
+	 mapping.  In other words, if FILE_DELETED_REGEX matches, it
+	 does not necessarily mean that we are dealing with an
+	 anonymous shared mapping.  However, there is no easy way to
+	 detect this currently, so this is the best approximation we
+	 have.
+
+	 As a result, GDB will dump readonly pages of deleted
+	 executables when using the default value of coredump_filter
+	 (0x33), while the Linux kernel will not dump those pages.
+	 But we can live with that.  */
+      compile_rx_or_error (&file_deleted_regex, " (deleted)$",
+			   _("Could not compile regex to match "
+			     "'<file> (deleted)'"));
+      /* We will never release these regexes, so just discard the
+	 cleanups.  */
+      discard_cleanups (c);
+
+      /* If we reached this point, then everything succeeded.  */
+      init_regex_p = 1;
+    }
+
+  if (init_regex_p == -1)
+    {
+      const char deleted[] = " (deleted)";
+      size_t del_len = sizeof (deleted) - 1;
+      size_t filename_len = strlen (filename);
+
+      /* There was an error while compiling the regex'es above.  In
+	 order to try to give some reliable information to the caller,
+	 we just try to find the string " (deleted)" in the filename.
+	 If we managed to find it, then we assume the mapping is
+	 anonymous.  */
+      return (filename_len >= del_len
+	      && strcmp (filename + filename_len - del_len, deleted) == 0);
+    }
+
+  if (*filename == '\0'
+      || regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
+      || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
+      || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0)
+    return 1;
+
+  return 0;
+}
+
+/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
+   MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or
+   greater than 0 if it should.
+
+   In a nutshell, this is the logic that we follow in order to decide
+   if a mapping should be dumped or not.
+
+   - If the mapping is associated to a file whose name ends with
+     " (deleted)", or if the file is "/dev/zero", or if it is
+     "/SYSV%08x" (shared memory), or if there is no file associated
+     with it, or if the AnonHugePages: or the Anonymous: fields in the
+     /proc/PID/smaps have contents, then GDB considers this mapping to
+     be anonymous.  Otherwise, GDB considers this mapping to be a
+     file-backed mapping (because there will be a file associated with
+     it).
+ 
+     It is worth mentioning that, from all those checks described
+     above, the most fragile is the one to see if the file name ends
+     with " (deleted)".  This does not necessarily mean that the
+     mapping is anonymous, because the deleted file associated with
+     the mapping may have been a hard link to another file, for
+     example.  The Linux kernel checks to see if "i_nlink == 0", but
+     GDB cannot easily do this check.  Therefore, we made a compromise
+     here, and we assume that if the file name ends with " (deleted)",
+     then the mapping is indeed anonymous.  FWIW, this is something
+     the Linux kernel could do better: expose this information in a
+     more direct way.
+ 
+   - If we see the flag "sh" in the "VmFlags:" field (in
+     /proc/PID/smaps), then certainly the memory mapping is shared
+     (VM_SHARED).  If we have access to the VmFlags, and we don't see
+     the "sh" there, then certainly the mapping is private.  However,
+     Linux kernels before commit
+     834f82e2aa9a8ede94b17b656329f850c1471514 (3.10) do not have the
+     "VmFlags:" field; in that case, we use another heuristic: if we
+     see 'p' in the permission flags, then we assume that the mapping
+     is private, even though the presence of the 's' flag there would
+     mean VM_MAYSHARE, which means the mapping could still be private.
+     This should work OK enough, however.  */
+
+static int
+dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
+		int maybe_private_p, int mapping_anon_p, const char *filename)
+{
+  /* Initially, we trust in what we received from our caller.  This
+     value may not be very precise (i.e., it was probably gathered
+     from the permission line in the /proc/PID/smaps list, which
+     actually refers to VM_MAYSHARE, and not VM_SHARED), but it is
+     what we have until we take a look at the "VmFlags:" field
+     (assuming that the version of the Linux kernel being used
+     supports it, of course).  */
+  int private_p = maybe_private_p;
+
+  /* We always dump vDSO and vsyscall mappings, because it's likely that
+     there'll be no file to read the contents from at core load time.
+     The kernel does the same.  */
+  if (strcmp ("[vdso]", filename) == 0
+      || strcmp ("[vsyscall]", filename) == 0)
+    return 1;
+
+  if (v->initialized_p)
+    {
+      /* We never dump I/O mappings.  */
+      if (v->io_page)
+	return 0;
+
+      /* Check if we should exclude this mapping.  */
+      if (v->exclude_coredump)
+	return 0;
+
+      /* Update our notion of whether this mapping is shared or
+	 private based on a trustworthy value.  */
+      private_p = !v->shared_mapping;
+
+      /* HugeTLB checking.  */
+      if (v->uses_huge_tlb)
+	{
+	  if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE))
+	      || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED)))
+	    return 1;
+
+	  return 0;
+	}
+    }
+
+  if (private_p)
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_PRIVATE) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_PRIVATE) != 0;
+    }
+  else
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_SHARED) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_SHARED) != 0;
+    }
+}
+
 /* Implement the "info proc" command.  */
 
 static void
@@ -819,48 +1094,86 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 				void *obfd)
 {
   char mapsfilename[100];
-  char *data;
+  char coredumpfilter_name[100];
+  char *data, *coredumpfilterdata;
+  pid_t pid;
+  /* Default dump behavior of coredump_filter (0x33), according to
+     Documentation/filesystems/proc.txt from the Linux kernel
+     tree.  */
+  unsigned int filterflags = (COREFILTER_ANON_PRIVATE
+			      | COREFILTER_ANON_SHARED
+			      | COREFILTER_ELF_HEADERS
+			      | COREFILTER_HUGETLB_PRIVATE);
 
   /* We need to know the real target PID to access /proc.  */
   if (current_inferior ()->fake_pid_p)
     return 1;
 
-  xsnprintf (mapsfilename, sizeof mapsfilename,
-	     "/proc/%d/smaps", current_inferior ()->pid);
+  pid = current_inferior ()->pid;
+
+  if (use_coredump_filter)
+    {
+      xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name),
+		 "/proc/%d/coredump_filter", pid);
+      coredumpfilterdata = target_fileio_read_stralloc (coredumpfilter_name);
+      if (coredumpfilterdata != NULL)
+	{
+	  sscanf (coredumpfilterdata, "%x", &filterflags);
+	  xfree (coredumpfilterdata);
+	}
+    }
+
+  xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid);
   data = target_fileio_read_stralloc (mapsfilename);
   if (data == NULL)
     {
       /* Older Linux kernels did not support /proc/PID/smaps.  */
-      xsnprintf (mapsfilename, sizeof mapsfilename,
-		 "/proc/%d/maps", current_inferior ()->pid);
+      xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid);
       data = target_fileio_read_stralloc (mapsfilename);
     }
-  if (data)
+
+  if (data != NULL)
     {
       struct cleanup *cleanup = make_cleanup (xfree, data);
-      char *line;
+      char *line, *t;
 
-      line = strtok (data, "\n");
-      while (line)
+      line = strtok_r (data, "\n", &t);
+      while (line != NULL)
 	{
 	  ULONGEST addr, endaddr, offset, inode;
 	  const char *permissions, *device, *filename;
+	  struct smaps_vmflags v;
 	  size_t permissions_len, device_len;
-	  int read, write, exec;
-	  int modified = 0, has_anonymous = 0;
+	  int read, write, exec, private;
+	  int has_anonymous = 0;
+	  int should_dump_p = 0;
+	  int mapping_anon_p;
 
+	  memset (&v, 0, sizeof (v));
 	  read_mapping (line, &addr, &endaddr, &permissions, &permissions_len,
 			&offset, &device, &device_len, &inode, &filename);
+	  mapping_anon_p = mapping_is_anonymous_p (filename);
 
 	  /* Decode permissions.  */
 	  read = (memchr (permissions, 'r', permissions_len) != 0);
 	  write = (memchr (permissions, 'w', permissions_len) != 0);
 	  exec = (memchr (permissions, 'x', permissions_len) != 0);
-
-	  /* Try to detect if region was modified by parsing smaps counters.  */
-	  for (line = strtok (NULL, "\n");
-	       line && line[0] >= 'A' && line[0] <= 'Z';
-	       line = strtok (NULL, "\n"))
+	  /* 'private' here actually means VM_MAYSHARE, and not
+	     VM_SHARED.  In order to know if a mapping is really
+	     private or not, we must check the flag "sh" in the
+	     VmFlags field.  This is done by decode_vmflags.  However,
+	     if we are using a Linux kernel released before the commit
+	     834f82e2aa9a8ede94b17b656329f850c1471514 (3.10), we will
+	     not have the VmFlags there.  In this case, there is
+	     really no way to know if we are dealing with VM_SHARED,
+	     so we just assume that VM_MAYSHARE is enough.  */
+	  private = memchr (permissions, 'p', permissions_len) != 0;
+
+	  /* Try to detect if region should be dumped by parsing smaps
+	     counters.  */
+	  for (line = strtok_r (NULL, "\n", &t);
+	       line != NULL && line[0] >= 'A' && line[0] <= 'Z';
+	       line = strtok_r (NULL, "\n", &t))
 	    {
 	      char keyword[64 + 1];
 
@@ -869,11 +1182,17 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 		  warning (_("Error parsing {s,}maps file '%s'"), mapsfilename);
 		  break;
 		}
+
 	      if (strcmp (keyword, "Anonymous:") == 0)
-		has_anonymous = 1;
-	      if (strcmp (keyword, "Shared_Dirty:") == 0
-		  || strcmp (keyword, "Private_Dirty:") == 0
-		  || strcmp (keyword, "Swap:") == 0
+		{
+		  /* Older Linux kernels did not support the
+		     "Anonymous:" counter.  Check it here.  */
+		  has_anonymous = 1;
+		}
+	      else if (strcmp (keyword, "VmFlags:") == 0)
+		decode_vmflags (line, &v);
+
+	      if (strcmp (keyword, "AnonHugePages:") == 0
 		  || strcmp (keyword, "Anonymous:") == 0)
 		{
 		  unsigned long number;
@@ -884,19 +1203,38 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 			       mapsfilename);
 		      break;
 		    }
-		  if (number != 0)
-		    modified = 1;
+		  if (number > 0)
+		    {
+		      /* Even if we are dealing with a file-backed
+			 mapping, if it contains anonymous pages we
+			 consider it to be an anonymous mapping,
+			 because this is what the Linux kernel does:
+
+			 // Dump segments that have been written to.
+			 if (vma->anon_vma && FILTER(ANON_PRIVATE))
+			 	goto whole;
+		      */
+		      mapping_anon_p = 1;
+		    }
 		}
 	    }
 
-	  /* Older Linux kernels did not support the "Anonymous:" counter.
-	     If it is missing, we can't be sure - dump all the pages.  */
-	  if (!has_anonymous)
-	    modified = 1;
+	  if (has_anonymous)
+	    should_dump_p = dump_mapping_p (filterflags, &v, private,
+					    mapping_anon_p, filename);
+	  else
+	    {
+	      /* Older Linux kernels did not support the "Anonymous:" counter.
+		 If it is missing, we can't be sure - dump all the pages.  */
+	      should_dump_p = 1;
+	    }
 
 	  /* Invoke the callback function to create the corefile segment.  */
-	  func (addr, endaddr - addr, offset, inode,
-		read, write, exec, modified, filename, obfd);
+	  if (should_dump_p)
+	    func (addr, endaddr - addr, offset, inode,
+		  read, write, exec, 1, /* MODIFIED is true because we
+					   want to dump the mapping.  */
+		  filename, obfd);
 	}
 
       do_cleanups (cleanup);
@@ -1972,6 +2310,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot)
   return retval;
 }
 
+/* Display whether the gcore command is using the
+   /proc/PID/coredump_filter file.  */
+
+static void
+show_use_coredump_filter (struct ui_file *file, int from_tty,
+			  struct cmd_list_element *c, const char *value)
+{
+  fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to generate"
+			    " corefiles is %s.\n"), value);
+}
+
 /* To be called from the various GDB_OSABI_LINUX handlers for the
    various GNU/Linux architectures and machine types.  */
 
@@ -2008,4 +2357,16 @@ _initialize_linux_tdep (void)
   /* Observers used to invalidate the cache when needed.  */
   observer_attach_inferior_exit (invalidate_linux_cache_inf);
   observer_attach_inferior_appeared (invalidate_linux_cache_inf);
+
+  add_setshow_boolean_cmd ("use-coredump-filter", class_files,
+			   &use_coredump_filter, _("\
+Set whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Show whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Use this command to set whether gcore should consider the contents\n\
+of /proc/PID/coredump_filter when generating the corefile.  For more information\n\
+about this file, refer to the manpage of core(5)."),
+			   NULL, show_use_coredump_filter,
+			   &setlist, &showlist);
 }

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-20 19:46   ` Pedro Alves
@ 2015-03-20 21:03     ` Sergio Durigan Junior
  2015-03-20 22:09       ` Pedro Alves
  0 siblings, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-20 21:03 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Friday, March 20 2015, Pedro Alves wrote:

> On 03/19/2015 11:22 PM, Sergio Durigan Junior wrote:
>
>> ---
>>  gdb/doc/gdb.texinfo                        |  33 ++++++++
>>  gdb/testsuite/gdb.base/coredump-filter.c   |  61 ++++++++++++++
>>  gdb/testsuite/gdb.base/coredump-filter.exp | 128 +++++++++++++++++++++++++++++
>>  3 files changed, 222 insertions(+)
>>  create mode 100644 gdb/testsuite/gdb.base/coredump-filter.c
>>  create mode 100644 gdb/testsuite/gdb.base/coredump-filter.exp
>> 
>> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
>> index 552da31..5382e91 100644
>> --- a/gdb/doc/gdb.texinfo
>> +++ b/gdb/doc/gdb.texinfo
>> @@ -10952,6 +10952,39 @@ specified, the file name defaults to @file{core.@var{pid}}, where
>>  
>>  Note that this command is implemented only for some systems (as of
>>  this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
>> +
>> +On @sc{gnu}/Linux, this command can take into account the value of the
>> +file @file{/proc/@var{pid}/coredump_filter} when generating the core
>> +dump (@pxref{set use-coredump-filter}).
>> +
>> +@kindex set use-coredump-filter
>> +@anchor{set use-coredump-filter}
>> +@item set use-coredump-filter on
>> +@itemx set use-coredump-filter off
>> +Enable or disable the use of the file
>> +@file{/proc/@var{pid}/coredump_filter} when generating core dump
>> +files.  This file is used by the Linux kernel to decide what types of
>> +memory mappings will be dumped or ignored when generating a core dump
>> +file.  @var{pid} is the process ID of a currently running process.
>> +
>> +
>> +To make use of this feature, you have to write in the
>> +@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
>> +which is a bit mask representing the memory mapping types.  If a bit
>> +is set in the bit mask, then the memory mappings of the corresponding
>> +types will be dumped; otherwise, they will be ignored.  For more
>> +information about the bits that can be set in the
>> +@file{/proc/@var{pid}/coredump_filter} file, please refer to the
>> +manpage of @code{core(5)}.
>
> Might be good to mention that the settings are inherited by child
> processes.  Reading this, I thought "wow, do I really need to
> set every time I'm debugging a new pid/process?"

OK, I included a line mentioning this.

>> +    # The variables are 'char', and using it here would be OK because
>> +    # GDB actually reads the contents of the memory (i.e., it
>> +    # dereferences the pointer).  However, to make it clear that we
>> +    # are interested not in the pointer itself, but in the memory it
>> +    # points to, we are using '*(unsigned int *)'.
>> +    gdb_test "print *(unsigned int *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
>> +	"printing $var when core is loaded (should not work)"
>> +    gdb_test "print/x *(unsigned int *) $addr($working_var)" " = $working_value.*" \
>> +	"print/x *$working_var ( = $working_value)"
>
> This comment still gave me pause.  The variables are
> 'char *' not 'char':
>
>   char *private_anon, *shared_anon;
>   char *dont_dump;
>
> so I guess you're referring to the issue that plain "print" would
> assume they are strings and thus deference the pointer, right?

Exactly.

> I honestly think that all that just distracts from what
> we're doing.  Why not just:
>
>    # Access the memory the addresses point to.
>    gdb_test "print *(char *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
>
> I would never ever think to do:
>
>    gdb_test "print (char *) $addr($var)"
>
> to test the contents of what addr points to.  IOW, reading
>
>    # Access the memory the addresses point to.
>    gdb_test "print *(char *) $addr($var)" ...
>
> I'd never really wonder why the leftmost '*' is in there.  It's super
> obvious.
>
> Maybe even throw in an /x for super clarity:
>
>    gdb_test "print /x *(char *) $addr($var)" ...

Yeah, maybe you're right, I think we've got too concerned about
something not really important here.

I replaced the comment by the one you proposed, and used the "print/x"
syntax.

>> +set all_corefiles { { "non-Private-Anonymous" "0x7e" \
>> +			  $non_private_anon_core \
>> +			  "private_anon" \
>> +			  "shared_anon" "0x22" }
>> +    { "non-Shared-Anonymous" "0x7d" \
>> +	  $non_shared_anon_core "shared_anon" \
>> +	  "private_anon" "0x11" }
>> +    { "DoNotDump" "0x33" \
>> +	  $dont_dump_core "dont_dump" \
>> +	  "shared_anon" "0x22" } }
>
> Does this cover the case of making sure we don't dump file-based
> regions?  That's important.

No, it doesn't cover file-backed mappings.  I didn't want to create
yet-another-file during the test.

> If not (I assume not), we could test that by loading the core
> into gdb, but _not_ the program, and then disassembling a function's
> address.  It should fail.  Then load the program and disassemble
> again.  It should work now.  Or something along those lines.

Hm, OK.  I guess I will try this approach, and if it doesn't happen then
I will see about doing a regular file-backed mapping.

I'll submit another revision of the series when I have something.

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter
  2015-03-20 20:02     ` Sergio Durigan Junior
@ 2015-03-20 22:02       ` Pedro Alves
  2015-03-23 18:40         ` Sergio Durigan Junior
  0 siblings, 1 reply; 20+ messages in thread
From: Pedro Alves @ 2015-03-20 22:02 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches

On 03/20/2015 08:02 PM, Sergio Durigan Junior wrote:
>>> >> +     It is worth mentioning that, from all those checks described
>>> >> +     above, the most fragile is the one to see if the file name ends
>>> >> +     with " (deleted)".  This does not necessarily mean that the
>>> >> +     mapping is anonymous, because the deleted file associated with
>>> >> +     the mapping may have been a hard link to another file, for
>>> >> +     example.  The Linux kernel checks to see if "i_nlink == 0", but
>>> >> +     GDB cannot easily do this check.  Therefore, we made a compromise
>> >
>> > Cannot do it easily, or cannot do it at all?
> It cannot do it easily.  If GDB is run by root, then it could (in
> theory) inspect the contents of /proc/PID/map_files and determine
> whether the " (deleted)" file was a hard link or not.

Ah, could you include that in the comment please?
Otherwise every time I read that I'll wonder what wasn't easy.  :-)

Like:

"(..) example.  The Linux kernel checks to see if "i_nlink == 0", but
GDB cannot easily (and normally) do this check (iff running as root,
it could find the mapping in /proc/PID/map_files/ and determine
whether there still are other hard links to the inode/file).
Therefore, we made a compromise (...)"


I quickly tried something like that with "/proc/pid/fd/*", and
unfortunately the link count is always 1, at least on F20's kernel:

$ tail > a&
[1] 12343

[1]+  Stopped                 tail > a
$ ls -l a
-rw-rw-r--. 1 pedro pedro 0 Mar 20 21:29 a
$ ln a b
$ ls -l a
-rw-rw-r--. 2 pedro pedro 0 Mar 20 21:29 a
$ ls -l /proc/12343/fd/1
l-wx------. 1 pedro pedro 64 Mar 20 21:30 /proc/12343/fd/1 -> /tmp/a

If map_files/ works the same, I think we'd have to get the
file's inode number, and look over the whole filesystem for
other files with that inode number (the equivalent
of 'ls -i' 'find / -inum').  Urgh!

Anyway, with the comment tweak, the patch looks good to me.  Thanks
again for doing this.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-20 21:03     ` Sergio Durigan Junior
@ 2015-03-20 22:09       ` Pedro Alves
  2015-03-22 20:46         ` Sergio Durigan Junior
  0 siblings, 1 reply; 20+ messages in thread
From: Pedro Alves @ 2015-03-20 22:09 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches

On 03/20/2015 09:03 PM, Sergio Durigan Junior wrote:
>> > If not (I assume not), we could test that by loading the core
>> > into gdb, but _not_ the program, and then disassembling a function's
>> > address.  It should fail.  Then load the program and disassemble
>> > again.  It should work now.  Or something along those lines.
> Hm, OK.  I guess I will try this approach, and if it doesn't happen then
> I will see about doing a regular file-backed mapping.

Thanks.  If that approach doesn't work (for some odd reason), let's
try to come up with another approach that still makes sure that the
program's .text is not dumped, instead of mapping some other file.
It's important that we don't regress that specifically.

> I'll submit another revision of the series when I have something.

Thanks.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-20 22:09       ` Pedro Alves
@ 2015-03-22 20:46         ` Sergio Durigan Junior
  2015-03-23 20:27           ` Pedro Alves
  0 siblings, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-22 20:46 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Friday, March 20 2015, Pedro Alves wrote:

> On 03/20/2015 09:03 PM, Sergio Durigan Junior wrote:
>>> > If not (I assume not), we could test that by loading the core
>>> > into gdb, but _not_ the program, and then disassembling a function's
>>> > address.  It should fail.  Then load the program and disassemble
>>> > again.  It should work now.  Or something along those lines.
>> Hm, OK.  I guess I will try this approach, and if it doesn't happen then
>> I will see about doing a regular file-backed mapping.
>
> Thanks.  If that approach doesn't work (for some odd reason), let's
> try to come up with another approach that still makes sure that the
> program's .text is not dumped, instead of mapping some other file.
> It's important that we don't regress that specifically.

So, here's what I came up with.  I decided to submit it here, and not
create another revision of the patch series.  Hope that's not a
problem.  (I can certainly send a separate series if you want).

I am now creating corefiles that exclude the file-backed bits as well,
and testing according to your instructions.  I tried to make it as clear
as possible in the .exp code.

WDYT?

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 552da31..bbbc5d4 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -10952,6 +10952,39 @@ specified, the file name defaults to @file{core.@var{pid}}, where
 
 Note that this command is implemented only for some systems (as of
 this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
+
+On @sc{gnu}/Linux, this command can take into account the value of the
+file @file{/proc/@var{pid}/coredump_filter} when generating the core
+dump (@pxref{set use-coredump-filter}).
+
+@kindex set use-coredump-filter
+@anchor{set use-coredump-filter}
+@item set use-coredump-filter on
+@itemx set use-coredump-filter off
+Enable or disable the use of the file
+@file{/proc/@var{pid}/coredump_filter} when generating core dump
+files.  This file is used by the Linux kernel to decide what types of
+memory mappings will be dumped or ignored when generating a core dump
+file.  @var{pid} is the process ID of a currently running process.
+
+To make use of this feature, you have to write in the
+@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
+which is a bit mask representing the memory mapping types.  If a bit
+is set in the bit mask, then the memory mappings of the corresponding
+types will be dumped; otherwise, they will be ignored.  This
+configuration is inherited by child processes.  For more information
+about the bits that can be set in the
+@file{/proc/@var{pid}/coredump_filter} file, please refer to the
+manpage of @code{core(5)}.
+
+By default, this option is @code{on}.  If this option is turned
+@code{off}, @value{GDBN} does not read the @file{coredump_filter} file
+and instead uses the same default value as the Linux kernel in order
+to decide which pages will be dumped in the core dump file.  This
+value is currently @code{0x33}, which means that bits @code{0}
+(anonymous private mappings), @code{1} (anonymous shared mappings),
+@code{4} (ELF headers) and @code{5} (private huge pages) are active.
+This will cause these memory mappings to be dumped automatically.
 @end table
 
 @node Character Sets
diff --git a/gdb/testsuite/gdb.base/coredump-filter.c b/gdb/testsuite/gdb.base/coredump-filter.c
new file mode 100644
index 0000000..192c469
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.c
@@ -0,0 +1,61 @@
+/* Copyright 2015 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <assert.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+
+static void *
+do_mmap (void *addr, size_t size, int prot, int flags, int fd, off_t offset)
+{
+  void *ret = mmap (addr, size, prot, flags, fd, offset);
+
+  assert (ret != NULL);
+  return ret;
+}
+
+int
+main (int argc, char *argv[])
+{
+  const size_t size = 10;
+  const int default_prot = PROT_READ | PROT_WRITE;
+  char *private_anon, *shared_anon;
+  char *dont_dump;
+  int i;
+
+  private_anon = do_mmap (NULL, size, default_prot,
+			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (private_anon, 0x11, size);
+
+  shared_anon = do_mmap (NULL, size, default_prot,
+			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+  memset (shared_anon, 0x22, size);
+
+  dont_dump = do_mmap (NULL, size, default_prot,
+		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (dont_dump, 0x55, size);
+  i = madvise (dont_dump, size, MADV_DONTDUMP);
+  assert_perror (errno);
+  assert (i == 0);
+
+  return 0; /* break-here */
+}
diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
new file mode 100644
index 0000000..2512761
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.exp
@@ -0,0 +1,210 @@
+# Copyright 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile debug] } {
+    untested "could not compile test program"
+    return -1
+}
+
+if { ![runto_main] } {
+    untested "could not run to main"
+    return -1
+}
+
+gdb_breakpoint [gdb_get_line_number "break-here"]
+gdb_continue_to_breakpoint "break-here" ".* break-here .*"
+
+proc do_save_core { filter_flag core ipid } {
+    verbose -log "writing $filter_flag to /proc/$ipid/coredump_filter"
+
+    remote_exec target "sh -c \"echo $filter_flag > /proc/$ipid/coredump_filter\""
+
+    # Generate a corefile
+    gdb_gcore_cmd "$core" "save corefile"
+}
+
+proc do_load_and_test_core { core var working_var working_value } {
+    global hex decimal addr
+
+    set core_loaded [gdb_core_cmd "$core" "load core"]
+    if { $core_loaded == -1 } {
+	fail "loading $core"
+	return
+    }
+
+    # Accessing the memory the addresses point to.
+    gdb_test "print/x *(char *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
+	"printing $var when core is loaded (should not work)"
+    gdb_test "print/x *(char *) $addr($working_var)" " = $working_value.*" \
+	"print/x *$working_var ( = $working_value)"
+}
+
+# We do not do file-backed mappings in the test program, but it is
+# important to test this anyway.  One way of performing the test is to
+# load GDB with a corefile but without a binary, and then ask for the
+# disassemble of a function (i.e., the binary's .text section).  GDB
+# should fail in this case.  However, it must succeed if the binary is
+# provided along with the corefile.  This is what we test here.
+
+proc test_disasm { core address } {
+    global testfile
+
+    # Restarting GDB without loading the binary
+    gdb_exit
+    gdb_start
+
+    set core_loaded [gdb_core_cmd "$core" "load core"]
+    if { $core_loaded == -1 } {
+	fail "loading $core"
+	return
+    }
+
+    gdb_test "disassemble $address" "No function contains specified address." \
+	"disassemble function with corefile and without a binary"
+
+    clean_restart $testfile
+
+    set core_loaded [gdb_core_cmd "$core" "load core"]
+    if { $core_loaded == -1 } {
+	fail "loading $core"
+	return
+    }
+
+    gdb_test "disassemble $address" "Dump of assembler code for function.*" \
+	"disassemble function with corefile and with a binary"
+}
+
+set non_private_anon_core [standard_output_file non-private-anon.gcore]
+set non_shared_anon_core [standard_output_file non-shared-anon.gcore]
+set non_private_file_core [standard_output_file non-private-file.gcore]
+set non_shared_file_core [standard_output_file non-shared-file.gcore]
+set dont_dump_core [standard_output_file dont-dump.gcore]
+
+# We will generate a few corefiles.
+#
+# This list is composed by sub-lists, and their elements are (in
+# order):
+#
+# - name of the test
+# - hexadecimal value to be put in the /proc/PID/coredump_filter file
+# - name of the variable that contains the name of the corefile to be
+#   generated (including the initial $).
+# - name of the variable in the C source code that points to the
+#   memory mapping that will NOT be present in the corefile.
+# - name of a variable in the C source code that points to a memory
+#   mapping that WILL be present in the corefile
+# - corresponding value expected for the above variable
+#
+# This list refers to the corefiles generated by MAP_ANONYMOUS in the
+# test program.
+
+set all_anon_corefiles { { "non-Private-Anonymous" "0x7e" \
+			       $non_private_anon_core \
+			       "private_anon" \
+			       "shared_anon" "0x22" }
+    { "non-Shared-Anonymous" "0x7d" \
+	  $non_shared_anon_core "shared_anon" \
+	  "private_anon" "0x11" }
+    { "DoNotDump" "0x33" \
+	  $dont_dump_core "dont_dump" \
+	  "shared_anon" "0x22" } }
+
+# We do not do file-backed mappings in the test program, but we still
+# generate coredumps to test it.  See the procedure 'test_disasm'
+# above for details.
+#
+# This list is composed by:
+#
+# - name of the test
+# - hexadecimal value to be put in the /proc/PID/coredump_filter file
+# - name of the variable that contains the name of the corefile to be
+#   generated (including the initial $).
+
+set all_file_corefiles { { "non-Private-File-Backed" "0x7b" \
+			       $non_private_file_core }
+    { "non-Shared-File-Backed" "0x77" \
+	  $non_shared_file_core } }
+
+# If corefile loading is not supported, we do not even try to run the
+# tests.
+set core_supported [gdb_gcore_cmd "$non_private_anon_core" "save a corefile"]
+if { !$core_supported } {
+    untested "corefile generation is not supported"
+    return -1
+}
+
+# Getting the inferior's PID
+set infpid ""
+gdb_test_multiple "info inferiors" "getting inferior pid" {
+    -re "process \($decimal\).*\r\n$gdb_prompt $" {
+	set infpid $expect_out(1,string)
+    }
+}
+
+# Get the main function's address
+set main_addr ""
+gdb_test_multiple "print/x &main" "getting main's address" {
+    -re "$decimal = \($hex\)\r\n$gdb_prompt $" {
+	set main_addr $expect_out(1,string)
+    }
+}
+
+# Obtaining the address of each variable that will be checked on each
+# case.
+foreach item $all_anon_corefiles {
+    foreach name [list [lindex $item 3] [lindex $item 4]] {
+	set test "print/x $name"
+	gdb_test_multiple $test $test {
+	    -re " = \($hex\)\r\n$gdb_prompt $" {
+		set addr($name) $expect_out(1,string)
+	    }
+	}
+    }
+}
+
+# Generate corefiles for the "anon" case.
+foreach item $all_anon_corefiles {
+    with_test_prefix "saving corefile for [lindex $item 0]" {
+	do_save_core [lindex $item 1] [subst [lindex $item 2]] $infpid
+    }
+}
+
+# Generate corefiles for the "file" case.
+foreach item $all_file_corefiles {
+    with_test_prefix "saving corefile for [lindex $item 0]" {
+	do_save_core [lindex $item 1] [subst [lindex $item 2]] $infpid
+    }
+}
+
+clean_restart $testfile
+
+foreach item $all_anon_corefiles {
+    with_test_prefix "loading and testing corefile for [lindex $item 0]" {
+	do_load_and_test_core [subst [lindex $item 2]] [lindex $item 3] \
+	    [lindex $item 4] [lindex $item 5]
+    }
+
+    with_test_prefix "disassembling function main for [lindex $item 0]" {
+	test_disasm [subst [lindex $item 2]] $main_addr
+    }
+}
+
+foreach item $all_file_corefiles {
+    with_test_prefix "disassembling function main for [lindex $item 0]" {
+	test_disasm [subst [lindex $item 2]] $main_addr
+    }
+}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter
  2015-03-20 22:02       ` Pedro Alves
@ 2015-03-23 18:40         ` Sergio Durigan Junior
  2015-03-23 20:36           ` Pedro Alves
  0 siblings, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-23 18:40 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Friday, March 20 2015, Pedro Alves wrote:

> On 03/20/2015 08:02 PM, Sergio Durigan Junior wrote:
>>>> >> +     It is worth mentioning that, from all those checks described
>>>> >> +     above, the most fragile is the one to see if the file name ends
>>>> >> +     with " (deleted)".  This does not necessarily mean that the
>>>> >> +     mapping is anonymous, because the deleted file associated with
>>>> >> +     the mapping may have been a hard link to another file, for
>>>> >> +     example.  The Linux kernel checks to see if "i_nlink == 0", but
>>>> >> +     GDB cannot easily do this check.  Therefore, we made a compromise
>>> >
>>> > Cannot do it easily, or cannot do it at all?
>> It cannot do it easily.  If GDB is run by root, then it could (in
>> theory) inspect the contents of /proc/PID/map_files and determine
>> whether the " (deleted)" file was a hard link or not.
>
> Ah, could you include that in the comment please?
> Otherwise every time I read that I'll wonder what wasn't easy.  :-)
>
> Like:
>
> "(..) example.  The Linux kernel checks to see if "i_nlink == 0", but
> GDB cannot easily (and normally) do this check (iff running as root,
> it could find the mapping in /proc/PID/map_files/ and determine
> whether there still are other hard links to the inode/file).
> Therefore, we made a compromise (...)"

Done.  Patch attached.  I will wait until you look at the second patch
(and approve it) in order to commit everything together.

> I quickly tried something like that with "/proc/pid/fd/*", and
> unfortunately the link count is always 1, at least on F20's kernel:
>
> $ tail > a&
> [1] 12343
>
> [1]+  Stopped                 tail > a
> $ ls -l a
> -rw-rw-r--. 1 pedro pedro 0 Mar 20 21:29 a
> $ ln a b
> $ ls -l a
> -rw-rw-r--. 2 pedro pedro 0 Mar 20 21:29 a
> $ ls -l /proc/12343/fd/1
> l-wx------. 1 pedro pedro 64 Mar 20 21:30 /proc/12343/fd/1 -> /tmp/a
>
> If map_files/ works the same, I think we'd have to get the
> file's inode number, and look over the whole filesystem for
> other files with that inode number (the equivalent
> of 'ls -i' 'find / -inum').  Urgh!

Jan has provided a good example of it before.  When you use stat -L, you
can see if the 'Links' count is zero (which means the real file has been
really deleted).  For example:

  lrw-------. 1 sergio sergio 64 Mar 23 14:34 7f2c5911b000-7f2c5911c000 -> /tmp/t2 (deleted)
  [root@bla map_files]# stat -L 7f2c5911b000-7f2c5911c000 
    File: ‘7f2c5911b000-7f2c5911c000’
    Size: 0               Blocks: 0          IO Block: 4096   regular empty file
  Device: 23h/35d         Inode: 217633770   Links: 1

The 'Links' is 1 because, despite the fact that 't2' has been deleted,
the original file 't1' still exists.  However, if I remove 't1':

  [root@bla map_files]# stat -L 7f2c5911b000-7f2c5911c000 
    File: ‘7f2c5911b000-7f2c5911c000’
    Size: 0               Blocks: 0          IO Block: 4096   regular empty file
  Device: 23h/35d         Inode: 217633770   Links: 0

Therefore, it should be possible to use stat even in this case.  But
anyway, this was just FYI, we're not doing that.

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index ea0d4cd..26edd5d 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -35,9 +35,61 @@
 #include "observer.h"
 #include "objfiles.h"
 #include "infcall.h"
+#include "gdbcmd.h"
+#include "gdb_regex.h"
 
 #include <ctype.h>
 
+/* This enum represents the values that the user can choose when
+   informing the Linux kernel about which memory mappings will be
+   dumped in a corefile.  They are described in the file
+   Documentation/filesystems/proc.txt, inside the Linux kernel
+   tree.  */
+
+enum
+  {
+    COREFILTER_ANON_PRIVATE = 1 << 0,
+    COREFILTER_ANON_SHARED = 1 << 1,
+    COREFILTER_MAPPED_PRIVATE = 1 << 2,
+    COREFILTER_MAPPED_SHARED = 1 << 3,
+    COREFILTER_ELF_HEADERS = 1 << 4,
+    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
+    COREFILTER_HUGETLB_SHARED = 1 << 6,
+  };
+
+/* This struct is used to map flags found in the "VmFlags:" field (in
+   the /proc/<PID>/smaps file).  */
+
+struct smaps_vmflags
+  {
+    /* Zero if this structure has not been initialized yet.  It
+       probably means that the Linux kernel being used does not emit
+       the "VmFlags:" field on "/proc/PID/smaps".  */
+
+    unsigned int initialized_p : 1;
+
+    /* Memory mapped I/O area (VM_IO, "io").  */
+
+    unsigned int io_page : 1;
+
+    /* Area uses huge TLB pages (VM_HUGETLB, "ht").  */
+
+    unsigned int uses_huge_tlb : 1;
+
+    /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd").  */
+
+    unsigned int exclude_coredump : 1;
+
+    /* Is this a MAP_SHARED mapping (VM_SHARED, "sh").  */
+
+    unsigned int shared_mapping : 1;
+  };
+
+/* Whether to take the /proc/PID/coredump_filter into account when
+   generating a corefile.  */
+
+static int use_coredump_filter = 1;
+
 /* This enum represents the signals' numbers on a generic architecture
    running the Linux kernel.  The definition of "generic" comes from
    the file <include/uapi/asm-generic/signal.h>, from the Linux kernel
@@ -381,6 +433,231 @@ read_mapping (const char *line,
   *filename = p;
 }
 
+/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
+
+   This function was based on the documentation found on
+   <Documentation/filesystems/proc.txt>, on the Linux kernel.
+
+   Linux kernels before commit
+   834f82e2aa9a8ede94b17b656329f850c1471514 (3.10) do not have this
+   field on smaps.  */
+
+static void
+decode_vmflags (char *p, struct smaps_vmflags *v)
+{
+  char *saveptr;
+  const char *s;
+
+  v->initialized_p = 1;
+  p = skip_to_space (p);
+  p = skip_spaces (p);
+
+  for (s = strtok_r (p, " ", &saveptr);
+       s != NULL;
+       s = strtok_r (NULL, " ", &saveptr))
+    {
+      if (strcmp (s, "io") == 0)
+	v->io_page = 1;
+      else if (strcmp (s, "ht") == 0)
+	v->uses_huge_tlb = 1;
+      else if (strcmp (s, "dd") == 0)
+	v->exclude_coredump = 1;
+      else if (strcmp (s, "sh") == 0)
+	v->shared_mapping = 1;
+    }
+}
+
+/* Return 1 if the memory mapping is anonymous, 0 otherwise.
+
+   FILENAME is the name of the file present in the first line of the
+   memory mapping, in the "/proc/PID/smaps" output.  For example, if
+   the first line is:
+
+   7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770   /path/to/file
+
+   Then FILENAME will be "/path/to/file".  */
+
+static int
+mapping_is_anonymous_p (const char *filename)
+{
+  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
+  static int init_regex_p = 0;
+
+  if (!init_regex_p)
+    {
+      struct cleanup *c = make_cleanup (null_cleanup, NULL);
+
+      /* Let's be pessimistic and assume there will be an error while
+	 compiling the regex'es.  */
+      init_regex_p = -1;
+
+      /* DEV_ZERO_REGEX matches "/dev/zero" filenames (with or
+	 without the "(deleted)" string in the end).  We know for
+	 sure, based on the Linux kernel code, that memory mappings
+	 whose associated filename is "/dev/zero" are guaranteed to be
+	 MAP_ANONYMOUS.  */
+      compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match /dev/zero "
+			     "filename"));
+      /* SHMEM_FILE_REGEX matches "/SYSV%08x" filenames (with or
+	 without the "(deleted)" string in the end).  These filenames
+	 refer to shared memory (shmem), and memory mappings
+	 associated with them are MAP_ANONYMOUS as well.  */
+      compile_rx_or_error (&shmem_file_regex,
+			   "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match shmem "
+			     "filenames"));
+      /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the
+	 Linux kernel's 'n_link == 0' code, which is responsible to
+	 decide if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS'
+	 mapping.  In other words, if FILE_DELETED_REGEX matches, it
+	 does not necessarily mean that we are dealing with an
+	 anonymous shared mapping.  However, there is no easy way to
+	 detect this currently, so this is the best approximation we
+	 have.
+
+	 As a result, GDB will dump readonly pages of deleted
+	 executables when using the default value of coredump_filter
+	 (0x33), while the Linux kernel will not dump those pages.
+	 But we can live with that.  */
+      compile_rx_or_error (&file_deleted_regex, " (deleted)$",
+			   _("Could not compile regex to match "
+			     "'<file> (deleted)'"));
+      /* We will never release these regexes, so just discard the
+	 cleanups.  */
+      discard_cleanups (c);
+
+      /* If we reached this point, then everything succeeded.  */
+      init_regex_p = 1;
+    }
+
+  if (init_regex_p == -1)
+    {
+      const char deleted[] = " (deleted)";
+      size_t del_len = sizeof (deleted) - 1;
+      size_t filename_len = strlen (filename);
+
+      /* There was an error while compiling the regex'es above.  In
+	 order to try to give some reliable information to the caller,
+	 we just try to find the string " (deleted)" in the filename.
+	 If we managed to find it, then we assume the mapping is
+	 anonymous.  */
+      return (filename_len >= del_len
+	      && strcmp (filename + filename_len - del_len, deleted) == 0);
+    }
+
+  if (*filename == '\0'
+      || regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
+      || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
+      || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0)
+    return 1;
+
+  return 0;
+}
+
+/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
+   MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or
+   greater than 0 if it should.
+
+   In a nutshell, this is the logic that we follow in order to decide
+   if a mapping should be dumped or not.
+
+   - If the mapping is associated to a file whose name ends with
+     " (deleted)", or if the file is "/dev/zero", or if it is
+     "/SYSV%08x" (shared memory), or if there is no file associated
+     with it, or if the AnonHugePages: or the Anonymous: fields in the
+     /proc/PID/smaps have contents, then GDB considers this mapping to
+     be anonymous.  Otherwise, GDB considers this mapping to be a
+     file-backed mapping (because there will be a file associated with
+     it).
+ 
+     It is worth mentioning that, from all those checks described
+     above, the most fragile is the one to see if the file name ends
+     with " (deleted)".  This does not necessarily mean that the
+     mapping is anonymous, because the deleted file associated with
+     the mapping may have been a hard link to another file, for
+     example.  The Linux kernel checks to see if "i_nlink == 0", but
+     GDB cannot easily (and normally) do this check (iff running as
+     root, it could find the mapping in /proc/PID/map_files/ and
+     determine whether there still are other hard links to the
+     inode/file).  Therefore, we made a compromise here, and we assume
+     that if the file name ends with " (deleted)", then the mapping is
+     indeed anonymous.  FWIW, this is something the Linux kernel could
+     do better: expose this information in a more direct way.
+ 
+   - If we see the flag "sh" in the "VmFlags:" field (in
+     /proc/PID/smaps), then certainly the memory mapping is shared
+     (VM_SHARED).  If we have access to the VmFlags, and we don't see
+     the "sh" there, then certainly the mapping is private.  However,
+     Linux kernels before commit
+     834f82e2aa9a8ede94b17b656329f850c1471514 (3.10) do not have the
+     "VmFlags:" field; in that case, we use another heuristic: if we
+     see 'p' in the permission flags, then we assume that the mapping
+     is private, even though the presence of the 's' flag there would
+     mean VM_MAYSHARE, which means the mapping could still be private.
+     This should work OK enough, however.  */
+
+static int
+dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
+		int maybe_private_p, int mapping_anon_p, const char *filename)
+{
+  /* Initially, we trust in what we received from our caller.  This
+     value may not be very precise (i.e., it was probably gathered
+     from the permission line in the /proc/PID/smaps list, which
+     actually refers to VM_MAYSHARE, and not VM_SHARED), but it is
+     what we have until we take a look at the "VmFlags:" field
+     (assuming that the version of the Linux kernel being used
+     supports it, of course).  */
+  int private_p = maybe_private_p;
+
+  /* We always dump vDSO and vsyscall mappings, because it's likely that
+     there'll be no file to read the contents from at core load time.
+     The kernel does the same.  */
+  if (strcmp ("[vdso]", filename) == 0
+      || strcmp ("[vsyscall]", filename) == 0)
+    return 1;
+
+  if (v->initialized_p)
+    {
+      /* We never dump I/O mappings.  */
+      if (v->io_page)
+	return 0;
+
+      /* Check if we should exclude this mapping.  */
+      if (v->exclude_coredump)
+	return 0;
+
+      /* Update our notion of whether this mapping is shared or
+	 private based on a trustworthy value.  */
+      private_p = !v->shared_mapping;
+
+      /* HugeTLB checking.  */
+      if (v->uses_huge_tlb)
+	{
+	  if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE))
+	      || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED)))
+	    return 1;
+
+	  return 0;
+	}
+    }
+
+  if (private_p)
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_PRIVATE) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_PRIVATE) != 0;
+    }
+  else
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_SHARED) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_SHARED) != 0;
+    }
+}
+
 /* Implement the "info proc" command.  */
 
 static void
@@ -819,48 +1096,86 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 				void *obfd)
 {
   char mapsfilename[100];
-  char *data;
+  char coredumpfilter_name[100];
+  char *data, *coredumpfilterdata;
+  pid_t pid;
+  /* Default dump behavior of coredump_filter (0x33), according to
+     Documentation/filesystems/proc.txt from the Linux kernel
+     tree.  */
+  unsigned int filterflags = (COREFILTER_ANON_PRIVATE
+			      | COREFILTER_ANON_SHARED
+			      | COREFILTER_ELF_HEADERS
+			      | COREFILTER_HUGETLB_PRIVATE);
 
   /* We need to know the real target PID to access /proc.  */
   if (current_inferior ()->fake_pid_p)
     return 1;
 
-  xsnprintf (mapsfilename, sizeof mapsfilename,
-	     "/proc/%d/smaps", current_inferior ()->pid);
+  pid = current_inferior ()->pid;
+
+  if (use_coredump_filter)
+    {
+      xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name),
+		 "/proc/%d/coredump_filter", pid);
+      coredumpfilterdata = target_fileio_read_stralloc (coredumpfilter_name);
+      if (coredumpfilterdata != NULL)
+	{
+	  sscanf (coredumpfilterdata, "%x", &filterflags);
+	  xfree (coredumpfilterdata);
+	}
+    }
+
+  xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid);
   data = target_fileio_read_stralloc (mapsfilename);
   if (data == NULL)
     {
       /* Older Linux kernels did not support /proc/PID/smaps.  */
-      xsnprintf (mapsfilename, sizeof mapsfilename,
-		 "/proc/%d/maps", current_inferior ()->pid);
+      xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid);
       data = target_fileio_read_stralloc (mapsfilename);
     }
-  if (data)
+
+  if (data != NULL)
     {
       struct cleanup *cleanup = make_cleanup (xfree, data);
-      char *line;
+      char *line, *t;
 
-      line = strtok (data, "\n");
-      while (line)
+      line = strtok_r (data, "\n", &t);
+      while (line != NULL)
 	{
 	  ULONGEST addr, endaddr, offset, inode;
 	  const char *permissions, *device, *filename;
+	  struct smaps_vmflags v;
 	  size_t permissions_len, device_len;
-	  int read, write, exec;
-	  int modified = 0, has_anonymous = 0;
+	  int read, write, exec, private;
+	  int has_anonymous = 0;
+	  int should_dump_p = 0;
+	  int mapping_anon_p;
 
+	  memset (&v, 0, sizeof (v));
 	  read_mapping (line, &addr, &endaddr, &permissions, &permissions_len,
 			&offset, &device, &device_len, &inode, &filename);
+	  mapping_anon_p = mapping_is_anonymous_p (filename);
 
 	  /* Decode permissions.  */
 	  read = (memchr (permissions, 'r', permissions_len) != 0);
 	  write = (memchr (permissions, 'w', permissions_len) != 0);
 	  exec = (memchr (permissions, 'x', permissions_len) != 0);
-
-	  /* Try to detect if region was modified by parsing smaps counters.  */
-	  for (line = strtok (NULL, "\n");
-	       line && line[0] >= 'A' && line[0] <= 'Z';
-	       line = strtok (NULL, "\n"))
+	  /* 'private' here actually means VM_MAYSHARE, and not
+	     VM_SHARED.  In order to know if a mapping is really
+	     private or not, we must check the flag "sh" in the
+	     VmFlags field.  This is done by decode_vmflags.  However,
+	     if we are using a Linux kernel released before the commit
+	     834f82e2aa9a8ede94b17b656329f850c1471514 (3.10), we will
+	     not have the VmFlags there.  In this case, there is
+	     really no way to know if we are dealing with VM_SHARED,
+	     so we just assume that VM_MAYSHARE is enough.  */
+	  private = memchr (permissions, 'p', permissions_len) != 0;
+
+	  /* Try to detect if region should be dumped by parsing smaps
+	     counters.  */
+	  for (line = strtok_r (NULL, "\n", &t);
+	       line != NULL && line[0] >= 'A' && line[0] <= 'Z';
+	       line = strtok_r (NULL, "\n", &t))
 	    {
 	      char keyword[64 + 1];
 
@@ -869,11 +1184,17 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 		  warning (_("Error parsing {s,}maps file '%s'"), mapsfilename);
 		  break;
 		}
+
 	      if (strcmp (keyword, "Anonymous:") == 0)
-		has_anonymous = 1;
-	      if (strcmp (keyword, "Shared_Dirty:") == 0
-		  || strcmp (keyword, "Private_Dirty:") == 0
-		  || strcmp (keyword, "Swap:") == 0
+		{
+		  /* Older Linux kernels did not support the
+		     "Anonymous:" counter.  Check it here.  */
+		  has_anonymous = 1;
+		}
+	      else if (strcmp (keyword, "VmFlags:") == 0)
+		decode_vmflags (line, &v);
+
+	      if (strcmp (keyword, "AnonHugePages:") == 0
 		  || strcmp (keyword, "Anonymous:") == 0)
 		{
 		  unsigned long number;
@@ -884,19 +1205,38 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 			       mapsfilename);
 		      break;
 		    }
-		  if (number != 0)
-		    modified = 1;
+		  if (number > 0)
+		    {
+		      /* Even if we are dealing with a file-backed
+			 mapping, if it contains anonymous pages we
+			 consider it to be an anonymous mapping,
+			 because this is what the Linux kernel does:
+
+			 // Dump segments that have been written to.
+			 if (vma->anon_vma && FILTER(ANON_PRIVATE))
+			 	goto whole;
+		      */
+		      mapping_anon_p = 1;
+		    }
 		}
 	    }
 
-	  /* Older Linux kernels did not support the "Anonymous:" counter.
-	     If it is missing, we can't be sure - dump all the pages.  */
-	  if (!has_anonymous)
-	    modified = 1;
+	  if (has_anonymous)
+	    should_dump_p = dump_mapping_p (filterflags, &v, private,
+					    mapping_anon_p, filename);
+	  else
+	    {
+	      /* Older Linux kernels did not support the "Anonymous:" counter.
+		 If it is missing, we can't be sure - dump all the pages.  */
+	      should_dump_p = 1;
+	    }
 
 	  /* Invoke the callback function to create the corefile segment.  */
-	  func (addr, endaddr - addr, offset, inode,
-		read, write, exec, modified, filename, obfd);
+	  if (should_dump_p)
+	    func (addr, endaddr - addr, offset, inode,
+		  read, write, exec, 1, /* MODIFIED is true because we
+					   want to dump the mapping.  */
+		  filename, obfd);
 	}
 
       do_cleanups (cleanup);
@@ -1972,6 +2312,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot)
   return retval;
 }
 
+/* Display whether the gcore command is using the
+   /proc/PID/coredump_filter file.  */
+
+static void
+show_use_coredump_filter (struct ui_file *file, int from_tty,
+			  struct cmd_list_element *c, const char *value)
+{
+  fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to generate"
+			    " corefiles is %s.\n"), value);
+}
+
 /* To be called from the various GDB_OSABI_LINUX handlers for the
    various GNU/Linux architectures and machine types.  */
 
@@ -2008,4 +2359,16 @@ _initialize_linux_tdep (void)
   /* Observers used to invalidate the cache when needed.  */
   observer_attach_inferior_exit (invalidate_linux_cache_inf);
   observer_attach_inferior_appeared (invalidate_linux_cache_inf);
+
+  add_setshow_boolean_cmd ("use-coredump-filter", class_files,
+			   &use_coredump_filter, _("\
+Set whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Show whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Use this command to set whether gcore should consider the contents\n\
+of /proc/PID/coredump_filter when generating the corefile.  For more information\n\
+about this file, refer to the manpage of core(5)."),
+			   NULL, show_use_coredump_filter,
+			   &setlist, &showlist);
 }

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-22 20:46         ` Sergio Durigan Junior
@ 2015-03-23 20:27           ` Pedro Alves
  2015-03-23 21:08             ` Sergio Durigan Junior
  0 siblings, 1 reply; 20+ messages in thread
From: Pedro Alves @ 2015-03-23 20:27 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches

On 03/22/2015 08:45 PM, Sergio Durigan Junior wrote:

> +# We do not do file-backed mappings in the test program, but it is
> +# important to test this anyway.  One way of performing the test is to
> +# load GDB with a corefile but without a binary, and then ask for the
> +# disassemble of a function (i.e., the binary's .text section).  GDB
> +# should fail in this case.  However, it must succeed if the binary is
> +# provided along with the corefile.  This is what we test here.

It seems like we now just miss the case of corefilter that _does_ request
that the file backed regions are dumped.  In that case, disassembly
should work without the binary.  Could you add that too, please?  We
can e.g., pass a boolean parameter to test_disasm to specify whether
to expect that disassembly works without a program file.

> +
> +proc test_disasm { core address } {
> +    global testfile
> +
> +    # Restarting GDB without loading the binary
> +    gdb_exit
> +    gdb_start
> +
> +    set core_loaded [gdb_core_cmd "$core" "load core"]
> +    if { $core_loaded == -1 } {
> +	fail "loading $core"
> +	return
> +    }
> +
> +    gdb_test "disassemble $address" "No function contains specified address." \
> +	"disassemble function with corefile and without a binary"
> +
> +    clean_restart $testfile
> +
> +    set core_loaded [gdb_core_cmd "$core" "load core"]
> +    if { $core_loaded == -1 } {
> +	fail "loading $core"
> +	return
> +    }
> +
> +    gdb_test "disassemble $address" "Dump of assembler code for function.*" \
> +	"disassemble function with corefile and with a binary"

Looks like there are duplicate test messages here, in the
cases clean_restart, gdb_core_cmd, etc. fail.  You can fix that
with e.g.:

       with_test_prefix "no binary" {
           # Restart GDB without loading the binary.
           gdb_exit
	   gdb_start

	   set core_loaded [gdb_core_cmd "$core" "load core"]
	   if { $core_loaded == -1 } {
	      fail "load $core"
	      return
	   }

	   gdb_test "disassemble $address" "No function contains specified address." \
		"disassemble function"
       }

       with_test_prefix "with binary" {
	   clean_restart $testfile

	   set core_loaded [gdb_core_cmd "$core" "load core"]
	   if { $core_loaded == -1 } {
	      fail "load $core"
	      return
	   }

	   gdb_test "disassemble $address" "No function contains specified address." \
		"disassemble function"
       }

> +# Getting the inferior's PID

"Get".  Period at end.

> +set infpid ""
> +gdb_test_multiple "info inferiors" "getting inferior pid" {
> +    -re "process \($decimal\).*\r\n$gdb_prompt $" {
> +	set infpid $expect_out(1,string)
> +    }
> +}
> +
> +# Get the main function's address

Period.

(I saw a few other similar gerund uses in the file which
read a bit odd to me, but I didn't point them all out.)

This is OK with the missing test added.

Thanks for the patience and for working on this.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter
  2015-03-23 18:40         ` Sergio Durigan Junior
@ 2015-03-23 20:36           ` Pedro Alves
  0 siblings, 0 replies; 20+ messages in thread
From: Pedro Alves @ 2015-03-23 20:36 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches

On 03/23/2015 06:40 PM, Sergio Durigan Junior wrote:
> Done.  Patch attached.  I will wait until you look at the second patch
> (and approve it) in order to commit everything together.

Thanks.  FAOD, this version is OK.

On 03/23/2015 06:40 PM, Sergio Durigan Junior wrote:
> Jan has provided a good example of it before.  When you use stat -L, you
> can see if the 'Links' count is zero (which means the real file has been
> really deleted).  For example:
> 
>   lrw-------. 1 sergio sergio 64 Mar 23 14:34 7f2c5911b000-7f2c5911c000 -> /tmp/t2 (deleted)
>   [root@bla map_files]# stat -L 7f2c5911b000-7f2c5911c000 
>     File: ‘7f2c5911b000-7f2c5911c000’
>     Size: 0               Blocks: 0          IO Block: 4096   regular empty file
>   Device: 23h/35d         Inode: 217633770   Links: 1
> 
> The 'Links' is 1 because, despite the fact that 't2' has been deleted,
> the original file 't1' still exists.  However, if I remove 't1':
> 
>   [root@bla map_files]# stat -L 7f2c5911b000-7f2c5911c000 
>     File: ‘7f2c5911b000-7f2c5911c000’
>     Size: 0               Blocks: 0          IO Block: 4096   regular empty file
>   Device: 23h/35d         Inode: 217633770   Links: 0
> 
> Therefore, it should be possible to use stat even in this case.  But
> anyway, this was just FYI, we're not doing that.

Ah, yes, these files really work as symlinks, so ls -l was just
showing the symlink's own link count.  "ls -l -L" works too.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-23 20:27           ` Pedro Alves
@ 2015-03-23 21:08             ` Sergio Durigan Junior
  2015-03-23 23:06               ` Pedro Alves
  0 siblings, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-23 21:08 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Monday, March 23 2015, Pedro Alves wrote:

> On 03/22/2015 08:45 PM, Sergio Durigan Junior wrote:
>
>> +# We do not do file-backed mappings in the test program, but it is
>> +# important to test this anyway.  One way of performing the test is to
>> +# load GDB with a corefile but without a binary, and then ask for the
>> +# disassemble of a function (i.e., the binary's .text section).  GDB
>> +# should fail in this case.  However, it must succeed if the binary is
>> +# provided along with the corefile.  This is what we test here.
>
> It seems like we now just miss the case of corefilter that _does_ request
> that the file backed regions are dumped.  In that case, disassembly
> should work without the binary.  Could you add that too, please?  We
> can e.g., pass a boolean parameter to test_disasm to specify whether
> to expect that disassembly works without a program file.

Hm, I'm afraid there's a bit of confusion here, at least from my part.

I am already testing the case when we use a value that requests that
file-backed regions are dumped.  If you take a look at the
"all_anon_corefiles" list, you will see that the each corefile generated
there includes everything *except* for the specific type of mapping we
want to ignore (thus the "non_*" names).  And the result of this test is
that GDB cannot disassemble a function without a binary, even if all the
file-backed pages have been dumped.

Having said that, I made a test with git HEAD without my patch.  I
generated a corefile for the same test program, and then loaded only the
corefile:

  $ ./gdb -q -ex 'core ./core.31118' -ex 'disas 0x4007cb'
  ...
  Program terminated with signal SIGTRAP, Trace/breakpoint trap.
  #0  0x0000000000400905 in ?? ()
  No function contains specified address.
  (gdb) 

Which means that, even without my patch, GDB still cannot disassemble a
function without the binary.

FWIW, I did the same test, but this time using a corefile generated by
the Linux kernel (and with all bits set on coredump_filter), and the
results were the same.

>> +
>> +proc test_disasm { core address } {
>> +    global testfile
>> +
>> +    # Restarting GDB without loading the binary
>> +    gdb_exit
>> +    gdb_start
>> +
>> +    set core_loaded [gdb_core_cmd "$core" "load core"]
>> +    if { $core_loaded == -1 } {
>> +	fail "loading $core"
>> +	return
>> +    }
>> +
>> +    gdb_test "disassemble $address" "No function contains specified address." \
>> +	"disassemble function with corefile and without a binary"
>> +
>> +    clean_restart $testfile
>> +
>> +    set core_loaded [gdb_core_cmd "$core" "load core"]
>> +    if { $core_loaded == -1 } {
>> +	fail "loading $core"
>> +	return
>> +    }
>> +
>> +    gdb_test "disassemble $address" "Dump of assembler code for function.*" \
>> +	"disassemble function with corefile and with a binary"
>
> Looks like there are duplicate test messages here, in the
> cases clean_restart, gdb_core_cmd, etc. fail.  You can fix that
> with e.g.:
>
>        with_test_prefix "no binary" {
>            # Restart GDB without loading the binary.
>            gdb_exit
> 	   gdb_start
>
> 	   set core_loaded [gdb_core_cmd "$core" "load core"]
> 	   if { $core_loaded == -1 } {
> 	      fail "load $core"
> 	      return
> 	   }
>
> 	   gdb_test "disassemble $address" "No function contains specified address." \
> 		"disassemble function"
>        }
>
>        with_test_prefix "with binary" {
> 	   clean_restart $testfile
>
> 	   set core_loaded [gdb_core_cmd "$core" "load core"]
> 	   if { $core_loaded == -1 } {
> 	      fail "load $core"
> 	      return
> 	   }
>
> 	   gdb_test "disassemble $address" "No function contains specified address." \
> 		"disassemble function"
>        }

Thanks, fixed.

>> +# Getting the inferior's PID
>
> "Get".  Period at end.

Fixed.

>> +set infpid ""
>> +gdb_test_multiple "info inferiors" "getting inferior pid" {
>> +    -re "process \($decimal\).*\r\n$gdb_prompt $" {
>> +	set infpid $expect_out(1,string)
>> +    }
>> +}
>> +
>> +# Get the main function's address
>
> Period.

Fixed.

> (I saw a few other similar gerund uses in the file which
> read a bit odd to me, but I didn't point them all out.)

I removed all of them, and also added missing periods all over.  Thanks.

> This is OK with the missing test added.

I'll wait until you clarify that comment above :-).  I won't resubmit
the patch now because it only contains fixes to comments.

> Thanks for the patience and for working on this.

Thank you!

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-23 21:08             ` Sergio Durigan Junior
@ 2015-03-23 23:06               ` Pedro Alves
  2015-03-24  6:37                 ` Sergio Durigan Junior
  0 siblings, 1 reply; 20+ messages in thread
From: Pedro Alves @ 2015-03-23 23:06 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches

On 03/23/2015 09:08 PM, Sergio Durigan Junior wrote:
> On Monday, March 23 2015, Pedro Alves wrote:
> 
>> On 03/22/2015 08:45 PM, Sergio Durigan Junior wrote:
>>
>>> +# We do not do file-backed mappings in the test program, but it is
>>> +# important to test this anyway.  One way of performing the test is to
>>> +# load GDB with a corefile but without a binary, and then ask for the
>>> +# disassemble of a function (i.e., the binary's .text section).  GDB
>>> +# should fail in this case.  However, it must succeed if the binary is
>>> +# provided along with the corefile.  This is what we test here.
>>
>> It seems like we now just miss the case of corefilter that _does_ request
>> that the file backed regions are dumped.  In that case, disassembly
>> should work without the binary.  Could you add that too, please?  We
>> can e.g., pass a boolean parameter to test_disasm to specify whether
>> to expect that disassembly works without a program file.
> 
> Hm, I'm afraid there's a bit of confusion here, at least from my part.
> 
> I am already testing the case when we use a value that requests that
> file-backed regions are dumped.  If you take a look at the
> "all_anon_corefiles" list, you will see that the each corefile generated
> there includes everything *except* for the specific type of mapping we
> want to ignore (thus the "non_*" names).  
> And the result of this test is
> that GDB cannot disassemble a function without a binary, even if all the
> file-backed pages have been dumped.

Now I'm confused.  If all the file-backed pages have been dumped,
then aren't the .text present in the core dump?  If that doesn't work,
we've just caught a bug somewhere.

> 
> Having said that, I made a test with git HEAD without my patch.  I
> generated a corefile for the same test program, and then loaded only the
> corefile:
> 
>   $ ./gdb -q -ex 'core ./core.31118' -ex 'disas 0x4007cb'
>   ...
>   Program terminated with signal SIGTRAP, Trace/breakpoint trap.
>   #0  0x0000000000400905 in ?? ()
>   No function contains specified address.
>   (gdb) 
> 
> Which means that, even without my patch, GDB still cannot disassemble a
> function without the binary.

Oh, without symbols, you need to tell "disassemble" an address range
to disassemble, not just an address.  Like, "disassemble 0x4007cb, +10".
Otherwise that fails even before a memory read was ever attempted, while
gdb was looking for the function's boundaries.

I tried poking at coredump_filter now and, and I'm actually seeing
the opposite.  I can always disassemble `main'.

$ gdb segv -c core.22587
...
Core was generated by `./segv'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000004004a5 in main () at segv.c:5
5         *(volatile int *)0;
(gdb) x /i $pc
=> 0x4004a5 <main+9>:   mov    (%rax),%eax

$ gdb -c core.22587
...
(gdb) x /i $pc
=> 0x4004a5:    mov    (%rax),%eax


The reason that works is that `main' happens to end up in the
first page of the text segment, and that one ends up always
dumped, as the dynamic loader touches it...

I do see that kernel generated cores do get bigger if I set
file-backed bits in coredump_filter:

$ ls -s --hu  core.*
2.3M core.22528  112K core.22587

Bah, can't immediately think of a portable way to test this now.

> 
> FWIW, I did the same test, but this time using a corefile generated by
> the Linux kernel (and with all bits set on coredump_filter), and the
> results were the same.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-23 23:06               ` Pedro Alves
@ 2015-03-24  6:37                 ` Sergio Durigan Junior
  2015-03-24 10:12                   ` Pedro Alves
  0 siblings, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-24  6:37 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Monday, March 23 2015, Pedro Alves wrote:

> On 03/23/2015 09:08 PM, Sergio Durigan Junior wrote:
>> On Monday, March 23 2015, Pedro Alves wrote:
>> 
>>> On 03/22/2015 08:45 PM, Sergio Durigan Junior wrote:
>>>
>>>> +# We do not do file-backed mappings in the test program, but it is
>>>> +# important to test this anyway.  One way of performing the test is to
>>>> +# load GDB with a corefile but without a binary, and then ask for the
>>>> +# disassemble of a function (i.e., the binary's .text section).  GDB
>>>> +# should fail in this case.  However, it must succeed if the binary is
>>>> +# provided along with the corefile.  This is what we test here.
>>>
>>> It seems like we now just miss the case of corefilter that _does_ request
>>> that the file backed regions are dumped.  In that case, disassembly
>>> should work without the binary.  Could you add that too, please?  We
>>> can e.g., pass a boolean parameter to test_disasm to specify whether
>>> to expect that disassembly works without a program file.
>> 
>> Hm, I'm afraid there's a bit of confusion here, at least from my part.
>> 
>> I am already testing the case when we use a value that requests that
>> file-backed regions are dumped.  If you take a look at the
>> "all_anon_corefiles" list, you will see that the each corefile generated
>> there includes everything *except* for the specific type of mapping we
>> want to ignore (thus the "non_*" names).  
>> And the result of this test is
>> that GDB cannot disassemble a function without a binary, even if all the
>> file-backed pages have been dumped.
>
> Now I'm confused.  If all the file-backed pages have been dumped,
> then aren't the .text present in the core dump?  If that doesn't work,
> we've just caught a bug somewhere.

The bug was my understanding of the disassemble command :-).

>> 
>> Having said that, I made a test with git HEAD without my patch.  I
>> generated a corefile for the same test program, and then loaded only the
>> corefile:
>> 
>>   $ ./gdb -q -ex 'core ./core.31118' -ex 'disas 0x4007cb'
>>   ...
>>   Program terminated with signal SIGTRAP, Trace/breakpoint trap.
>>   #0  0x0000000000400905 in ?? ()
>>   No function contains specified address.
>>   (gdb) 
>> 
>> Which means that, even without my patch, GDB still cannot disassemble a
>> function without the binary.
>
> Oh, without symbols, you need to tell "disassemble" an address range
> to disassemble, not just an address.  Like, "disassemble 0x4007cb, +10".
> Otherwise that fails even before a memory read was ever attempted, while
> gdb was looking for the function's boundaries.

Thanks, this makes sense.  Therefore, my testcase is wrong, because it
is actually testing whether the disassemble command identifies this case
and reacts accordingly, instead of disassembling main...

> I tried poking at coredump_filter now and, and I'm actually seeing
> the opposite.  I can always disassemble `main'.
>
> $ gdb segv -c core.22587
> ...
> Core was generated by `./segv'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00000000004004a5 in main () at segv.c:5
> 5         *(volatile int *)0;
> (gdb) x /i $pc
> => 0x4004a5 <main+9>:   mov    (%rax),%eax
>
> $ gdb -c core.22587
> ...
> (gdb) x /i $pc
> => 0x4004a5:    mov    (%rax),%eax
>
> The reason that works is that `main' happens to end up in the
> first page of the text segment, and that one ends up always
> dumped, as the dynamic loader touches it...

This is strange.  It should not really matter if the dynamic loader
touches it or not, because GDB with my patch (or the Linux kernel,
AFAIK) is actually interested only if the page should be dumped or not
according to coredump_filter.

> I do see that kernel generated cores do get bigger if I set
> file-backed bits in coredump_filter:
>
> $ ls -s --hu  core.*
> 2.3M core.22528  112K core.22587
>
> Bah, can't immediately think of a portable way to test this now.

So, I did more investigation on this.  Here is what I found.

The Linux kernel uses the bit 4 on the coredump_filter file to determine
whether it should dump ELF headers or not.  According to vma_dump_size:

  /*
   * If this looks like the beginning of a DSO or executable mapping,
   * check for an ELF header.  If we find one, dump the first page to
   * aid in determining what was mapped here.
   */
  if (FILTER(ELF_HEADERS) &&
      vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) {
          u32 __user *header = (u32 __user *) vma->vm_start;
          u32 word;
          mm_segment_t fs = get_fs();
          /*
           * Doing it this way gets the constant folded by GCC.
           */
          union {
                  u32 cmp;
                  char elfmag[SELFMAG];
          } magic;
          BUILD_BUG_ON(SELFMAG != sizeof word);
          magic.elfmag[EI_MAG0] = ELFMAG0;
          magic.elfmag[EI_MAG1] = ELFMAG1;
          magic.elfmag[EI_MAG2] = ELFMAG2;
          magic.elfmag[EI_MAG3] = ELFMAG3;
          /*
           * Switch to the user "segment" for get_user(),
           * then put back what elf_core_dump() had in place.
           */
          set_fs(USER_DS);
          if (unlikely(get_user(word, header)))
                  word = 0;
          set_fs(fs);
          if (word == magic.cmp)
                  return PAGE_SIZE;
  }

So maybe this is what you meant above by "that one ends up always
dumped...", when refering to the first page of the text segment?  Well,
that is partially true: if you unset bit 4, you will see that this page
does not get dumped at all (and therefore we see the "Cannot access
memory..." error; I did some experiments here and confirmed that).

GDB does not honor bit 4, so it will only depend on the file-backed page
to be dumped in order to be able to disassemble things.  And while doing
the tests with my patch, I noticed that it is not always doing the right
thing about anonymous and file-backed mappings (argh).  Sometimes, it is
dumping file-backed private mappings even when I tell it not to do that,
and the reason is this:

  00400000-00401000 r-xp 00000000 fd:03 10914398 /path/to/file
  Size:                  4 kB
  Rss:                   4 kB
  Pss:                   4 kB
  Shared_Clean:          0 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:         4 kB
  Referenced:            4 kB
  Anonymous:             4 kB
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  AnonHugePages:         0 kB
  Swap:                  0 kB
  KernelPageSize:        4 kB
  MMUPageSize:           4 kB
  Locked:                0 kB
  VmFlags: rd ex mr mw me dw sd 

This is the .text segment of the test program.  It is a file-backed
private mapping, *but* it also contains anonymous contents.  For this
reason, my patch is considering this mapping as anonymous, because the
Linux kernel kind of does the same thing (again, from vma_dump_size):

  /* Dump segments that have been written to.  */
  if (vma->anon_vma && FILTER(ANON_PRIVATE))
          goto whole;

However, if we look below in the code:

  if (vma->vm_file == NULL)
          return 0;

  if (FILTER(MAPPED_PRIVATE))
          goto whole;

Therefore, if *also* considers tha case when the mapping is file-backed
private (which my patch doesn't do).

All this boils down to: my patch is incorrectly dumping the .text
segment when I ask it not to do that (i.e., when I ask it to ignore
file-backed private mappings and to dump anonymous private mappings),
and it is *not* dumping the .text segment when I ask it to dump it
(i.e., when I ask it to dump file-backed private mappings and to ignore
anonymous private mappings).

So, here's what I propose: I will rework this part of the patch and try
to come up with a better way of identifying these situations (mainly:
when a file-backed mapping has anonymous contents), and I will resubmit
it tomorrow.  Along with that, I should be able to extend the testcase
to cover the disassemble case (and it should start to work fine once I
make those adjustments).

Phew!  What a confusion...  :-/.  I hope things are clearer with this
e-mail.

WDYT?

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-24  6:37                 ` Sergio Durigan Junior
@ 2015-03-24 10:12                   ` Pedro Alves
  2015-03-24 23:15                     ` Sergio Durigan Junior
  0 siblings, 1 reply; 20+ messages in thread
From: Pedro Alves @ 2015-03-24 10:12 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches

On 03/24/2015 06:37 AM, Sergio Durigan Junior wrote:
> 
> The Linux kernel uses the bit 4 on the coredump_filter file to determine
> whether it should dump ELF headers or not.  According to vma_dump_size:

...

> So maybe this is what you meant above by "that one ends up always
> dumped...", when refering to the first page of the text segment?  Well,
> that is partially true: if you unset bit 4, you will see that this page
> does not get dumped at all (and therefore we see the "Cannot access
> memory..." error; I did some experiments here and confirmed that).

Ah, indeed I probably had bit 4 set.

> Therefore, if *also* considers tha case when the mapping is file-backed
> private (which my patch doesn't do).
> 
> All this boils down to: my patch is incorrectly dumping the .text
> segment when I ask it not to do that (i.e., when I ask it to ignore
> file-backed private mappings and to dump anonymous private mappings),
> and it is *not* dumping the .text segment when I ask it to dump it
> (i.e., when I ask it to dump file-backed private mappings and to ignore
> anonymous private mappings).
> 
> So, here's what I propose: I will rework this part of the patch and try
> to come up with a better way of identifying these situations (mainly:
> when a file-backed mapping has anonymous contents), and I will resubmit
> it tomorrow.  Along with that, I should be able to extend the testcase
> to cover the disassemble case (and it should start to work fine once I
> make those adjustments).

Sounds good.

> 
> Phew!  What a confusion...  :-/.  I hope things are clearer with this
> e-mail.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-24 10:12                   ` Pedro Alves
@ 2015-03-24 23:15                     ` Sergio Durigan Junior
  2015-03-24 23:48                       ` Pedro Alves
  0 siblings, 1 reply; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-24 23:15 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Tuesday, March 24 2015, Pedro Alves wrote:

>> Therefore, if *also* considers tha case when the mapping is file-backed
>> private (which my patch doesn't do).
>> 
>> All this boils down to: my patch is incorrectly dumping the .text
>> segment when I ask it not to do that (i.e., when I ask it to ignore
>> file-backed private mappings and to dump anonymous private mappings),
>> and it is *not* dumping the .text segment when I ask it to dump it
>> (i.e., when I ask it to dump file-backed private mappings and to ignore
>> anonymous private mappings).
>> 
>> So, here's what I propose: I will rework this part of the patch and try
>> to come up with a better way of identifying these situations (mainly:
>> when a file-backed mapping has anonymous contents), and I will resubmit
>> it tomorrow.  Along with that, I should be able to extend the testcase
>> to cover the disassemble case (and it should start to work fine once I
>> make those adjustments).
>
> Sounds good.

And I am back, with new investigations and results.  This whole thing is
starting to driving me crazy.

What I found is interesting.  When a memory mapping is file-backed, but
contains Anonymous: pages, the Linux kernel dumps this region when:

  - The user asks for anonymous pages, *OR*

  - The user asks for file-backed pages

Yes, it dumps the mapping in *both* scenarios.  It is like a
"file-backed anonymous" mapping...

I adjusted my patch to mimic this behavior.  However, there is one more
thing...

For some reason, when I run the binary inside GDB, the .text segment
*contains* Anonymous: pages.  However, when I run the binary outside
GDB, the Anonymous: counter is always zero.  This means that, inside
GDB, when we ask for a corefile that excludes file-backed private
mappings (i.e., the .text segment), according to the Linux kernel's
rules, the .text segment *still* should be dumped because it contains
Anonymous: pages.

Unfortunately, I was not able to generate a binary whose .text segment
contained Anonymous: pages outside GDB.  However, I made a binary
that has Anonymous: pages on a file-backed mapping, and I made the Linux
kernel generate a corefile for it while asking it to exclude file-backed
mappings, and I could confirm that the Linux kernel indeed includes this
mapping in the corefile.

My final proposal, which will be reflected in a patch that will be
submitted soon, is to relax the test of the the disassembly of main.
This way, we can still mimic what the Linux kernel does and make GDB
compatible with it.

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] Documentation and testcase
  2015-03-24 23:15                     ` Sergio Durigan Junior
@ 2015-03-24 23:48                       ` Pedro Alves
  0 siblings, 0 replies; 20+ messages in thread
From: Pedro Alves @ 2015-03-24 23:48 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches

On 03/24/2015 11:15 PM, Sergio Durigan Junior wrote:

> And I am back, with new investigations and results.  This whole thing is
> starting to driving me crazy.
> 
> What I found is interesting.  When a memory mapping is file-backed, but
> contains Anonymous: pages, the Linux kernel dumps this region when:
> 
>   - The user asks for anonymous pages, *OR*
> 
>   - The user asks for file-backed pages
> 
> Yes, it dumps the mapping in *both* scenarios.  It is like a
> "file-backed anonymous" mapping...

It makes sense to me.

> 
> I adjusted my patch to mimic this behavior.  However, there is one more
> thing...
> 
> For some reason, when I run the binary inside GDB, the .text segment
> *contains* Anonymous: pages.  However, when I run the binary outside
> GDB, the Anonymous: counter is always zero.  This means that, inside
> GDB, when we ask for a corefile that excludes file-backed private
> mappings (i.e., the .text segment), according to the Linux kernel's
> rules, the .text segment *still* should be dumped because it contains
> Anonymous: pages.

I'd guess those were pages gdb poked/wrote to?  Breakpoints, most
likely.  That should trigger a COW of the poked page.

> Unfortunately, I was not able to generate a binary whose .text segment
> contained Anonymous: pages outside GDB.  However, I made a binary
> that has Anonymous: pages on a file-backed mapping, and I made the Linux
> kernel generate a corefile for it while asking it to exclude file-backed
> mappings, and I could confirm that the Linux kernel indeed includes this
> mapping in the corefile.
> 
> My final proposal, which will be reflected in a patch that will be
> submitted soon, is to relax the test of the the disassembly of main.
> This way, we can still mimic what the Linux kernel does and make GDB
> compatible with it.

Not sure what relax means, but I'll just wait for the patch.  :-)

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v4 0/2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
@ 2015-03-24 23:57 Sergio Durigan Junior
  0 siblings, 0 replies; 20+ messages in thread
From: Sergio Durigan Junior @ 2015-03-24 23:57 UTC (permalink / raw)
  To: GDB Patches; +Cc: Pedro Alves, Sergio Durigan Junior

Hello,

This is the fifth version of this patch series.  It basically includes
fixes to make GDB mimic the Linux kernel's behavior when dealing with
"file-backed mappings with anonymous pages".  As explained on:

  <https://sourceware.org/ml/gdb-patches/2015-03/msg00814.html>

in this situation, the Linux kernel dumps this "special" mapping both
when the user has requested file-backed or anonymous mappings to be
included in the corefile.  Now, GDB does the same thing.

The patch also extends the testcase in order to make sure that,
depending on the type of the corefile we are dealing with, we can or
cannot disassemble a section of the code without loading a binary.  The
approach I took for this is:

  - For corefiles that include the file-backed mappings, GDB should be
    able to disassemble.

  - For corefiles that do not include file-backed nor anonymous
    mappings, GDB should not be able to disassemble.

It was necessary to "relax" the second test because, as explained in the
message linked above, when you load a binary inside GDB its .text
segment contains anonymous pages, so if you ask for a corefile without
file-backed mappings (which, in theory, should exclude the .text
segment), GDB will still include the .text segment because this kind of
mapping is *also* considered an anonymous mapping (the Linux kernel does
the same thing here).

Other than that, the code is the same.  The documentation has been
approved already, but I am including a NEWS entry this time (I wasn't
sure if it was worth it or not, so I decided to go ahead and do it).

OK to apply?

Sergio Durigan Junior (2):
  Implement support for checking /proc/PID/coredump_filter
  Documentation and testcase

 gdb/doc/gdb.texinfo                        |  33 +++
 gdb/linux-tdep.c                           | 455 +++++++++++++++++++++++++++--
 gdb/testsuite/gdb.base/coredump-filter.c   |  61 ++++
 gdb/testsuite/gdb.base/coredump-filter.exp | 198 +++++++++++++
 4 files changed, 719 insertions(+), 28 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/coredump-filter.c
 create mode 100644 gdb/testsuite/gdb.base/coredump-filter.exp

-- 
1.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2015-03-24 23:57 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-19 23:22 [PATCH v4 0/2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Sergio Durigan Junior
2015-03-19 23:22 ` [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter Sergio Durigan Junior
2015-03-20 19:12   ` Pedro Alves
2015-03-20 20:02     ` Sergio Durigan Junior
2015-03-20 22:02       ` Pedro Alves
2015-03-23 18:40         ` Sergio Durigan Junior
2015-03-23 20:36           ` Pedro Alves
2015-03-19 23:22 ` [PATCH 2/2] Documentation and testcase Sergio Durigan Junior
2015-03-20 19:46   ` Pedro Alves
2015-03-20 21:03     ` Sergio Durigan Junior
2015-03-20 22:09       ` Pedro Alves
2015-03-22 20:46         ` Sergio Durigan Junior
2015-03-23 20:27           ` Pedro Alves
2015-03-23 21:08             ` Sergio Durigan Junior
2015-03-23 23:06               ` Pedro Alves
2015-03-24  6:37                 ` Sergio Durigan Junior
2015-03-24 10:12                   ` Pedro Alves
2015-03-24 23:15                     ` Sergio Durigan Junior
2015-03-24 23:48                       ` Pedro Alves
2015-03-24 23:57 [PATCH v4 0/2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Sergio Durigan Junior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).