public inbox for gdb-patches@sourceware.org
* [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefiles/16092)
@ 2015-03-05  3:48 Sergio Durigan Junior
From: Sergio Durigan Junior @ 2015-03-05  3:48 UTC
  To: GDB Patches; +Cc: Jan Kratochvil, Pedro Alves, Oleg Nesterov

Hello,

I have been working on this patch for quite some time, with some
interruptions here and there, but now I think it is ready to be
submitted and pushed upstream.

First of all, thanks to Jan Kratochvil for catching mistakes and to Oleg
Nesterov for helping us understand this marvelous yet obscure world
of the Linux kernel memory mapping scheme.

This patch, as the subject says, extends GDB so that it is able to use
the contents of the file /proc/PID/coredump_filter when generating a
corefile.  This file contains a bit mask that is a representation of the
different types of memory mappings in the Linux kernel; the user can
choose to dump or not dump a certain type of memory mapping by
enabling/disabling the respective bit in the bit mask.  Currently, here
is what is supported:

  bit 0  Dump anonymous private mappings.
  bit 1  Dump anonymous shared mappings.
  bit 2  Dump file-backed private mappings.
  bit 3  Dump file-backed shared mappings.
  bit 4 (since Linux 2.6.24)
         Dump ELF headers.
  bit 5 (since Linux 2.6.28)
         Dump private huge pages.
  bit 6 (since Linux 2.6.28)
         Dump shared huge pages.

(This table has been taken from core(5), but you can also read about it
in Documentation/filesystems/proc.txt inside the Linux kernel source
tree).

The default value for this file, used by the Linux kernel, is 0x33,
which means that bits 0, 1, 4 and 5 are enabled.  This is also the
default that GDB uses with this patch, FWIW.
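
To make the bit mask concrete, here is a minimal sketch in C (it is not
part of the patch) that parses the text of a coredump_filter file and
tests individual bits.  The enumerator names and the function
parse_coredump_filter are made up for illustration, and the sketch
assumes the file holds a single hexadecimal number:

```c
#include <assert.h>
#include <stdio.h>

/* Bit positions as listed in the table above.  The enumerator names
   are illustrative only.  */
enum
  {
    FILTER_ANON_PRIVATE = 1 << 0,
    FILTER_ANON_SHARED = 1 << 1,
    FILTER_MAPPED_PRIVATE = 1 << 2,
    FILTER_MAPPED_SHARED = 1 << 3,
    FILTER_ELF_HEADERS = 1 << 4,
    FILTER_HUGETLB_PRIVATE = 1 << 5,
    FILTER_HUGETLB_SHARED = 1 << 6
  };

/* Parse TEXT (the contents read from a coredump_filter file) as a
   hexadecimal number and store it in *MASK.  Return 1 on success,
   0 if TEXT does not start with a hexadecimal number.  */
static int
parse_coredump_filter (const char *text, unsigned int *mask)
{
  assert (text != NULL && mask != NULL);
  return sscanf (text, "%x", mask) == 1;
}
```

With the value 0x33, testing the mask against FILTER_ANON_PRIVATE,
FILTER_ANON_SHARED, FILTER_ELF_HEADERS and FILTER_HUGETLB_PRIVATE gives
a non-zero result, and the other three bits are clear.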

Well, reading the file is obviously trivial.  The hard part, mind you,
is how to determine the types of the memory mappings.  For that, I
extended the code of gdb/linux-tdep.c:linux_find_memory_regions_full and
made it rely *much more* on the information gathered from
/proc/<PID>/smaps.  This file contains a "verbose dump" of the
inferior's memory mappings, and we were not using as much information as
we could from it.  If you want to read more about this file, take a look
at the proc(5) manpage (I will also write a blog post soon about
everything I had to learn to get this patch done, and when it is ready
I will post it here).

With Oleg's help, we could improve the current algorithm for determining
whether a memory mapping is anonymous/file-backed, private/shared.  GDB
now also respects the MADV_DONTDUMP flag and does not dump a memory
mapping marked as such, and it always dumps the "[vsyscall]" and
"[vdso]" mappings (just like the Linux kernel).

In a nutshell, what the new code is doing is:

- If the mapping is associated to a file whose name ends with "
  (deleted)", or if the file is "/dev/zero", or if it is "/SYSV%08x"
  (shared memory), or if there is no file associated with it, or if the
  AnonHugePages: or the Anonymous: fields in the /proc/PID/smaps have
  contents, then GDB considers this mapping to be anonymous.  Otherwise,
  GDB considers this mapping to be a file-backed mapping (because there
  will be a file associated with it).

  It is worth mentioning that, from all those checks described above,
  the most fragile is the one to see if the file name ends with "
  (deleted)".  This does not necessarily mean that the mapping is
  anonymous, because the deleted file associated with the mapping may
  have been a hard link to another file, for example.  The Linux kernel
  checks to see if "i_nlink == 0", but GDB cannot easily do this check.
  Therefore, we made a compromise here, and we assume that if the file
  name ends with " (deleted)", then the mapping is indeed anonymous.
  FWIW, this is something the Linux kernel could do better: expose this
  information in a more direct way.

- If we see the flag "sh" in the VmFlags: field (in /proc/PID/smaps),
  then the memory mapping is certainly shared (VM_SHARED).  If we have
  access to the VmFlags, and we don't see the "sh" there, then the
  mapping is certainly private.  However, older Linux kernels do not
  have the VmFlags field; in that case, we use another heuristic: if we
  see 'p' in the permission flags, then we assume that the mapping is
  private; otherwise we assume it is shared, even though the 's' flag
  there only means VM_MAYSHARE, so the mapping could still be private.
  This should work well enough, however.
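
The private/shared rule above can be sketched roughly as follows.  This
is a simplified stand-alone illustration, not the code from the patch:
the function name mapping_is_private is hypothetical, and it assumes the
VmFlags text is a list of two-letter tokens separated by spaces, with
NULL standing for "this kernel does not emit VmFlags":

```c
#include <assert.h>
#include <string.h>

/* Return 1 if the mapping should be treated as private, 0 if shared.
   PERMS is the permission field of the mapping (e.g. "rw-p").
   VMFLAGS is the text that follows "VmFlags:", or NULL when the
   kernel does not provide that field.  */
static int
mapping_is_private (const char *perms, const char *vmflags)
{
  assert (perms != NULL);

  if (vmflags != NULL)
    {
      const char *p = vmflags;

      /* Walk the space-separated tokens looking for exactly "sh".  */
      while (*p != '\0')
	{
	  while (*p == ' ')
	    p++;
	  if (p[0] == 's' && p[1] == 'h'
	      && (p[2] == ' ' || p[2] == '\n' || p[2] == '\0'))
	    return 0;
	  while (*p != '\0' && *p != ' ')
	    p++;
	}

      /* VmFlags is available and "sh" is absent: private.  */
      return 1;
    }

  /* Fallback for older kernels: trust the 'p' permission flag.  */
  return strchr (perms, 'p') != NULL;
}
```

Note that a mapping shown as "rw-s" but without "sh" in its VmFlags is
reported as private by this sketch, which is the VM_MAYSHARE versus
VM_SHARED distinction described above.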

As a side effect of this patch, gdb/gcore.c:gcore_create_callback is
also smarter when dumping mappings in the corefile and generating their
section headers.  Before, when we did not know if a memory mapping was
modified or not, we were passing '1' (true) to this function, which
would just assume that the mapping was modified and would be dumped.
Now, we are passing a new value, which represents "I don't know", and
will make the function use an existing heuristic to determine if we
should indeed dump this mapping, or just ignore it.
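
As a rough illustration of the three-state decision described above,
here is a simplified sketch.  The names are hypothetical, the solib
special case is left out, and IN_KNOWN_FILE stands in for the existing
"does this region lie inside a known file on disk" heuristic:

```c
#include <assert.h>

/* Illustrative stand-in for the three states described above.  */
enum mapping_state
  {
    STATE_MODIFIED,
    STATE_UNMODIFIED,
    STATE_UNKNOWN
  };

/* Return 1 if the contents of the mapping should be written to the
   corefile, 0 if only a header without contents should be emitted.
   IN_KNOWN_FILE is a hypothetical flag saying that the region lies
   inside a file on disk that the debugger already knows about.  */
static int
should_write_contents (int read, int write, enum mapping_state state,
		       int in_known_file)
{
  assert (state == STATE_MODIFIED || state == STATE_UNMODIFIED
	  || state == STATE_UNKNOWN);

  /* Unreadable or explicitly unmodified: header only.  */
  if (!read || state == STATE_UNMODIFIED)
    return 0;

  /* "I don't know": fall back to the file-on-disk heuristic.  */
  if (!write && state == STATE_UNKNOWN && in_known_file)
    return 0;

  return 1;
}
```

The point of the third state is the middle branch: a read-only mapping
whose state is unknown is skipped only when its contents can be
recovered from a file, instead of always being treated as modified.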

Finally, it is worth mentioning that I added a new command, 'set
use-coredump-filter on/off'.  When it is 'on', GDB will read the
'coredump_filter' file (if it exists) and use its value; otherwise, it
will use the default value mentioned above (0x33) to decide which memory
mappings to dump.

I am submitting a documentation patch and a testsuite to exercise this
new feature.  I ran a regression test on a Fedora 20 machine (x86_64 and
native-gdbserver), and no regressions were detected.

OK to apply?

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

gdb/ChangeLog:
2015-03-04  Sergio Durigan Junior  <sergiodj@redhat.com>
	    Jan Kratochvil  <jan.kratochvil@redhat.com>
	    Oleg Nesterov  <oleg@redhat.com>

	PR corefiles/16092
	* common/common-defs.h (enum memory_mapping_state): New enum.
	* defs.h (find_memory_region_ftype): Remove 'int modified'
	parameter, replacing by 'enum memory_mapping_state state'.
	* gcore.c (gcore_create_callback): Likewise.  Change 'if/else'
	statements and improve the logic of deciding when to ignore a
	memory mapping.
	(objfile_find_memory_regions): Passing
	'MEMORY_MAPPING_UNKNOWN_STATE' or 'MEMORY_MAPPING_MODIFIED' when
	needed to 'func' callback, instead of saying the memory mapping
	was modified even without knowing it.
	* gnu-nat.c (gnu_find_memory_regions): Likewise.
	* linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
	New enum identifying the various options of the coredump_filter
	file.
	(struct smaps_vmflags): New struct.
	(use_coredump_filter): New variable.
	(decode_vmflags): New function.
	(mapping_is_anonymous_p): Likewise.
	(dump_mapping_p): Likewise.
	(linux_find_memory_region_ftype): Remove 'int modified' parameter,
	replacing by 'enum memory_mapping_state state'.
	(linux_find_memory_regions_full): New variables
	'coredumpfilter_name', 'coredumpfilterdata', 'pid',
	'filterflags'.  Read /proc/<PID>/smaps file; improve parsing of
	its information.  Implement memory mapping filtering based on its
	contents.
	(linux_find_memory_regions_thunk): Remove 'int modified'
	parameter, replacing by 'enum memory_mapping_state state'.
	(linux_make_mappings_callback): Likewise.
	(find_mapping_size): Likewise.
	(show_use_coredump_filter): New function.
	(_initialize_linux_tdep): New command 'set use-coredump-filter'.
	* procfs.c (find_memory_regions_callback): Passing
	'MEMORY_MAPPING_UNKNOWN_STATE' when needed to 'func' callback,
	instead of saying the memory mapping was modified even without
	knowing it.

gdb/doc/ChangeLog:
2015-03-04  Sergio Durigan Junior  <sergiodj@redhat.com>

	PR corefiles/16092
	* gdb.texinfo (gcore): Mention new command 'set
	use-coredump-filter'.
	(set use-coredump-filter): Document new command.

gdb/testsuite/ChangeLog:
2015-03-04  Sergio Durigan Junior  <sergiodj@redhat.com>

	PR corefiles/16092
	* gdb.base/coredump-filter.c: New file.
	* gdb.base/coredump-filter.exp: Likewise.

diff --git a/gdb/common/common-defs.h b/gdb/common/common-defs.h
index 62d9de5..01b05f5 100644
--- a/gdb/common/common-defs.h
+++ b/gdb/common/common-defs.h
@@ -60,4 +60,14 @@
 # define EXTERN_C_POP
 #endif
 
+/* Enum used to inform the state of a memory mapping.  This is used in
+   functions implementing find_memory_region_ftype.  */
+
+enum memory_mapping_state
+  {
+    MEMORY_MAPPING_MODIFIED,
+    MEMORY_MAPPING_UNMODIFIED,
+    MEMORY_MAPPING_UNKNOWN_STATE,
+  };
+
 #endif /* COMMON_DEFS_H */
diff --git a/gdb/defs.h b/gdb/defs.h
index 72512f6..4829b62 100644
--- a/gdb/defs.h
+++ b/gdb/defs.h
@@ -338,7 +338,8 @@ extern void init_source_path (void);
 
 typedef int (*find_memory_region_ftype) (CORE_ADDR addr, unsigned long size,
 					 int read, int write, int exec,
-					 int modified, void *data);
+					 enum memory_mapping_state state,
+					 void *data);
 
 /* * Possible lvalue types.  Like enum language, this should be in
    value.h, but needs to be here for the same reason.  */
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 4b76ce9..e575eae 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -10952,6 +10952,67 @@ specified, the file name defaults to @file{core.@var{pid}}, where
 
 Note that this command is implemented only for some systems (as of
 this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
+
+On @sc{gnu}/Linux, this command can take into account the value of the
+file @file{/proc/@var{pid}/coredump_filter} when generating the core
+dump (@pxref{set use-coredump-filter}).
+
+@kindex set use-coredump-filter
+@anchor{set use-coredump-filter}
+@item set use-coredump-filter on
+@itemx set use-coredump-filter off
+Enable or disable the use of the file
+@file{/proc/@var{pid}/coredump_filter} when generating core dump
+files.  This file is used by the Linux kernel to decide what types of
+memory mappings will be dumped or ignored when generating a core dump
+file.
+
+To make use of this feature, you have to write in the
+@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
+which is a bit mask representing the memory mapping types.  If a bit
+is set in the bit mask, then the memory mappings of the corresponding
+types will be dumped; otherwise, they will be ignored.  The bits in
+this bit mask have the following meanings:
+
+@table @code
+@item bit 0
+Dump anonymous private mappings.
+@item bit 1
+Dump anonymous shared mappings.
+@item bit 2
+Dump file-backed private mappings.
+@item bit 3
+Dump file-backed shared mappings.
+@item bit 4
+(since Linux 2.6.24)
+Dump ELF headers. (@value{GDBN} does not take this bit into account)
+@item bit 5
+(since Linux 2.6.28)
+Dump private huge pages.
+@item bit 6
+(since Linux 2.6.28)
+Dump shared huge pages.
+@end table
+
+For example, supposing that the @code{pid} of the program being
+debugged is @code{1234}, if you wanted to dump everything except the
+anonymous private and the file-backed shared mappings, you would do:
+
+@smallexample
+$ echo 0x76 > /proc/1234/coredump_filter
+@end smallexample
+
+For more documentation about how to use the @file{coredump_filter}
+file, see the manpage of @code{proc(5)}.
+
+By default, this option is @code{on}.  If this option is turned
+@code{off}, @value{GDBN} will not read the @file{coredump_filter}
+file, but it uses the same default value as the Linux kernel in order
+to decide which pages will be dumped in the core dump file.  This
+value currently is @code{0x33}, which means that the bits @code{0}
+(anonymous private mappings), @code{1} (anonymous shared mappings),
+@code{4} (ELF headers) and @code{5} (private huge pages) are active.
+This will cause these memory mappings to be dumped automatically.
 @end table
 
 @node Character Sets
diff --git a/gdb/gcore.c b/gdb/gcore.c
index 1ebff2a..9edcf40 100644
--- a/gdb/gcore.c
+++ b/gdb/gcore.c
@@ -408,27 +408,22 @@ make_output_phdrs (bfd *obfd, asection *osec, void *ignored)
 
 static int
 gcore_create_callback (CORE_ADDR vaddr, unsigned long size, int read,
-		       int write, int exec, int modified, void *data)
+		       int write, int exec, enum memory_mapping_state state,
+		       void *data)
 {
   bfd *obfd = data;
   asection *osec;
   flagword flags = SEC_ALLOC | SEC_HAS_CONTENTS | SEC_LOAD;
 
-  /* If the memory segment has no permissions set, ignore it, otherwise
-     when we later try to access it for read/write, we'll get an error
-     or jam the kernel.  */
-  if (read == 0 && write == 0 && exec == 0 && modified == 0)
-    {
-      if (info_verbose)
-        {
-          fprintf_filtered (gdb_stdout, "Ignore segment, %s bytes at %s\n",
-                            plongest (size), paddress (target_gdbarch (), vaddr));
-        }
-
-      return 0;
-    }
-
-  if (write == 0 && modified == 0 && !solib_keep_data_in_core (vaddr, size))
+  /* If the memory segment has no read permission set, or if it has
+     been marked as unmodified, then we have to generate a segment
+     header for it, but without contents (i.e., FileSiz = 0),
+     otherwise when we later try to access it for read/write, we'll
+     get an error or jam the kernel.  */
+  if (read == 0 || state == MEMORY_MAPPING_UNMODIFIED)
+    flags &= ~(SEC_LOAD | SEC_HAS_CONTENTS);
+  else if (write == 0 && state == MEMORY_MAPPING_UNKNOWN_STATE
+	   && !solib_keep_data_in_core (vaddr, size))
     {
       /* See if this region of memory lies inside a known file on disk.
 	 If so, we can avoid copying its contents by clearing SEC_LOAD.  */
@@ -521,7 +516,8 @@ objfile_find_memory_regions (struct target_ops *self,
 			 1, /* All sections will be readable.  */
 			 (flags & SEC_READONLY) == 0, /* Writable.  */
 			 (flags & SEC_CODE) != 0, /* Executable.  */
-			 1, /* MODIFIED is unknown, pass it as true.  */
+			 MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is
+							 unknown.  */
 			 obfd);
 	  if (ret != 0)
 	    return ret;
@@ -534,7 +530,7 @@ objfile_find_memory_regions (struct target_ops *self,
 	     1, /* Stack section will be readable.  */
 	     1, /* Stack section will be writable.  */
 	     0, /* Stack section will not be executable.  */
-	     1, /* Stack section will be modified.  */
+	     MEMORY_MAPPING_MODIFIED, /* Stack section will be modified.  */
 	     obfd);
 
   /* Make a heap segment.  */
@@ -543,7 +539,7 @@ objfile_find_memory_regions (struct target_ops *self,
 	     1, /* Heap section will be readable.  */
 	     1, /* Heap section will be writable.  */
 	     0, /* Heap section will not be executable.  */
-	     1, /* Heap section will be modified.  */
+	     MEMORY_MAPPING_MODIFIED, /* Heap section will be modified.  */
 	     obfd);
 
   return 0;
diff --git a/gdb/gnu-nat.c b/gdb/gnu-nat.c
index d830773..60612a7 100644
--- a/gdb/gnu-nat.c
+++ b/gdb/gnu-nat.c
@@ -2611,7 +2611,7 @@ gnu_find_memory_regions (struct target_ops *self,
 		     last_protection & VM_PROT_READ,
 		     last_protection & VM_PROT_WRITE,
 		     last_protection & VM_PROT_EXECUTE,
-		     1, /* MODIFIED is unknown, pass it as true.  */
+		     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 		     data);
 	  last_region_address = region_address;
 	  last_region_end = region_address += region_length;
@@ -2625,7 +2625,7 @@ gnu_find_memory_regions (struct target_ops *self,
 	     last_protection & VM_PROT_READ,
 	     last_protection & VM_PROT_WRITE,
 	     last_protection & VM_PROT_EXECUTE,
-	     1, /* MODIFIED is unknown, pass it as true.  */
+	     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 	     data);
 
   return 0;
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index d9884f3..2225b81 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -35,9 +35,58 @@
 #include "observer.h"
 #include "objfiles.h"
 #include "infcall.h"
+#include "gdbcmd.h"
+#include "gdb_regex.h"
 
 #include <ctype.h>
 
+/* This enum represents the values that the user can choose when
+   informing the Linux kernel about which memory mappings will be
+   dumped in a corefile.  They are described in the file
+   Documentation/filesystems/proc.txt, inside the Linux kernel
+   tree.  */
+
+enum
+  {
+    COREFILTER_ANON_PRIVATE = 1 << 0,
+    COREFILTER_ANON_SHARED = 1 << 1,
+    COREFILTER_MAPPED_PRIVATE = 1 << 2,
+    COREFILTER_MAPPED_SHARED = 1 << 3,
+    COREFILTER_ELF_HEADERS = 1 << 4,
+    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
+    COREFILTER_HUGETLB_SHARED = 1 << 6,
+  };
+
+struct smaps_vmflags
+  {
+    /* Zero if this structure has not been initialized yet.  It
+       probably means that the Linux kernel being used does not emit
+       the "VmFlags:" field on "/proc/PID/smaps".  */
+
+    unsigned int initialized_p : 1;
+
+    /* Memory mapped I/O area (VM_IO, "io").  */
+
+    unsigned int io_page : 1;
+
+    /* Area uses huge TLB pages (VM_HUGETLB, "ht").  */
+
+    unsigned int uses_huge_tlb : 1;
+
+    /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd").  */
+
+    unsigned int exclude_coredump : 1;
+
+    /* Is this a MAP_SHARED mapping (VM_SHARED, "sh").  */
+
+    unsigned int shared_mapping : 1;
+  };
+
+/* Whether to take the /proc/PID/coredump_filter into account when
+   generating a corefile.  */
+
+static int use_coredump_filter = 1;
+
 /* This enum represents the signals' numbers on a generic architecture
    running the Linux kernel.  The definition of "generic" comes from
    the file <include/uapi/asm-generic/signal.h>, from the Linux kernel
@@ -381,6 +430,159 @@ read_mapping (const char *line,
   *filename = p;
 }
 
+/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
+
+   This function was based on the documentation found on
+   <Documentation/filesystems/proc.txt>, on the Linux kernel.
+
+   Linux kernels before commit
+   834f82e2aa9a8ede94b17b656329f850c1471514 do not have this field on
+   smaps.  */
+
+static void
+decode_vmflags (char *p, struct smaps_vmflags *v)
+{
+  char *saveptr;
+  char *s;
+
+  v->initialized_p = 1;
+  p = skip_to_space (p);
+  p = skip_spaces (p);
+
+  for (s = strtok_r (p, " ", &saveptr);
+       s != NULL;
+       s = strtok_r (NULL, " ", &saveptr))
+    {
+      if (strcmp (s, "io") == 0)
+	v->io_page = 1;
+      else if (strcmp (s, "ht") == 0)
+	v->uses_huge_tlb = 1;
+      else if (strcmp (s, "dd") == 0)
+	v->exclude_coredump = 1;
+      else if (strcmp (s, "sh") == 0)
+	v->shared_mapping = 1;
+    }
+}
+
+/* Return 1 if the memory mapping is anonymous, 0 otherwise.
+
+   FILENAME is the name of the file present in the first line of the
+   memory mapping, in the "/proc/PID/smaps" output.  For example, if
+   the first line is:
+
+   7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770   /path/to/file
+
+   Then FILENAME will be "/path/to/file".  */
+
+static int
+mapping_is_anonymous_p (const char *filename)
+{
+  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
+  static int init_regex_p = 0;
+
+  if (!init_regex_p)
+    {
+      struct cleanup *c = make_cleanup (null_cleanup, NULL);
+
+      init_regex_p = 1;
+      compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match /dev/zero "
+			     "filename"));
+      compile_rx_or_error (&shmem_file_regex,
+			   "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match shmem "
+			     "filenames"));
+      /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the
+	 Linux kernel's 'i_nlink == 0' code, which is responsible for
+	 deciding if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS'
+	 mapping.  In other words, if FILE_DELETED_REGEX matches, it
+	 does not necessarily mean that we are dealing with an
+	 anonymous shared mapping.  However, there is no easy way to
+	 detect this currently, so this is the best approximation we
+	 have.
+
+	 As a result, GDB will dump readonly pages of deleted
+	 executables when using the default value of coredump_filter
+	 (0x33), while the Linux kernel will not dump those pages.
+	 But we can live with that.  */
+      compile_rx_or_error (&file_deleted_regex, " (deleted)$",
+			   _("Could not compile regex to match "
+			     "'<file> (deleted)'"));
+      /* We will never release these regexes, so just discard the
+	 cleanups.  */
+      discard_cleanups (c);
+    }
+
+  if (*filename == '\0'
+      || regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
+      || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
+      || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0)
+    return 1;
+
+  return 0;
+}
+
+/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
+   MAYBE_PRIVATE_P, MAPPING_ANON_P, and FILENAME) should not be dumped,
+   or greater than 0 if it should.  */
+
+static int
+dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
+		int maybe_private_p, int mapping_anon_p, const char *filename)
+{
+  /* Initially, we trust in what we received from outside.  This value
+     may not be very precise (i.e., it was probably gathered from the
+     permission line in the /proc/PID/smaps list, which actually
+     refers to VM_MAYSHARE, and not VM_SHARED), but it is what we have
+     for now.  */
+  int private_p = maybe_private_p;
+
+  /* We always dump vDSO and vsyscall mappings.  */
+  if (strcmp ("[vdso]", filename) == 0
+      || strcmp ("[vsyscall]", filename) == 0)
+    return 1;
+
+  if (v->initialized_p)
+    {
+      /* We never dump I/O mappings.  */
+      if (v->io_page)
+	return 0;
+
+      /* Check if we should exclude this mapping.  */
+      if (v->exclude_coredump)
+	return 0;
+
+      /* Updating our notion of whether this mapping is shared or
+	 private based on a trustworthy value.  */
+      private_p = !v->shared_mapping;
+
+      /* HugeTLB checking.  */
+      if (v->uses_huge_tlb)
+	{
+	  if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE))
+	      || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED)))
+	    return 1;
+
+	  return 0;
+	}
+    }
+
+  if (private_p)
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_PRIVATE) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_PRIVATE) != 0;
+    }
+  else
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_SHARED) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_SHARED) != 0;
+    }
+}
+
 /* Implement the "info proc" command.  */
 
 static void
@@ -807,7 +1009,8 @@ linux_core_info_proc (struct gdbarch *gdbarch, const char *args,
 typedef int linux_find_memory_region_ftype (ULONGEST vaddr, ULONGEST size,
 					    ULONGEST offset, ULONGEST inode,
 					    int read, int write,
-					    int exec, int modified,
+					    int exec,
+					    enum memory_mapping_state state,
 					    const char *filename,
 					    void *data);
 
@@ -819,48 +1022,84 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 				void *obfd)
 {
   char mapsfilename[100];
-  char *data;
+  char coredumpfilter_name[100];
+  char *data, *coredumpfilterdata;
+  pid_t pid;
+  /* Default dump behavior of coredump_filter (0x33), according to
+     Documentation/filesystems/proc.txt from the Linux kernel
+     tree.  */
+  unsigned int filterflags = (COREFILTER_ANON_PRIVATE
+			      | COREFILTER_ANON_SHARED
+			      | COREFILTER_ELF_HEADERS
+			      | COREFILTER_HUGETLB_PRIVATE);
 
   /* We need to know the real target PID to access /proc.  */
   if (current_inferior ()->fake_pid_p)
     return 1;
 
-  xsnprintf (mapsfilename, sizeof mapsfilename,
-	     "/proc/%d/smaps", current_inferior ()->pid);
+  pid = current_inferior ()->pid;
+
+  if (use_coredump_filter)
+    {
+      xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name),
+		 "/proc/%d/coredump_filter", pid);
+      coredumpfilterdata = target_fileio_read_stralloc (coredumpfilter_name);
+      if (coredumpfilterdata != NULL)
+	{
+	  sscanf (coredumpfilterdata, "%x", &filterflags);
+	  xfree (coredumpfilterdata);
+	}
+    }
+
+  xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid);
   data = target_fileio_read_stralloc (mapsfilename);
   if (data == NULL)
     {
       /* Older Linux kernels did not support /proc/PID/smaps.  */
-      xsnprintf (mapsfilename, sizeof mapsfilename,
-		 "/proc/%d/maps", current_inferior ()->pid);
+      xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid);
       data = target_fileio_read_stralloc (mapsfilename);
     }
-  if (data)
+
+  if (data != NULL)
     {
       struct cleanup *cleanup = make_cleanup (xfree, data);
-      char *line;
+      char *line, *t;
 
-      line = strtok (data, "\n");
-      while (line)
+      line = strtok_r (data, "\n", &t);
+      while (line != NULL)
 	{
 	  ULONGEST addr, endaddr, offset, inode;
 	  const char *permissions, *device, *filename;
+	  struct smaps_vmflags v;
 	  size_t permissions_len, device_len;
-	  int read, write, exec;
-	  int modified = 0, has_anonymous = 0;
+	  int read, write, exec, private;
+	  enum memory_mapping_state state;
+	  int has_anonymous = 0;
+	  int mapping_anon_p;
 
+	  memset (&v, 0, sizeof (v));
 	  read_mapping (line, &addr, &endaddr, &permissions, &permissions_len,
 			&offset, &device, &device_len, &inode, &filename);
+	  mapping_anon_p = mapping_is_anonymous_p (filename);
 
 	  /* Decode permissions.  */
 	  read = (memchr (permissions, 'r', permissions_len) != 0);
 	  write = (memchr (permissions, 'w', permissions_len) != 0);
 	  exec = (memchr (permissions, 'x', permissions_len) != 0);
+	  /* 'private' here actually means VM_MAYSHARE, and not
+	     VM_SHARED.  In order to know if a mapping is really
+	     private or not, we must check the flag "sh" in the
+	     VmFlags field.  This is done by decode_vmflags.  However,
+	     if we are using an old Linux kernel, we will not have the
+	     VmFlags there.  In this case, there is really no way to
+	     know if we are dealing with VM_SHARED, so we just assume
+	     that VM_MAYSHARE is enough.  */
+	  private = memchr (permissions, 'p', permissions_len) != 0;
 
 	  /* Try to detect if region was modified by parsing smaps counters.  */
-	  for (line = strtok (NULL, "\n");
-	       line && line[0] >= 'A' && line[0] <= 'Z';
-	       line = strtok (NULL, "\n"))
+	  for (line = strtok_r (NULL, "\n", &t);
+	       line != NULL && line[0] >= 'A' && line[0] <= 'Z';
+	       line = strtok_r (NULL, "\n", &t))
 	    {
 	      char keyword[64 + 1];
 
@@ -869,11 +1108,17 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 		  warning (_("Error parsing {s,}maps file '%s'"), mapsfilename);
 		  break;
 		}
+
 	      if (strcmp (keyword, "Anonymous:") == 0)
-		has_anonymous = 1;
-	      if (strcmp (keyword, "Shared_Dirty:") == 0
-		  || strcmp (keyword, "Private_Dirty:") == 0
-		  || strcmp (keyword, "Swap:") == 0
+		{
+		  /* Older Linux kernels did not support the
+		     "Anonymous:" counter.  Check it here.  */
+		  has_anonymous = 1;
+		}
+	      else if (strcmp (keyword, "VmFlags:") == 0)
+		decode_vmflags (line, &v);
+
+	      if (strcmp (keyword, "AnonHugePages:") == 0
 		  || strcmp (keyword, "Anonymous:") == 0)
 		{
 		  unsigned long number;
@@ -884,19 +1129,43 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 			       mapsfilename);
 		      break;
 		    }
-		  if (number != 0)
-		    modified = 1;
+		  if (number > 0)
+		    {
+		      /* Even if we are dealing with a file-backed
+			 mapping, if it contains anonymous pages we
+			 consider it to be an anonymous mapping,
+			 because this is what the Linux kernel does:
+
+			 // Dump segments that have been written to.
+			 if (vma->anon_vma && FILTER(ANON_PRIVATE))
+			 	goto whole;
+		      */
+		      mapping_anon_p = 1;
+		    }
 		}
 	    }
 
-	  /* Older Linux kernels did not support the "Anonymous:" counter.
-	     If it is missing, we can't be sure - dump all the pages.  */
-	  if (!has_anonymous)
-	    modified = 1;
+	  /* If a mapping should not be dumped we still should create
+	     a segment for it, just without SEC_LOAD (see
+	     gcore_create_callback).  */
+	  if (has_anonymous)
+	    {
+	      if (dump_mapping_p (filterflags, &v, private, mapping_anon_p,
+				  filename))
+		state = MEMORY_MAPPING_MODIFIED;
+	      else
+		state = MEMORY_MAPPING_UNMODIFIED;
+	    }
+	  else
+	    {
+	      /* Older Linux kernels did not support the "Anonymous:" counter.
+		 If it is missing, we can't be sure - dump all the pages.  */
+	      state = MEMORY_MAPPING_UNKNOWN_STATE;
+	    }
 
 	  /* Invoke the callback function to create the corefile segment.  */
 	  func (addr, endaddr - addr, offset, inode,
-		read, write, exec, modified, filename, obfd);
+		read, write, exec, state, filename, obfd);
 	}
 
       do_cleanups (cleanup);
@@ -926,12 +1195,13 @@ struct linux_find_memory_regions_data
 static int
 linux_find_memory_regions_thunk (ULONGEST vaddr, ULONGEST size,
 				 ULONGEST offset, ULONGEST inode,
-				 int read, int write, int exec, int modified,
+				 int read, int write, int exec,
+				 enum memory_mapping_state state,
 				 const char *filename, void *arg)
 {
   struct linux_find_memory_regions_data *data = arg;
 
-  return data->func (vaddr, size, read, write, exec, modified, data->obfd);
+  return data->func (vaddr, size, read, write, exec, state, data->obfd);
 }
 
 /* A variant of linux_find_memory_regions_full that is suitable as the
@@ -1074,7 +1344,8 @@ static linux_find_memory_region_ftype linux_make_mappings_callback;
 static int
 linux_make_mappings_callback (ULONGEST vaddr, ULONGEST size,
 			      ULONGEST offset, ULONGEST inode,
-			      int read, int write, int exec, int modified,
+			      int read, int write, int exec,
+			      enum memory_mapping_state state,
 			      const char *filename, void *data)
 {
   struct linux_make_mappings_data *map_data = data;
@@ -1869,7 +2140,8 @@ linux_gdb_signal_to_target (struct gdbarch *gdbarch,
 
 static int
 find_mapping_size (CORE_ADDR vaddr, unsigned long size,
-		   int read, int write, int exec, int modified,
+		   int read, int write, int exec,
+		   enum memory_mapping_state state,
 		   void *data)
 {
   struct mem_range *range = data;
@@ -1969,6 +2241,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot)
   return retval;
 }
 
+/* Display whether the gcore command is using the
+   /proc/PID/coredump_filter file.  */
+
+static void
+show_use_coredump_filter (struct ui_file *file, int from_tty,
+			  struct cmd_list_element *c, const char *value)
+{
+  fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to generate"
+			    " corefiles is %s.\n"), value);
+}
+
 /* To be called from the various GDB_OSABI_LINUX handlers for the
    various GNU/Linux architectures and machine types.  */
 
@@ -2005,4 +2288,16 @@ _initialize_linux_tdep (void)
   /* Observers used to invalidate the cache when needed.  */
   observer_attach_inferior_exit (invalidate_linux_cache_inf);
   observer_attach_inferior_appeared (invalidate_linux_cache_inf);
+
+  add_setshow_boolean_cmd ("use-coredump-filter", class_files,
+			   &use_coredump_filter, _("\
+Set whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Show whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Use this command to set whether gcore should consider the contents\n\
+of /proc/PID/coredump_filter when generating the corefile.  For more information\n\
+about this file, refer to the manpage of core(5)."),
+			   NULL, show_use_coredump_filter,
+			   &setlist, &showlist);
 }
diff --git a/gdb/procfs.c b/gdb/procfs.c
index b62539f..d074dd3 100644
--- a/gdb/procfs.c
+++ b/gdb/procfs.c
@@ -4967,7 +4967,7 @@ find_memory_regions_callback (struct prmap *map,
 		  (map->pr_mflags & MA_READ) != 0,
 		  (map->pr_mflags & MA_WRITE) != 0,
 		  (map->pr_mflags & MA_EXEC) != 0,
-		  1, /* MODIFIED is unknown, pass it as true.  */
+		  MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 		  data);
 }
 
diff --git a/gdb/testsuite/gdb.base/coredump-filter.c b/gdb/testsuite/gdb.base/coredump-filter.c
new file mode 100644
index 0000000..192c469
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.c
@@ -0,0 +1,61 @@
+/* Copyright 2015 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <assert.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+
+static void *
+do_mmap (void *addr, size_t size, int prot, int flags, int fd, off_t offset)
+{
+  void *ret = mmap (addr, size, prot, flags, fd, offset);
+
+  assert (ret != MAP_FAILED);
+  return ret;
+}
+
+int
+main (int argc, char *argv[])
+{
+  const size_t size = 10;
+  const int default_prot = PROT_READ | PROT_WRITE;
+  char *private_anon, *shared_anon;
+  char *dont_dump;
+  int i;
+
+  private_anon = do_mmap (NULL, size, default_prot,
+			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (private_anon, 0x11, size);
+
+  shared_anon = do_mmap (NULL, size, default_prot,
+			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+  memset (shared_anon, 0x22, size);
+
+  dont_dump = do_mmap (NULL, size, default_prot,
+		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (dont_dump, 0x55, size);
+  i = madvise (dont_dump, size, MADV_DONTDUMP);
+  assert_perror (errno);
+  assert (i == 0);
+
+  return 0; /* break-here */
+}
diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
new file mode 100644
index 0000000..c7ae91d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.exp
@@ -0,0 +1,129 @@
+# Copyright 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile debug] } {
+    untested $testfile.exp
+    return -1
+}
+
+if { ![runto_main] } {
+    untested $testfile.exp
+    return -1
+}
+
+gdb_breakpoint [gdb_get_line_number "break-here"]
+gdb_continue_to_breakpoint "break-here" ".* break-here .*"
+
+proc do_save_core { filter_flag core ipid } {
+    verbose -log "writing $filter_flag to /proc/$ipid/coredump_filter"
+    if { [catch {open /proc/$ipid/coredump_filter w} fileid] } {
+	untested $testfile.exp
+	return -1
+    }
+
+    # Set coredump_filter to the value we want
+    puts $fileid $filter_flag
+    close $fileid
+
+    # Generate a corefile
+    gdb_gcore_cmd "$core" "save corefile $core"
+}
+
+proc do_load_and_test_core { core var working_var working_value } {
+    global hex decimal addr
+
+    set core_loaded [gdb_core_cmd "$core" "load $core"]
+    if { $core_loaded == -1 } {
+	fail "loading $core"
+	return
+    }
+
+    # Use 'int' as any variants of 'char' try to read the target bytes.
+    gdb_test "print *(unsigned int *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
+	"printing $var when core is loaded (should not work)"
+    gdb_test "print/x *(unsigned int *) $addr($working_var)" " = $working_value.*" \
+	"print/x *$working_var ( = $working_value)"
+}
+
+set non_private_anon_core [standard_output_file non-private-anon.gcore]
+set non_shared_anon_core [standard_output_file non-shared-anon.gcore]
+set dont_dump_core [standard_output_file dont-dump.gcore]
+
+# We will generate a few corefiles
+#
+# This list is composed of sub-lists, and their elements are (in
+# order):
+#
+# - name of the test
+# - hexadecimal value to be put in the /proc/PID/coredump_filter file
+# - name of the variable that contains the name of the corefile to be
+#   generated (including the initial $).
+# - name of the variable in the C source code that points to the
+#   memory mapping that will NOT be present in the corefile.
+# - name of a variable in the C source code that points to a memory
+#   mapping that WILL be present in the corefile
+# - corresponding value expected for the above variable
+
+set all_corefiles { { "non-Private-Anonymous" "0x7e" \
+			  $non_private_anon_core \
+			  "private_anon" \
+			  "shared_anon" "0x22" }
+    { "non-Shared-Anonymous" "0x7d" \
+	  $non_shared_anon_core "shared_anon" \
+	  "private_anon" "0x11" }
+    { "DoNotDump" "0x33" \
+	  $dont_dump_core "dont_dump" \
+	  "shared_anon" "0x22" } }
+
+set core_supported [gdb_gcore_cmd "$non_private_anon_core" "save a corefile"]
+if { !$core_supported } {
+    untested $testfile.exp
+    return -1
+}
+
+# Getting the inferior's PID
+gdb_test_multiple "info inferiors" "getting inferior pid" {
+    -re "process \($decimal\).*\r\n$gdb_prompt $" {
+	set infpid $expect_out(1,string)
+    }
+}
+
+foreach item $all_corefiles {
+    foreach name [list [lindex $item 3] [lindex $item 4]] {
+	set test "print/x $name"
+	gdb_test_multiple $test $test {
+	    -re " = \($hex\)\r\n$gdb_prompt $" {
+		set addr($name) $expect_out(1,string)
+	    }
+	}
+    }
+}
+
+foreach item $all_corefiles {
+    with_test_prefix "saving corefile for [lindex $item 0]" {
+	do_save_core [lindex $item 1] [subst [lindex $item 2]] $infpid
+    }
+}
+
+clean_restart $testfile
+
+foreach item $all_corefiles {
+    with_test_prefix "loading and testing corefile for [lindex $item 0]" {
+	do_load_and_test_core [subst [lindex $item 2]] [lindex $item 3] \
+	    [lindex $item 4] [lindex $item 5]
+    }
+}


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-05  3:48 [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Sergio Durigan Junior
@ 2015-03-05 15:48 ` Jan Kratochvil
  2015-03-05 20:53   ` Sergio Durigan Junior
  2015-03-12 21:39 ` [PATCH v2] " Sergio Durigan Junior
  2015-03-13 19:37 ` [PATCH] " Pedro Alves
  2 siblings, 1 reply; 46+ messages in thread
From: Jan Kratochvil @ 2015-03-05 15:48 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches, Pedro Alves, Oleg Nesterov

On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
>   bit 0  Dump anonymous private mappings.
>   bit 1  Dump anonymous shared mappings.
>   bit 2  Dump file-backed private mappings.
>   bit 3  Dump file-backed shared mappings.
>   bit 4 (since Linux 2.6.24)
>          Dump ELF headers.
>   bit 5 (since Linux 2.6.28)
>          Dump private huge pages.
>   bit 6 (since Linux 2.6.28)
>          Dump shared huge pages.
[...]
> The default value for this file, used by the Linux kernel, is 0x33,
> which means that bits 0, 1 and 4 are enabled.  This is also the default

and 5

> for GDB implemented in this patch, FWIW.
[...]
> With Oleg's help, we could improve the current algorithm for determining
> whether a memory mapping is anonymous/file-backed, private/shared.  GDB
> now also respects the MADV_DONTDUMP flag and does not dump the memory

s/does not dump/does dump/

> mapping marked as so, and won't try to dump "[vsyscall]" or "[vdso]"
> mappings as before (just like the Linux kernel).

Currently it also tries to dump [vvar] (by default rules) but that is
unreadable for some reason, causing:
warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
                                                                ^^^^^^^^^^^^^^
Saved corefile /tmp/1j
(gdb) _
# grep 7ffff6ceb000 /proc/$p/maps
7ffff6ceb000-7ffff6ced000 r--p 00000000 00:00 0                          [vvar]
^^^^^^^^^^^^                                                              ^^^^

I do not know what [vvar] is good for and why it cannot be read.


>   It is worth mentioning that, from all those checks described above,
>   the most fragile is the one to see if the file name ends with "
>   (deleted)".  This does not necessarily mean that the mapping is
>   anonymous, because the deleted file associated with the mapping may
>   have been a hard link to another file, for example.  The Linux kernel
>   checks to see if "i_nlink == 0", but GDB cannot easily do this check.

# stat /proc/21604/map_files/400000-4ec000 
  File: ‘/proc/21604/map_files/400000-4ec000’ -> ‘/tmp/bash-deleted’
  Size: 64        	Blocks: 0          IO Block: 1024   symbolic link
Device: 3h/3d	Inode: 1554082     Links: 1
# stat -L /proc/21604/map_files/400000-4ec000 
  File: ‘/proc/21604/map_files/400000-4ec000’
  Size: 1051464   	Blocks: 2056       IO Block: 4096   regular file
Device: fd01h/64769d	Inode: 5509691     Links: 1
# rm /tmp/bash-deleted
# stat -L /proc/21604/map_files/400000-4ec000 
  File: ‘/proc/21604/map_files/400000-4ec000’
  Size: 1051464   	Blocks: 2056       IO Block: 4096   regular file
Device: fd01h/64769d	Inode: 5509691     Links: 0
                                                  ^

One could check whether i_nlink == 0, if that were enough.  But it would work
only if GDB runs as root, so it is probably not worth coding:

$ ls -ld /proc/3803/map_files
dr-x------ 2 lace lace 0 Mar  5 16:44 /proc/3803/map_files/
$ stat /proc/3803/map_files/400000-4ec000
stat: cannot stat ‘/proc/3803/map_files/400000-4ec000’: Operation not permitted


Jan


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-05 15:48 ` Jan Kratochvil
@ 2015-03-05 20:53   ` Sergio Durigan Junior
  2015-03-05 20:57     ` Jan Kratochvil
  0 siblings, 1 reply; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-05 20:53 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: GDB Patches, Pedro Alves, Oleg Nesterov

On Thursday, March 05 2015, Jan Kratochvil wrote:

> On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
>>   bit 0  Dump anonymous private mappings.
>>   bit 1  Dump anonymous shared mappings.
>>   bit 2  Dump file-backed private mappings.
>>   bit 3  Dump file-backed shared mappings.
>>   bit 4 (since Linux 2.6.24)
>>          Dump ELF headers.
>>   bit 5 (since Linux 2.6.28)
>>          Dump private huge pages.
>>   bit 6 (since Linux 2.6.28)
>>          Dump shared huge pages.
> [...]
>> The default value for this file, used by the Linux kernel, is 0x33,
>> which means that bits 0, 1 and 4 are enabled.  This is also the default
>
> and 5

And 5.  Thanks.
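
To spell out the arithmetic: 0x33 is binary 0110011, so bits 0, 1, 4 and
5 are set.  A minimal, illustrative sketch of the decoding follows (the
names filter_bit_names, filter_bit_set_p and print_filter are made up
for this example and are not part of the patch):

```c
#include <stdio.h>

/* Bit meanings as listed in core(5).  The table and the function
   names below are illustrative only.  */
static const char *const filter_bit_names[] = {
  "anonymous private mappings",
  "anonymous shared mappings",
  "file-backed private mappings",
  "file-backed shared mappings",
  "ELF headers",
  "private huge pages",
  "shared huge pages",
};

/* Return 1 if bit BIT is set in MASK, 0 otherwise.  */
static int
filter_bit_set_p (unsigned int mask, int bit)
{
  return (mask >> bit) & 1;
}

/* Print, for each of the seven documented bits, whether MASK asks
   for that mapping type to be dumped.  */
static void
print_filter (unsigned int mask)
{
  int bit;

  for (bit = 0; bit < 7; bit++)
    printf ("bit %d (%s): %s\n", bit, filter_bit_names[bit],
	    filter_bit_set_p (mask, bit) ? "dump" : "skip");
}
```

Calling print_filter (0x33) would report "dump" for bits 0, 1, 4 and 5
and "skip" for bits 2, 3 and 6.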

>> for GDB implemented in this patch, FWIW.
> [...]
>> With Oleg's help, we could improve the current algorithm for determining
>> whether a memory mapping is anonymous/file-backed, private/shared.  GDB
>> now also respects the MADV_DONTDUMP flag and does not dump the memory
>
> s/does not dump/does dump/

No, it doesn't dump.  MADV_DONTDUMP activates the "dd" flag in VmFlags,
and the patch looks for it and, if it finds the flag, it doesn't mark
the memory mapping to be dumped.  However, GDB will create the section
header in the corefile.
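
As a rough illustration of the "dd" check, here is a stand-alone sketch
that looks for a flag token in a "VmFlags:" line.  It is not the patch's
decode_vmflags (which tokenizes with strtok_r and fills a struct);
vmflags_has is a hypothetical helper that leaves the input line
untouched:

```c
#include <string.h>

/* Return 1 if FLAG (e.g. "dd") appears as a whole token in LINE, a
   "VmFlags:" line as found in /proc/PID/smaps, and 0 otherwise.
   Hypothetical helper, for illustration only.  */
static int
vmflags_has (const char *line, const char *flag)
{
  const char *p = strchr (line, ':');
  size_t flag_len = strlen (flag);

  if (p == NULL)
    return 0;
  p++;				/* Skip the ':' after "VmFlags".  */

  while (*p != '\0')
    {
      size_t len;

      p += strspn (p, " \t\n");	/* Skip separators.  */
      len = strcspn (p, " \t\n");	/* Length of the next token.  */
      if (len == flag_len && strncmp (p, flag, len) == 0)
	return 1;
      p += len;
    }

  return 0;
}
```

A mapping whose line reads "VmFlags: rd wr mr mw me ac dd" would be
reported as having "dd" set, which is the case where the contents are
not dumped.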

>> mapping marked as so, and won't try to dump "[vsyscall]" or "[vdso]"
>> mappings as before (just like the Linux kernel).
>
> Currently it also tries to dump [vvar] (by default rules) but that is
> unreadable for some reason, causing:
> warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
>                                                                 ^^^^^^^^^^^^^^
> Saved corefile /tmp/1j
> (gdb) _
> # grep 7ffff6ceb000 /proc/$p/maps
> 7ffff6ceb000-7ffff6ced000 r--p 00000000 00:00 0                          [vvar]
> ^^^^^^^^^^^^                                                              ^^^^
>
> I do not know what [vvar] is good for and why it cannot be read.

I totally forgot about this, even though we discussed it before.  Sorry;
I am sending a new version of the patch which addresses this issue.

>>   It is worth mentioning that, from all those checks described above,
>>   the most fragile is the one to see if the file name ends with "
>>   (deleted)".  This does not necessarily mean that the mapping is
>>   anonymous, because the deleted file associated with the mapping may
>>   have been a hard link to another file, for example.  The Linux kernel
>>   checks to see if "i_nlink == 0", but GDB cannot easily do this check.
>
> # stat /proc/21604/map_files/400000-4ec000 
>   File: ‘/proc/21604/map_files/400000-4ec000’ -> ‘/tmp/bash-deleted’
>   Size: 64        	Blocks: 0          IO Block: 1024   symbolic link
> Device: 3h/3d	Inode: 1554082     Links: 1
> # stat -L /proc/21604/map_files/400000-4ec000 
>   File: ‘/proc/21604/map_files/400000-4ec000’
>   Size: 1051464   	Blocks: 2056       IO Block: 4096   regular file
> Device: fd01h/64769d	Inode: 5509691     Links: 1
> # rm /tmp/bash-deleted
> # stat -L /proc/21604/map_files/400000-4ec000 
>   File: ‘/proc/21604/map_files/400000-4ec000’
>   Size: 1051464   	Blocks: 2056       IO Block: 4096   regular file
> Device: fd01h/64769d	Inode: 5509691     Links: 0
>                                                   ^
>
> One could check whether i_nlink == 0, if that were enough.  But it would work
> only if GDB runs as root, so it is probably not worth coding:
>
> $ ls -ld /proc/3803/map_files
> dr-x------ 2 lace lace 0 Mar  5 16:44 /proc/3803/map_files/
> $ stat /proc/3803/map_files/400000-4ec000
> stat: cannot stat ‘/proc/3803/map_files/400000-4ec000’: Operation not permitted

Yeah, but it would still be much easier if this information were present
in the smaps file directly.
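
For reference, the i_nlink check Jan describes could look roughly like
the sketch below, assuming the caller is allowed to read
/proc/PID/map_files (in Jan's test above that required root).
mapping_file_unlinked_p is a hypothetical helper, not something in the
patch:

```c
#include <sys/types.h>
#include <sys/stat.h>

/* MAP_FILES_PATH is an entry such as
   "/proc/PID/map_files/400000-4ec000".  stat follows the symbolic
   link, like "stat -L" does.

   Return 1 if the mapped file has no remaining links, 0 if it still
   has at least one, and -1 if stat failed (for example because the
   caller lacks permission).  Hypothetical helper, illustration
   only.  */
static int
mapping_file_unlinked_p (const char *map_files_path)
{
  struct stat st;

  if (stat (map_files_path, &st) != 0)
    return -1;

  return st.st_nlink == 0;
}
```

A caller would have to treat -1 as "unknown" and fall back to the
" (deleted)" heuristic.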

Here's the updated patch that filters out the [vvar] mapping.

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

gdb/ChangeLog:
2015-03-04  Sergio Durigan Junior  <sergiodj@redhat.com>
	    Jan Kratochvil  <jan.kratochvil@redhat.com>
	    Oleg Nesterov  <oleg@redhat.com>

	PR corefiles/16092
	* common/common-defs.h (enum memory_mapping_state): New enum.
	* defs.h (find_memory_region_ftype): Remove 'int modified'
	parameter, replacing by 'enum memory_mapping_state state'.
	* gcore.c (gcore_create_callback): Likewise.  Change 'if/else'
	statements and improve the logic of deciding when to ignore a
	memory mapping.
	(objfile_find_memory_regions): Passing
	'MEMORY_MAPPING_UNKNOWN_STATE' or 'MEMORY_MAPPING_MODIFIED' when
	needed to 'func' callback, instead of saying the memory mapping
	was modified even without knowing it.
	* gnu-nat.c (gnu_find_memory_regions): Likewise.
	* linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
	New enum identifying the various options of the coredump_filter
	file.
	(struct smaps_vmflags): New struct.
	(use_coredump_filter): New variable.
	(decode_vmflags): New function.
	(mapping_is_anonymous_p): Likewise.
	(dump_mapping_p): Likewise.
	(linux_find_memory_region_ftype): Remove 'int modified' parameter,
	replacing by 'enum memory_mapping_state state'.
	(linux_find_memory_regions_full): New variables
	'coredumpfilter_name', 'coredumpfilterdata', 'pid',
	'filterflags'.  Read /proc/<PID>/smaps file; improve parsing of
	its information.  Implement memory mapping filtering based on its
	contents.
	(linux_find_memory_regions_thunk): Remove 'int modified'
	parameter, replacing by 'enum memory_mapping_state state'.
	(linux_make_mappings_callback): Likewise.
	(find_mapping_size): Likewise.
	(show_use_coredump_filter): New function.
	(_initialize_linux_tdep): New command 'set use-coredump-filter'.
	* procfs.c (find_memory_regions_callback): Passing
	'MEMORY_MAPPING_UNKNOWN_STATE' when needed to 'func' callback,
	instead of saying the memory mapping was modified even without
	knowing it.

gdb/doc/ChangeLog:
2015-03-04  Sergio Durigan Junior  <sergiodj@redhat.com>

	PR corefiles/16092
	* gdb.texinfo (gcore): Mention new command 'set
	use-coredump-filter'.
	(set use-coredump-filter): Document new command.

gdb/testsuite/ChangeLog:
2015-03-04  Sergio Durigan Junior  <sergiodj@redhat.com>

	PR corefiles/16092
	* gdb.base/coredump-filter.c: New file.
	* gdb.base/coredump-filter.exp: Likewise.

diff --git a/gdb/common/common-defs.h b/gdb/common/common-defs.h
index 62d9de5..01b05f5 100644
--- a/gdb/common/common-defs.h
+++ b/gdb/common/common-defs.h
@@ -60,4 +60,14 @@
 # define EXTERN_C_POP
 #endif
 
+/* Enum used to inform the state of a memory mapping.  This is used in
+   functions implementing find_memory_region_ftype.  */
+
+enum memory_mapping_state
+  {
+    MEMORY_MAPPING_MODIFIED,
+    MEMORY_MAPPING_UNMODIFIED,
+    MEMORY_MAPPING_UNKNOWN_STATE,
+  };
+
 #endif /* COMMON_DEFS_H */
diff --git a/gdb/defs.h b/gdb/defs.h
index 72512f6..4829b62 100644
--- a/gdb/defs.h
+++ b/gdb/defs.h
@@ -338,7 +338,8 @@ extern void init_source_path (void);
 
 typedef int (*find_memory_region_ftype) (CORE_ADDR addr, unsigned long size,
 					 int read, int write, int exec,
-					 int modified, void *data);
+					 enum memory_mapping_state state,
+					 void *data);
 
 /* * Possible lvalue types.  Like enum language, this should be in
    value.h, but needs to be here for the same reason.  */
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 4b76ce9..e575eae 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -10952,6 +10952,67 @@ specified, the file name defaults to @file{core.@var{pid}}, where
 
 Note that this command is implemented only for some systems (as of
 this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
+
+On @sc{gnu}/Linux, this command can take into account the value of the
+file @file{/proc/@var{pid}/coredump_filter} when generating the core
+dump (@pxref{set use-coredump-filter}).
+
+@kindex set use-coredump-filter
+@anchor{set use-coredump-filter}
+@item set use-coredump-filter on
+@itemx set use-coredump-filter off
+Enable or disable the use of the file
+@file{/proc/@var{pid}/coredump_filter} when generating core dump
+files.  This file is used by the Linux kernel to decide what types of
+memory mappings will be dumped or ignored when generating a core dump
+file.
+
+To make use of this feature, you have to write in the
+@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
+which is a bit mask representing the memory mapping types.  If a bit
+is set in the bit mask, then the memory mappings of the corresponding
+types will be dumped; otherwise, they will be ignored.  The bits in
+this bit mask have the following meanings:
+
+@table @code
+@item bit 0
+Dump anonymous private mappings.
+@item bit 1
+Dump anonymous shared mappings.
+@item bit 2
+Dump file-backed private mappings.
+@item bit 3
+Dump file-backed shared mappings.
+@item bit 4
+(since Linux 2.6.24)
+Dump ELF headers. (@value{GDBN} does not take this bit into account)
+@item bit 5
+(since Linux 2.6.28)
+Dump private huge pages.
+@item bit 6
+(since Linux 2.6.28)
+Dump shared huge pages.
+@end table
+
+For example, supposing that the @code{pid} of the program being
+debugged is @code{1234}, if you wanted to dump everything except the
+anonymous private and the file-backed shared mappings, you would do:
+
+@smallexample
+$ echo 0x76 > /proc/1234/coredump_filter
+@end smallexample
+
+For more documentation about how to use the @file{coredump_filter}
+file, see the manpage of @code{proc(5)}.
+
+By default, this option is @code{on}.  If this option is turned
+@code{off}, @value{GDBN} will not read the @file{coredump_filter}
+file, but it uses the same default value as the Linux kernel in order
+to decide which pages will be dumped in the core dump file.  This
+value currently is @code{0x33}, which means that the bits @code{0}
+(anonymous private mappings), @code{1} (anonymous shared mappings),
+@code{4} (ELF headers) and @code{5} (private huge pages) are active.
+This will cause these memory mappings to be dumped automatically.
 @end table
 
 @node Character Sets
diff --git a/gdb/gcore.c b/gdb/gcore.c
index 1ebff2a..9edcf40 100644
--- a/gdb/gcore.c
+++ b/gdb/gcore.c
@@ -408,27 +408,22 @@ make_output_phdrs (bfd *obfd, asection *osec, void *ignored)
 
 static int
 gcore_create_callback (CORE_ADDR vaddr, unsigned long size, int read,
-		       int write, int exec, int modified, void *data)
+		       int write, int exec, enum memory_mapping_state state,
+		       void *data)
 {
   bfd *obfd = data;
   asection *osec;
   flagword flags = SEC_ALLOC | SEC_HAS_CONTENTS | SEC_LOAD;
 
-  /* If the memory segment has no permissions set, ignore it, otherwise
-     when we later try to access it for read/write, we'll get an error
-     or jam the kernel.  */
-  if (read == 0 && write == 0 && exec == 0 && modified == 0)
-    {
-      if (info_verbose)
-        {
-          fprintf_filtered (gdb_stdout, "Ignore segment, %s bytes at %s\n",
-                            plongest (size), paddress (target_gdbarch (), vaddr));
-        }
-
-      return 0;
-    }
-
-  if (write == 0 && modified == 0 && !solib_keep_data_in_core (vaddr, size))
+  /* If the memory segment has no read permission set, or if it has
+     been marked as unmodified, then we have to generate a segment
+     header for it, but without contents (i.e., FileSiz = 0),
+     otherwise when we later try to access it for read/write, we'll
+     get an error or jam the kernel.  */
+  if (read == 0 || state == MEMORY_MAPPING_UNMODIFIED)
+    flags &= ~(SEC_LOAD | SEC_HAS_CONTENTS);
+  else if (write == 0 && state == MEMORY_MAPPING_UNKNOWN_STATE
+	   && !solib_keep_data_in_core (vaddr, size))
     {
       /* See if this region of memory lies inside a known file on disk.
 	 If so, we can avoid copying its contents by clearing SEC_LOAD.  */
@@ -521,7 +516,8 @@ objfile_find_memory_regions (struct target_ops *self,
 			 1, /* All sections will be readable.  */
 			 (flags & SEC_READONLY) == 0, /* Writable.  */
 			 (flags & SEC_CODE) != 0, /* Executable.  */
-			 1, /* MODIFIED is unknown, pass it as true.  */
+			 MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is
+							 unknown.  */
 			 obfd);
 	  if (ret != 0)
 	    return ret;
@@ -534,7 +530,7 @@ objfile_find_memory_regions (struct target_ops *self,
 	     1, /* Stack section will be readable.  */
 	     1, /* Stack section will be writable.  */
 	     0, /* Stack section will not be executable.  */
-	     1, /* Stack section will be modified.  */
+	     MEMORY_MAPPING_MODIFIED, /* Stack section will be modified.  */
 	     obfd);
 
   /* Make a heap segment.  */
@@ -543,7 +539,7 @@ objfile_find_memory_regions (struct target_ops *self,
 	     1, /* Heap section will be readable.  */
 	     1, /* Heap section will be writable.  */
 	     0, /* Heap section will not be executable.  */
-	     1, /* Heap section will be modified.  */
+	     MEMORY_MAPPING_MODIFIED, /* Heap section will be modified.  */
 	     obfd);
 
   return 0;
diff --git a/gdb/gnu-nat.c b/gdb/gnu-nat.c
index d830773..60612a7 100644
--- a/gdb/gnu-nat.c
+++ b/gdb/gnu-nat.c
@@ -2611,7 +2611,7 @@ gnu_find_memory_regions (struct target_ops *self,
 		     last_protection & VM_PROT_READ,
 		     last_protection & VM_PROT_WRITE,
 		     last_protection & VM_PROT_EXECUTE,
-		     1, /* MODIFIED is unknown, pass it as true.  */
+		     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 		     data);
 	  last_region_address = region_address;
 	  last_region_end = region_address += region_length;
@@ -2625,7 +2625,7 @@ gnu_find_memory_regions (struct target_ops *self,
 	     last_protection & VM_PROT_READ,
 	     last_protection & VM_PROT_WRITE,
 	     last_protection & VM_PROT_EXECUTE,
-	     1, /* MODIFIED is unknown, pass it as true.  */
+	     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 	     data);
 
   return 0;
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index d9884f3..6295a04 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -35,9 +35,58 @@
 #include "observer.h"
 #include "objfiles.h"
 #include "infcall.h"
+#include "gdbcmd.h"
+#include "gdb_regex.h"
 
 #include <ctype.h>
 
+/* This enum represents the values that the user can choose when
+   informing the Linux kernel about which memory mappings will be
+   dumped in a corefile.  They are described in the file
+   Documentation/filesystems/proc.txt, inside the Linux kernel
+   tree.  */
+
+enum
+  {
+    COREFILTER_ANON_PRIVATE = 1 << 0,
+    COREFILTER_ANON_SHARED = 1 << 1,
+    COREFILTER_MAPPED_PRIVATE = 1 << 2,
+    COREFILTER_MAPPED_SHARED = 1 << 3,
+    COREFILTER_ELF_HEADERS = 1 << 4,
+    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
+    COREFILTER_HUGETLB_SHARED = 1 << 6,
+  };
+
+struct smaps_vmflags
+  {
+    /* Zero if this structure has not been initialized yet.  It
+       probably means that the Linux kernel being used does not emit
+       the "VmFlags:" field on "/proc/PID/smaps".  */
+
+    unsigned int initialized_p : 1;
+
+    /* Memory mapped I/O area (VM_IO, "io").  */
+
+    unsigned int io_page : 1;
+
+    /* Area uses huge TLB pages (VM_HUGETLB, "ht").  */
+
+    unsigned int uses_huge_tlb : 1;
+
+    /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd").  */
+
+    unsigned int exclude_coredump : 1;
+
+    /* Is this a MAP_SHARED mapping (VM_SHARED, "sh").  */
+
+    unsigned int shared_mapping : 1;
+  };
+
+/* Whether to take the /proc/PID/coredump_filter into account when
+   generating a corefile.  */
+
+static int use_coredump_filter = 1;
+
 /* This enum represents the signals' numbers on a generic architecture
    running the Linux kernel.  The definition of "generic" comes from
    the file <include/uapi/asm-generic/signal.h>, from the Linux kernel
@@ -381,6 +430,164 @@ read_mapping (const char *line,
   *filename = p;
 }
 
+/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
+
+   This function was based on the documentation found on
+   <Documentation/filesystems/proc.txt>, on the Linux kernel.
+
+   Linux kernels before commit
+   834f82e2aa9a8ede94b17b656329f850c1471514 do not have this field on
+   smaps.  */
+
+static void
+decode_vmflags (char *p, struct smaps_vmflags *v)
+{
+  char *saveptr;
+  char *s;
+
+  v->initialized_p = 1;
+  p = skip_to_space (p);
+  p = skip_spaces (p);
+
+  for (s = strtok_r (p, " ", &saveptr);
+       s != NULL;
+       s = strtok_r (NULL, " ", &saveptr))
+    {
+      if (strcmp (s, "io") == 0)
+	v->io_page = 1;
+      else if (strcmp (s, "ht") == 0)
+	v->uses_huge_tlb = 1;
+      else if (strcmp (s, "dd") == 0)
+	v->exclude_coredump = 1;
+      else if (strcmp (s, "sh") == 0)
+	v->shared_mapping = 1;
+    }
+}
+
+/* Return 1 if the memory mapping is anonymous, 0 otherwise.
+
+   FILENAME is the name of the file present in the first line of the
+   memory mapping, in the "/proc/PID/smaps" output.  For example, if
+   the first line is:
+
+   7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770   /path/to/file
+
+   Then FILENAME will be "/path/to/file".  */
+
+static int
+mapping_is_anonymous_p (const char *filename)
+{
+  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
+  static int init_regex_p = 0;
+
+  if (!init_regex_p)
+    {
+      struct cleanup *c = make_cleanup (null_cleanup, NULL);
+
+      init_regex_p = 1;
+      compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match /dev/zero "
+			     "filename"));
+      compile_rx_or_error (&shmem_file_regex,
+			   "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match shmem "
+			     "filenames"));
+      /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the
+	 Linux kernel's 'i_nlink == 0' code, which is responsible for
+	 deciding if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS'
+	 mapping.  In other words, if FILE_DELETED_REGEX matches, it
+	 does not necessarily mean that we are dealing with an
+	 anonymous shared mapping.  However, there is no easy way to
+	 detect this currently, so this is the best approximation we
+	 have.
+
+	 As a result, GDB will dump readonly pages of deleted
+	 executables when using the default value of coredump_filter
+	 (0x33), while the Linux kernel will not dump those pages.
+	 But we can live with that.  */
+      compile_rx_or_error (&file_deleted_regex, " (deleted)$",
+			   _("Could not compile regex to match "
+			     "'<file> (deleted)'"));
+      /* We will never release these regexes, so just discard the
+	 cleanups.  */
+      discard_cleanups (c);
+    }
+
+  if (*filename == '\0'
+      || regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
+      || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
+      || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0)
+    return 1;
+
+  return 0;
+}
+
+/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
+   MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or
+   greater than 0 if it should.  */
+
+static int
+dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
+		int maybe_private_p, int mapping_anon_p, const char *filename)
+{
+  /* Initially, we trust in what we received from outside.  This value
+     may not be very precise (i.e., it was probably gathered from the
+     permission line in the /proc/PID/smaps list, which actually
+     refers to VM_MAYSHARE, and not VM_SHARED), but it is what we have
+     for now.  */
+  int private_p = maybe_private_p;
+
+  /* We always dump vDSO and vsyscall mappings.  */
+  if (strcmp ("[vdso]", filename) == 0
+      || strcmp ("[vsyscall]", filename) == 0)
+    return 1;
+
+  /* The [vvar] memory mapping cannot be read, so we just ignore it
+     and don't dump its contents.  */
+  if (strcmp ("[vvar]", filename) == 0)
+    return 0;
+
+  if (v->initialized_p)
+    {
+      /* We never dump I/O mappings.  */
+      if (v->io_page)
+	return 0;
+
+      /* Check if we should exclude this mapping.  */
+      if (v->exclude_coredump)
+	return 0;
+
+      /* Updating our notion of whether this mapping is shared or
+	 private based on a trustworthy value.  */
+      private_p = !v->shared_mapping;
+
+      /* HugeTLB checking.  */
+      if (v->uses_huge_tlb)
+	{
+	  if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE))
+	      || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED)))
+	    return 1;
+
+	  return 0;
+	}
+    }
+
+  if (private_p)
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_PRIVATE) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_PRIVATE) != 0;
+    }
+  else
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_SHARED) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_SHARED) != 0;
+    }
+}
+
 /* Implement the "info proc" command.  */
 
 static void
@@ -807,7 +1014,8 @@ linux_core_info_proc (struct gdbarch *gdbarch, const char *args,
 typedef int linux_find_memory_region_ftype (ULONGEST vaddr, ULONGEST size,
 					    ULONGEST offset, ULONGEST inode,
 					    int read, int write,
-					    int exec, int modified,
+					    int exec,
+					    enum memory_mapping_state state,
 					    const char *filename,
 					    void *data);
 
@@ -819,48 +1027,84 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 				void *obfd)
 {
   char mapsfilename[100];
-  char *data;
+  char coredumpfilter_name[100];
+  char *data, *coredumpfilterdata;
+  pid_t pid;
+  /* Default dump behavior of coredump_filter (0x33), according to
+     Documentation/filesystems/proc.txt from the Linux kernel
+     tree.  */
+  unsigned int filterflags = (COREFILTER_ANON_PRIVATE
+			      | COREFILTER_ANON_SHARED
+			      | COREFILTER_ELF_HEADERS
+			      | COREFILTER_HUGETLB_PRIVATE);
 
   /* We need to know the real target PID to access /proc.  */
   if (current_inferior ()->fake_pid_p)
     return 1;
 
-  xsnprintf (mapsfilename, sizeof mapsfilename,
-	     "/proc/%d/smaps", current_inferior ()->pid);
+  pid = current_inferior ()->pid;
+
+  if (use_coredump_filter)
+    {
+      xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name),
+		 "/proc/%d/coredump_filter", pid);
+      coredumpfilterdata = target_fileio_read_stralloc (coredumpfilter_name);
+      if (coredumpfilterdata != NULL)
+	{
+	  sscanf (coredumpfilterdata, "%x", &filterflags);
+	  xfree (coredumpfilterdata);
+	}
+    }
+
+  xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid);
   data = target_fileio_read_stralloc (mapsfilename);
   if (data == NULL)
     {
       /* Older Linux kernels did not support /proc/PID/smaps.  */
-      xsnprintf (mapsfilename, sizeof mapsfilename,
-		 "/proc/%d/maps", current_inferior ()->pid);
+      xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid);
       data = target_fileio_read_stralloc (mapsfilename);
     }
-  if (data)
+
+  if (data != NULL)
     {
       struct cleanup *cleanup = make_cleanup (xfree, data);
-      char *line;
+      char *line, *t;
 
-      line = strtok (data, "\n");
-      while (line)
+      line = strtok_r (data, "\n", &t);
+      while (line != NULL)
 	{
 	  ULONGEST addr, endaddr, offset, inode;
 	  const char *permissions, *device, *filename;
+	  struct smaps_vmflags v;
 	  size_t permissions_len, device_len;
-	  int read, write, exec;
-	  int modified = 0, has_anonymous = 0;
+	  int read, write, exec, private;
+	  enum memory_mapping_state state;
+	  int has_anonymous = 0;
+	  int mapping_anon_p;
 
+	  memset (&v, 0, sizeof (v));
 	  read_mapping (line, &addr, &endaddr, &permissions, &permissions_len,
 			&offset, &device, &device_len, &inode, &filename);
+	  mapping_anon_p = mapping_is_anonymous_p (filename);
 
 	  /* Decode permissions.  */
 	  read = (memchr (permissions, 'r', permissions_len) != 0);
 	  write = (memchr (permissions, 'w', permissions_len) != 0);
 	  exec = (memchr (permissions, 'x', permissions_len) != 0);
+	  /* 'private' here actually means VM_MAYSHARE, and not
+	     VM_SHARED.  In order to know if a mapping is really
+	     private or not, we must check the flag "sh" in the
+	     VmFlags field.  This is done by decode_vmflags.  However,
+	     if we are using an old Linux kernel, we will not have the
+	     VmFlags there.  In this case, there is really no way to
+	     know if we are dealing with VM_SHARED, so we just assume
+	     that VM_MAYSHARE is enough.  */
+	  private = memchr (permissions, 'p', permissions_len) != 0;
 
 	  /* Try to detect if region was modified by parsing smaps counters.  */
-	  for (line = strtok (NULL, "\n");
-	       line && line[0] >= 'A' && line[0] <= 'Z';
-	       line = strtok (NULL, "\n"))
+	  for (line = strtok_r (NULL, "\n", &t);
+	       line != NULL && line[0] >= 'A' && line[0] <= 'Z';
+	       line = strtok_r (NULL, "\n", &t))
 	    {
 	      char keyword[64 + 1];
 
@@ -869,11 +1113,17 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 		  warning (_("Error parsing {s,}maps file '%s'"), mapsfilename);
 		  break;
 		}
+
 	      if (strcmp (keyword, "Anonymous:") == 0)
-		has_anonymous = 1;
-	      if (strcmp (keyword, "Shared_Dirty:") == 0
-		  || strcmp (keyword, "Private_Dirty:") == 0
-		  || strcmp (keyword, "Swap:") == 0
+		{
+		  /* Older Linux kernels did not support the
+		     "Anonymous:" counter.  Check it here.  */
+		  has_anonymous = 1;
+		}
+	      else if (strcmp (keyword, "VmFlags:") == 0)
+		decode_vmflags (line, &v);
+
+	      if (strcmp (keyword, "AnonHugePages:") == 0
 		  || strcmp (keyword, "Anonymous:") == 0)
 		{
 		  unsigned long number;
@@ -884,19 +1134,43 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 			       mapsfilename);
 		      break;
 		    }
-		  if (number != 0)
-		    modified = 1;
+		  if (number > 0)
+		    {
+		      /* Even if we are dealing with a file-backed
+			 mapping, if it contains anonymous pages we
+			 consider it to be an anonymous mapping,
+			 because this is what the Linux kernel does:
+
+			 // Dump segments that have been written to.
+			 if (vma->anon_vma && FILTER(ANON_PRIVATE))
+			 	goto whole;
+		      */
+		      mapping_anon_p = 1;
+		    }
 		}
 	    }
 
-	  /* Older Linux kernels did not support the "Anonymous:" counter.
-	     If it is missing, we can't be sure - dump all the pages.  */
-	  if (!has_anonymous)
-	    modified = 1;
+	  /* If a mapping should not be dumped we still should create
+	     a segment for it, just without SEC_LOAD (see
+	     gcore_create_callback).  */
+	  if (has_anonymous)
+	    {
+	      if (dump_mapping_p (filterflags, &v, private, mapping_anon_p,
+				  filename))
+		state = MEMORY_MAPPING_MODIFIED;
+	      else
+		state = MEMORY_MAPPING_UNMODIFIED;
+	    }
+	  else
+	    {
+	      /* Older Linux kernels did not support the "Anonymous:" counter.
+		 If it is missing, we can't be sure - dump all the pages.  */
+	      state = MEMORY_MAPPING_UNKNOWN_STATE;
+	    }
 
 	  /* Invoke the callback function to create the corefile segment.  */
 	  func (addr, endaddr - addr, offset, inode,
-		read, write, exec, modified, filename, obfd);
+		read, write, exec, state, filename, obfd);
 	}
 
       do_cleanups (cleanup);
@@ -926,12 +1200,13 @@ struct linux_find_memory_regions_data
 static int
 linux_find_memory_regions_thunk (ULONGEST vaddr, ULONGEST size,
 				 ULONGEST offset, ULONGEST inode,
-				 int read, int write, int exec, int modified,
+				 int read, int write, int exec,
+				 enum memory_mapping_state state,
 				 const char *filename, void *arg)
 {
   struct linux_find_memory_regions_data *data = arg;
 
-  return data->func (vaddr, size, read, write, exec, modified, data->obfd);
+  return data->func (vaddr, size, read, write, exec, state, data->obfd);
 }
 
 /* A variant of linux_find_memory_regions_full that is suitable as the
@@ -1074,7 +1349,8 @@ static linux_find_memory_region_ftype linux_make_mappings_callback;
 static int
 linux_make_mappings_callback (ULONGEST vaddr, ULONGEST size,
 			      ULONGEST offset, ULONGEST inode,
-			      int read, int write, int exec, int modified,
+			      int read, int write, int exec,
+			      enum memory_mapping_state state,
 			      const char *filename, void *data)
 {
   struct linux_make_mappings_data *map_data = data;
@@ -1869,7 +2145,8 @@ linux_gdb_signal_to_target (struct gdbarch *gdbarch,
 
 static int
 find_mapping_size (CORE_ADDR vaddr, unsigned long size,
-		   int read, int write, int exec, int modified,
+		   int read, int write, int exec,
+		   enum memory_mapping_state state,
 		   void *data)
 {
   struct mem_range *range = data;
@@ -1969,6 +2246,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot)
   return retval;
 }
 
+/* Display whether the gcore command is using the
+   /proc/PID/coredump_filter file.  */
+
+static void
+show_use_coredump_filter (struct ui_file *file, int from_tty,
+			  struct cmd_list_element *c, const char *value)
+{
+  fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to generate"
+			    " corefiles is %s.\n"), value);
+}
+
 /* To be called from the various GDB_OSABI_LINUX handlers for the
    various GNU/Linux architectures and machine types.  */
 
@@ -2005,4 +2293,16 @@ _initialize_linux_tdep (void)
   /* Observers used to invalidate the cache when needed.  */
   observer_attach_inferior_exit (invalidate_linux_cache_inf);
   observer_attach_inferior_appeared (invalidate_linux_cache_inf);
+
+  add_setshow_boolean_cmd ("use-coredump-filter", class_files,
+			   &use_coredump_filter, _("\
+Set whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Show whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Use this command to set whether gcore should consider the contents\n\
+of /proc/PID/coredump_filter when generating the corefile.  For more information\n\
+about this file, refer to the manpage of core(5)."),
+			   NULL, show_use_coredump_filter,
+			   &setlist, &showlist);
 }
diff --git a/gdb/procfs.c b/gdb/procfs.c
index b62539f..d074dd3 100644
--- a/gdb/procfs.c
+++ b/gdb/procfs.c
@@ -4967,7 +4967,7 @@ find_memory_regions_callback (struct prmap *map,
 		  (map->pr_mflags & MA_READ) != 0,
 		  (map->pr_mflags & MA_WRITE) != 0,
 		  (map->pr_mflags & MA_EXEC) != 0,
-		  1, /* MODIFIED is unknown, pass it as true.  */
+		  MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 		  data);
 }
 
diff --git a/gdb/testsuite/gdb.base/coredump-filter.c b/gdb/testsuite/gdb.base/coredump-filter.c
new file mode 100644
index 0000000..192c469
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.c
@@ -0,0 +1,61 @@
+/* Copyright 2015 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <assert.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+
+static void *
+do_mmap (void *addr, size_t size, int prot, int flags, int fd, off_t offset)
+{
+  void *ret = mmap (addr, size, prot, flags, fd, offset);
+
+  assert (ret != MAP_FAILED);
+  return ret;
+}
+
+int
+main (int argc, char *argv[])
+{
+  const size_t size = 10;
+  const int default_prot = PROT_READ | PROT_WRITE;
+  char *private_anon, *shared_anon;
+  char *dont_dump;
+  int i;
+
+  private_anon = do_mmap (NULL, size, default_prot,
+			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (private_anon, 0x11, size);
+
+  shared_anon = do_mmap (NULL, size, default_prot,
+			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+  memset (shared_anon, 0x22, size);
+
+  dont_dump = do_mmap (NULL, size, default_prot,
+		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (dont_dump, 0x55, size);
+  i = madvise (dont_dump, size, MADV_DONTDUMP);
+  if (i != 0)
+    assert_perror (errno);
+
+  return 0; /* break-here */
+}
diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
new file mode 100644
index 0000000..c7ae91d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.exp
@@ -0,0 +1,129 @@
+# Copyright 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile debug] } {
+    untested $testfile.exp
+    return -1
+}
+
+if { ![runto_main] } {
+    untested $testfile.exp
+    return -1
+}
+
+gdb_breakpoint [gdb_get_line_number "break-here"]
+gdb_continue_to_breakpoint "break-here" ".* break-here .*"
+
+proc do_save_core { filter_flag core ipid } {
+    verbose -log "writing $filter_flag to /proc/$ipid/coredump_filter"
+    if { [catch {open /proc/$ipid/coredump_filter w} fileid] } {
+	untested $::testfile.exp
+	return -1
+    }
+
+    # Set coredump_filter to the value we want
+    puts $fileid $filter_flag
+    close $fileid
+
+    # Generate a corefile
+    gdb_gcore_cmd "$core" "save corefile $core"
+}
+
+proc do_load_and_test_core { core var working_var working_value } {
+    global hex decimal addr
+
+    set core_loaded [gdb_core_cmd "$core" "load $core"]
+    if { $core_loaded == -1 } {
+	fail "loading $core"
+	return
+    }
+
+    # Use 'int' as any variants of 'char' try to read the target bytes.
+    gdb_test "print *(unsigned int *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
+	"printing $var when core is loaded (should not work)"
+    gdb_test "print/x *(unsigned int *) $addr($working_var)" " = $working_value.*" \
+	"print/x *$working_var ( = $working_value)"
+}
+
+set non_private_anon_core [standard_output_file non-private-anon.gcore]
+set non_shared_anon_core [standard_output_file non-shared-anon.gcore]
+set dont_dump_core [standard_output_file dont-dump.gcore]
+
+# We will generate a few corefiles
+#
+# This list is composed of sub-lists, and their elements are (in
+# order):
+#
+# - name of the test
+# - hexadecimal value to be put in the /proc/PID/coredump_filter file
+# - name of the variable that contains the name of the corefile to be
+#   generated (including the initial $).
+# - name of the variable in the C source code that points to the
+#   memory mapping that will NOT be present in the corefile.
+# - name of a variable in the C source code that points to a memory
+#   mapping that WILL be present in the corefile
+# - corresponding value expected for the above variable
+
+set all_corefiles { { "non-Private-Anonymous" "0x7e" \
+			  $non_private_anon_core \
+			  "private_anon" \
+			  "shared_anon" "0x22" }
+    { "non-Shared-Anonymous" "0x7d" \
+	  $non_shared_anon_core "shared_anon" \
+	  "private_anon" "0x11" }
+    { "DoNotDump" "0x33" \
+	  $dont_dump_core "dont_dump" \
+	  "shared_anon" "0x22" } }
+
+set core_supported [gdb_gcore_cmd "$non_private_anon_core" "save a corefile"]
+if { !$core_supported } {
+    untested $testfile.exp
+    return -1
+}
+
+# Getting the inferior's PID
+gdb_test_multiple "info inferiors" "getting inferior pid" {
+    -re "process \($decimal\).*\r\n$gdb_prompt $" {
+	set infpid $expect_out(1,string)
+    }
+}
+
+foreach item $all_corefiles {
+    foreach name [list [lindex $item 3] [lindex $item 4]] {
+	set test "print/x $name"
+	gdb_test_multiple $test $test {
+	    -re " = \($hex\)\r\n$gdb_prompt $" {
+		set addr($name) $expect_out(1,string)
+	    }
+	}
+    }
+}
+
+foreach item $all_corefiles {
+    with_test_prefix "saving corefile for [lindex $item 0]" {
+	do_save_core [lindex $item 1] [subst [lindex $item 2]] $infpid
+    }
+}
+
+clean_restart $testfile
+
+foreach item $all_corefiles {
+    with_test_prefix "loading and testing corefile for [lindex $item 0]" {
+	do_load_and_test_core [subst [lindex $item 2]] [lindex $item 3] \
+	    [lindex $item 4] [lindex $item 5]
+    }
+}
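
The filter decision made by the patch can be modelled outside GDB.  The
following is only a minimal sketch, assuming the bit layout from core(5);
the names parse_filter and filter_allows are hypothetical and do not
appear in the patch.  Unlike the patch, the sketch checks the sscanf
return value and keeps a fallback when the text cannot be parsed.  The
HugeTLB, "dd" and "io" checks are left out.

```c
#include <assert.h>
#include <stdio.h>

/* Bit values as listed in core(5).  The names echo the patch's
   COREFILTER_* constants, but this enum is a stand-alone illustration.  */
enum filter_bit
{
  FILTER_ANON_PRIVATE = 1 << 0,
  FILTER_ANON_SHARED = 1 << 1,
  FILTER_MAPPED_PRIVATE = 1 << 2,
  FILTER_MAPPED_SHARED = 1 << 3,
  FILTER_ELF_HEADERS = 1 << 4,
  FILTER_HUGETLB_PRIVATE = 1 << 5,
  FILTER_HUGETLB_SHARED = 1 << 6
};

/* Parse the hexadecimal text read from /proc/PID/coredump_filter.
   Return FALLBACK when TEXT does not start with a hex number.
   (Hypothetical helper, not part of the patch.)  */
static unsigned int
parse_filter (const char *text, unsigned int fallback)
{
  unsigned int flags;

  assert (text != NULL);
  if (sscanf (text, "%x", &flags) != 1)
    return fallback;
  return flags;
}

/* Return 1 if FLAGS selects a mapping with the given properties for
   dumping, 0 otherwise.  Only the four basic private/shared and
   anonymous/file-backed bits are considered here.  */
static int
filter_allows (unsigned int flags, int private_p, int anon_p)
{
  if (private_p)
    return (flags & (anon_p ? FILTER_ANON_PRIVATE
			    : FILTER_MAPPED_PRIVATE)) != 0;

  return (flags & (anon_p ? FILTER_ANON_SHARED
			  : FILTER_MAPPED_SHARED)) != 0;
}
```

With the values used in coredump-filter.exp, 0x7e clears only bit 0, so
filter_allows (0x7e, 1, 1) is 0 while filter_allows (0x7e, 0, 1) is 1;
0x7d clears only bit 1 and gives the opposite result.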


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-05 20:53   ` Sergio Durigan Junior
@ 2015-03-05 20:57     ` Jan Kratochvil
  2015-03-11 20:02       ` Oleg Nesterov
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Kratochvil @ 2015-03-05 20:57 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches, Pedro Alves, Oleg Nesterov

On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
> On Thursday, March 05 2015, Jan Kratochvil wrote:
> > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
> >> With Oleg's help, we could improve the current algorithm for determining
> >> whether a memory mapping is anonymous/file-backed, private/shared.  GDB
> >> now also respects the MADV_DONTDUMP flag and does not dump the memory
> >
> > s/does not dump/does dump/
> 
> No, it doesn't dump.  MADV_DONTDUMP activates the "dd" flag in VmFlags,
> and the patch looks for it and, if it finds the flag, it doesn't mark
> the memory mapping to be dumped.  However, GDB will create the section
> header in the corefile.

Sorry, I messed it up even more.  For MADV_DONTDUMP you are right: FSF GDB
dumps MADV_DONTDUMP memory, the kernel does not, and with this patch GDB will not.

What I wanted to say was:


> >> mapping marked as so, and won't try to dump "[vsyscall]" or "[vdso]"

s/won't try/will try/

this one.


> >> mappings as before (just like the Linux kernel).
> >
> > Currently it also tries to dump [vvar] (by default rules) but that is
> > unreadable for some reason, causing:
> > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
> >                                                                 ^^^^^^^^^^^^^^
> > Saved corefile /tmp/1j
> > (gdb) _
> > # grep 7ffff6ceb000 /proc/$p/maps
> > 7ffff6ceb000-7ffff6ced000 r--p 00000000 00:00 0                          [vvar]
> > ^^^^^^^^^^^^                                                              ^^^^
> >
> > I do not know what [vvar] is good for and why it cannot be read.
> 
> I totally forgot about this, even though we discussed it before.  Sorry;
> I am sending a new version of the patch which addresses this issue.

It would be good to get a reply from a kernel aware person what does it mean
before such patch gets accepted.  It can be also just a Linux kernel bug.


Jan
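
The "dd" check discussed above amounts to looking for a two-letter token
on the "VmFlags:" line of /proc/PID/smaps.  A minimal sketch follows,
assuming the line is a space-separated list of such tokens after the
"VmFlags:" keyword; vmflags_has is a hypothetical helper and is not the
patch's decode_vmflags.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if FLAG appears as a whole token on an smaps "VmFlags:"
   LINE, 0 otherwise (also 0 when LINE is not a VmFlags line).
   Hypothetical helper for illustration only.  */
static int
vmflags_has (const char *line, const char *flag)
{
  static const char prefix[] = "VmFlags:";
  const char *p;
  size_t flag_len;

  assert (line != NULL && flag != NULL);
  flag_len = strlen (flag);
  if (flag_len == 0 || strncmp (line, prefix, sizeof (prefix) - 1) != 0)
    return 0;

  p = line + sizeof (prefix) - 1;
  while (*p != '\0')
    {
      size_t len;

      /* Skip separators, then measure the next token.  */
      p += strspn (p, " \t\n");
      len = strcspn (p, " \t\n");
      if (len == flag_len && strncmp (p, flag, len) == 0)
	return 1;
      p += len;
    }

  return 0;
}
```

A mapping marked with madvise (MADV_DONTDUMP) would then be recognised
with vmflags_has (line, "dd"), and a really shared one with
vmflags_has (line, "sh").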


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-05 20:57     ` Jan Kratochvil
@ 2015-03-11 20:02       ` Oleg Nesterov
  2015-03-12 11:31         ` Sergio Durigan Junior
                           ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-11 20:02 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Sergio Durigan Junior, GDB Patches, Pedro Alves

On 03/05, Jan Kratochvil wrote:
>
> On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
> > On Thursday, March 05 2015, Jan Kratochvil wrote:
> > > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
> > > Currently it also tries to dump [vvar] (by default rules) but that is
> > > unreadable for some reason, causing:
> > > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
> > >                                                                 ^^^^^^^^^^^^^^
> > > Saved corefile /tmp/1j
> > > (gdb) _
> > > # grep 7ffff6ceb000 /proc/$p/maps
> > > 7ffff6ceb000-7ffff6ced000 r--p 00000000 00:00 0                          [vvar]
> > > ^^^^^^^^^^^^                                                              ^^^^
> > >
> > > I do not know what [vvar] is good for and why it cannot be read.

Well, I am not sure I understand this new mapping correctly. I need to
recheck.

But apparently it represents the kernel data (say, gtod) which vdso code
(running in user mode)  can read.

Probably gdb doesn't need to dump this vma, but see below.

> It would be good to get a reply from a kernel aware person what does it mean
> before such patch gets accepted.  It can be also just a Linux kernel bug.

_So far_ this doesn't look like a kernel bug to me.

I guess it fails because of

	struct page *no_pages[] = {NULL};
	struct vm_special_mapping vvar_mapping = {
		.name = "[vvar]",
		.pages = no_pages,
	};

so get_user_pages() -> special_mapping_fault() can't succeed, there is
no page it could return.

And the code above looks as if we deny the access on purpose. Probably
this makes sense, this section can contain the "sensitive" data, say,
hpet timer's io memory...

But! I need to recheck. In fact, it seems to me that I should discuss
this on lkml. I have some concerns, but most probably this is only my
misunderstanding, I need to read this (new to me) code more carefully.

Oleg.


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-11 20:02       ` Oleg Nesterov
@ 2015-03-12 11:31         ` Sergio Durigan Junior
  2015-03-12 14:36         ` vvar, gup && coredump Oleg Nesterov
  2015-03-12 15:02         ` [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Oleg Nesterov
  2 siblings, 0 replies; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-12 11:31 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, GDB Patches, Pedro Alves

On Wednesday, March 11 2015, Oleg Nesterov wrote:

> On 03/05, Jan Kratochvil wrote:
>>
>> On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
>> > On Thursday, March 05 2015, Jan Kratochvil wrote:
>> > > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
>> > > Currently it also tries to dump [vvar] (by default rules) but that is
>> > > unreadable for some reason, causing:
>> > > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
>> > >                                                                 ^^^^^^^^^^^^^^
>> > > Saved corefile /tmp/1j
>> > > (gdb) _
>> > > # grep 7ffff6ceb000 /proc/$p/maps
>> > > 7ffff6ceb000-7ffff6ced000 r--p 00000000 00:00 0                          [vvar]
>> > > ^^^^^^^^^^^^                                                              ^^^^
>> > >
>> > > I do not know what [vvar] is good for and why it cannot be read.
>
> Well, I am not sure I understand this new mapping correctly. I need to
> recheck.
>
> But apparently it represents the kernel data (say, gtod) which vdso code
> (running in user mode)  can read.
>
> Probably gdb doesn't need to dump this vma, but see below.

Right.  As far as I can see this was not being dumped in the previous
code, too.  I did not check whether the Linux kernel dumps this or not.

>> It would be good to get a reply from a kernel aware person what does it mean
>> before such patch gets accepted.  It can be also just a Linux kernel bug.
>
> _So far_ this doesn't look like a kernel bug to me.
>
> I guess it fails because of
>
> 	struct page *no_pages[] = {NULL};
> 	struct vm_special_mapping vvar_mapping = {
> 		.name = "[vvar]",
> 		.pages = no_pages,
> 	};
>
> so get_user_pages() -> special_mapping_fault() can't succeed, there is
> no page it could return.
>
> And the code above looks as if we deny the access on purpose. Probably
> this makes sense, this section can contain the "sensitive" data, say,
> hpet timer's io memory...
>
> But! I need to recheck. In fact, it seems to me that I should discuss
> this on lkml. I have some concerns, but most probably this is only my
> misunderstanding, I need to read this (new to me) code more carefully.

Thanks, Oleg.

For now, I will keep discarding this mapping in the dumping.  But please
let us know about your findings.

Meanwhile, I'll keep pinging this patch for reviews here.

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


* vvar, gup && coredump
  2015-03-11 20:02       ` Oleg Nesterov
  2015-03-12 11:31         ` Sergio Durigan Junior
@ 2015-03-12 14:36         ` Oleg Nesterov
  2015-03-12 16:29           ` Andy Lutomirski
  2015-03-12 15:02         ` [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Oleg Nesterov
  2 siblings, 1 reply; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 14:36 UTC (permalink / raw)
  To: Jan Kratochvil, Andy Lutomirski
  Cc: Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel

Add cc's, change subject.

On 03/11, Oleg Nesterov wrote:
>
> On 03/05, Jan Kratochvil wrote:
> >
> > On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
> > > On Thursday, March 05 2015, Jan Kratochvil wrote:
> > > > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
> > > > Currently it also tries to dump [vvar] (by default rules) but that is
> > > > unreadable for some reason, causing:
> > > > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
> > > >                                                                 ^^^^^^^^^^^^^^
>
> > It would be good to get a reply from a kernel aware person what does it mean
> > before such patch gets accepted.  It can be also just a Linux kernel bug.
>
> _So far_ this doesn't look like a kernel bug to me.
>
> But! I need to recheck. In fact, it seems to me that I should discuss
> this on lkml. I have some concerns, but most probably this is only my
> misunderstanding, I need to read this (new to me) code more carefully.

Hi Andy, we need your help ;)

So, the problem is that gdb can't access the "vvar" mapping which looks
like the "normal" vma from user-space pov.

Technically this is clear. vvar_mapping->pages is the "dummy" no_pages[]
array, get_user_pages() can't succeed. In fact even follow_page() can't
work because of VM_PFNMAP/_PAGE_SPECIAL set by remap_pfn_range().

What is not clear: do we really want gup() to fail? Or is it not trivial
to turn __vvar_page into the "normal" page? (To simplify the discussion,
let's ignore the hpet mapping for now.)

Because this doesn't look consistent. gdb tries to "coredump" the live
process like the kernel does, but fails to dump the "r--p ... [vvar]"
region.


OK, gdb can look at the VM_DONTDUMP bit in the "VmFlags:" field in
/proc/pid/smaps and skip this vma. But why (afaics) does the kernel dump
this vma then? Let's look at vma_dump_size(),

	/* always dump the vdso and vsyscall sections */
	if (always_dump_vma(vma))
		goto whole;

	if (vma->vm_flags & VM_DONTDUMP)
		return 0;

so the kernel ignores VM_DONTDUMP in this case, always_dump_vma() returns
true because of special_mapping_name(). Perhaps we should check VM_DONTDUMP
before always_dump_vma() ?


Or. We can teach gdb to read and dump its own "vvar" mapping to mimic the
kernel behaviour, this is the same read-only memory. But this hack doesn't
look nice, gdb should not know "too much" about the kernel internals.

Oleg.


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-11 20:02       ` Oleg Nesterov
  2015-03-12 11:31         ` Sergio Durigan Junior
  2015-03-12 14:36         ` vvar, gup && coredump Oleg Nesterov
@ 2015-03-12 15:02         ` Oleg Nesterov
  2015-03-12 15:46           ` Pedro Alves
  2015-03-12 17:37           ` Sergio Durigan Junior
  2 siblings, 2 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 15:02 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Sergio Durigan Junior, GDB Patches, Pedro Alves

See another email I sent, perhaps this needs more discussion...

But,

On 03/11, Oleg Nesterov wrote:
>
> On 03/05, Jan Kratochvil wrote:
> >
> > On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
> > > On Thursday, March 05 2015, Jan Kratochvil wrote:
> > > > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
> > > > Currently it also tries to dump [vvar] (by default rules) but that is
> > > > unreadable for some reason, causing:
> > > > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
> > > >                                                                 ^^^^^^^^^^^^^^
> > > > Saved corefile /tmp/1j
> > > > (gdb) _
> > > > # grep 7ffff6ceb000 /proc/$p/maps
> > > > 7ffff6ceb000-7ffff6ced000 r--p 00000000 00:00 0                          [vvar]
> > > > ^^^^^^^^^^^^                                                              ^^^^
> > > >
> > > > I do not know what [vvar] is good for and why it cannot be read.
>
> Probably gdb doesn't need to dump this vma, but see below.

Probably yes. Note that it has VM_DONTDUMP ("dd" in "VmFlags:" field).

However. If (for any reason) you decide to dump this region, gdb can
look into /proc/self/maps, find its own "vvar" mapping, and simply read
this memory. Unlike "vdso", "vvar" has the same content for every process.

(just in case, "vdso" is the same too but it is MAYWRITE, so it can have
 anonymous pages. Say, breakpoints installed by gdb).

Oleg.
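
Finding the "vvar" entry in a maps file comes down to parsing one line
per mapping.  Below is a minimal sketch, assuming the usual
"start-end perms offset dev inode name" layout; struct map_entry and
parse_maps_line are hypothetical names, and a name that contains spaces
would be cut at the first space.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The fields of a /proc/PID/maps line that this sketch keeps.  */
struct map_entry
{
  unsigned long long start;
  unsigned long long end;
  char perms[5];
  char name[64];
};

/* Parse one maps LINE into ENTRY.  Return 1 if at least the address
   range and the permissions were read, 0 otherwise.  The name stays
   empty for mappings without one.  Hypothetical helper.  */
static int
parse_maps_line (const char *line, struct map_entry *entry)
{
  int fields;

  assert (line != NULL && entry != NULL);
  entry->name[0] = '\0';
  /* Offset, device and inode are skipped with suppressed conversions.  */
  fields = sscanf (line, "%llx-%llx %4s %*s %*s %*s %63s",
		   &entry->start, &entry->end, entry->perms, entry->name);
  return fields >= 3;
}
```

Applied to the line from the report, the entry named "[vvar]" starts at
0x7ffff6ceb000 and spans end - start = 8192 bytes, which matches the
size in the "Memory read failed" warning.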


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-12 15:02         ` [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Oleg Nesterov
@ 2015-03-12 15:46           ` Pedro Alves
  2015-03-12 15:57             ` Jan Kratochvil
  2015-03-12 16:07             ` Oleg Nesterov
  2015-03-12 17:37           ` Sergio Durigan Junior
  1 sibling, 2 replies; 46+ messages in thread
From: Pedro Alves @ 2015-03-12 15:46 UTC (permalink / raw)
  To: Oleg Nesterov, Jan Kratochvil; +Cc: Sergio Durigan Junior, GDB Patches

On 03/12/2015 03:00 PM, Oleg Nesterov wrote:

> However. If (for any reason) you decide to dump this region, gdb can
> look into /proc/self/maps, find its own "vvar" mapping, and simply read
> this memory. Unlike "vdso", "vvar" has the same content for every process.

Actually it can't: GDB may well be dumping the memory of
a process running on another machine (through gdbserver).

Thanks,
Pedro Alves


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-12 15:46           ` Pedro Alves
@ 2015-03-12 15:57             ` Jan Kratochvil
  2015-03-12 16:19               ` Pedro Alves
  2015-03-12 16:07             ` Oleg Nesterov
  1 sibling, 1 reply; 46+ messages in thread
From: Jan Kratochvil @ 2015-03-12 15:57 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Oleg Nesterov, Sergio Durigan Junior, GDB Patches

On Thu, 12 Mar 2015 16:45:15 +0100, Pedro Alves wrote:
> On 03/12/2015 03:00 PM, Oleg Nesterov wrote:
> 
> > However. If (for any reason) you decide to dump this region, gdb can
> > look into /proc/self/maps, find its own "vvar" mapping, and simply read
> > this memory. Unlike "vdso", "vvar" has the same content for every process.
> 
> Actually it can't: GDB may well be dumping the memory of
> a process running on another machine (through gdbserver).

So it can - from gdbserver's [vvar].


Jan


* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-12 15:46           ` Pedro Alves
  2015-03-12 15:57             ` Jan Kratochvil
@ 2015-03-12 16:07             ` Oleg Nesterov
  2015-03-12 16:28               ` Pedro Alves
  1 sibling, 1 reply; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 16:07 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches

On 03/12, Pedro Alves wrote:
>
> On 03/12/2015 03:00 PM, Oleg Nesterov wrote:
>
> > However. If (for any reason) you decide to dump this region, gdb can
> > look into /proc/self/maps, find its own "vvar" mapping, and simply read
> > this memory. Unlike "vdso", "vvar" has the same content for every process.
>
> Actually it can't: GDB may well be dumping the memory of
> a process running on another machine (through gdbserver).

Yes, thanks for correcting me...

I do not know if gdb can ask gdbserver to read its own memory, but even if
it can this doesn't look like a nice solution.

Just curious... I know that gdb can execute the code on behalf of the traced
process, so perhaps it can force the tracee to memcpy() its "vvar" memory.
Can this work with gdbserver? Again, I do not think this hack can make any
sense. I am just curious.

At least (I hope) this mapping doesn't look "important" from a debugging pov,
so perhaps gdb should ignore it. Let's see what Andy thinks, but I bet it is
very unlikely that the kernel will be changed to allow the access to this
vma.

Oleg.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-12 15:57             ` Jan Kratochvil
@ 2015-03-12 16:19               ` Pedro Alves
  0 siblings, 0 replies; 46+ messages in thread
From: Pedro Alves @ 2015-03-12 16:19 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Oleg Nesterov, Sergio Durigan Junior, GDB Patches

On 03/12/2015 03:57 PM, Jan Kratochvil wrote:
> On Thu, 12 Mar 2015 16:45:15 +0100, Pedro Alves wrote:
>> On 03/12/2015 03:00 PM, Oleg Nesterov wrote:
>>
>>> However. If (for any reason) you decide to dump this region, gdb can
>>> look into /proc/self/maps, find its own "vvar" mapping, and simply read
>>> this memory. Unlike "vdso", "vvar" has the same content for every process.
>>
>> Actually it can't: GDB may well be dumping the memory of
>> a process running on another machine (through gdbserver).
> 
> So it can - from gdbserver's [vvar].

Sure, but GDB is just remotely reading the /proc files.
We'd need a new RSP packet to get at that object.  All
for working around something that sounds like the kernel
should be supporting without hacks.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-12 16:07             ` Oleg Nesterov
@ 2015-03-12 16:28               ` Pedro Alves
  0 siblings, 0 replies; 46+ messages in thread
From: Pedro Alves @ 2015-03-12 16:28 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches

On 03/12/2015 04:05 PM, Oleg Nesterov wrote:
> On 03/12, Pedro Alves wrote:
>>
>> On 03/12/2015 03:00 PM, Oleg Nesterov wrote:
>>
>>> However. If (for any reason) you decide to dump this region, gdb can
>>> look into /proc/self/maps, find its own "vvar" mapping, and simply read
>>> this memory. Unlike "vdso", "vvar" has the same content for every process.
>>
>> Actually it can't: GDB may well be dumping the memory of
>> a process running on another machine (through gdbserver).
> 
> Yes, thanks for correcting me...
> 
> I do not know if gdb can ask gdbserver to read its own memory, but even if
> it can this doesn't look like a nice solution.

Not currently, it can't.

> 
> Just curious... I know that gdb can execute the code on behalf of the traced
> process, so perhaps it can force the tracee to memcpy() its "vvar" memory.
> Can this work with gdbserver? Again, I do not think this hack can make any
> sense. I am just curious.

Yes, that can work.  But it's horrible.  :-)  If the user is dumping the
process's core, it's likely because the traced process is already in a
not-so-good / corrupted state.  Forcing it to run more code may make
things worse.

> At least (I hope) this mapping doesn't look "important" from debugging pov,
> perhaps gdb should ignore it. Lets see what Andy thinks, 

Agreed, let's hear what Andy says.

> but I bet it is
> very unlikely that the kernel will be changed to allow the access to this
> vma.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 14:36         ` vvar, gup && coredump Oleg Nesterov
@ 2015-03-12 16:29           ` Andy Lutomirski
  2015-03-12 16:56             ` Oleg Nesterov
  0 siblings, 1 reply; 46+ messages in thread
From: Andy Lutomirski @ 2015-03-12 16:29 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On Thu, Mar 12, 2015 at 7:34 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> Add cc's, change subject.
>
> On 03/11, Oleg Nesterov wrote:
>>
>> On 03/05, Jan Kratochvil wrote:
>> >
>> > On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
>> > > On Thursday, March 05 2015, Jan Kratochvil wrote:
>> > > > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
>> > > > Currently it also tries to dump [vvar] (by default rules) but that is
>> > > > unreadable for some reason, causing:
>> > > > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
>> > > >                                                                 ^^^^^^^^^^^^^^
>>
>> > It would be good to get a reply from a kernel aware person what does it mean
>> > before such patch gets accepted.  It can be also just a Linux kernel bug.
>>
>> _So far_ this doesn't look like a kernel bug to me.
>>
>> But! I need to recheck. In fact, it seems to me that I should discuss
>> this on lkml. I have some concerns, but most probably this is only my
>> misunderstanding, I need to read this (new to me) code more carefully.
>
> Hi Andy, we need your help ;)
>
> So, the problem is that gdb can't access the "vvar" mapping which looks
> like the "normal" vma from user-space pov.
>
> Technically this is clear. vvar_mapping->pages is the "dummy" no_pages[]
> array, get_user_pages() can't succeed. In fact even follow_page() can't
> work because of VM_PFNMAP/_PAGE_SPECIAL set by remap_pfn_range().
>
> What is not clear: do we really want gup() to fail? Or it is not trivial
> to turn __vvar_page into the "normal" page? (to simplify the discussion,
> lets ignore hpet mapping for now).

We could presumably fiddle with the vma to allow get_user_pages to
work on at least the first vvar page.  There are some decently large
caveats, though:

 - We don't want to COW it.  If someone pokes at that page with
ptrace, for example, and it gets COWed, everything will stop working
because the offending process will no longer see updates.  That way
lies infinite loops.

 - The implementation could be odd.  The vma is either VM_MIXEDMAP or
VM_PFNMAP, and I don't see any practical way to change that.

 - The HPET and perhaps pvclock stuff.  The HPET probably doesn't have
a struct page at all, so you can't possibly get_user_pages it.

>
> Because this doesn't look consistent. gdb tries to "coredump" the live
> process like the kernel does, but fails to dump the "r--p ... [vvar]"
> region.
>
>
> OK, gdb can look at VM_DONTDUMP bit in "VmFlags:" field in /proc/pid/smaps
> and skip this vma. But, why (afaics) the kernel dumps this vma then? Lets
> look at vma_dump_size(),
>
>         /* always dump the vdso and vsyscall sections */
>         if (always_dump_vma(vma))
>                 goto whole;
>
>         if (vma->vm_flags & VM_DONTDUMP)
>                 return 0;
>
> so the kernel ignores VM_DONTDUMP in this case, always_dump_vma() returns
> true because of special_mapping_name(). Perhaps we should check VM_DONTDUMP
> before always_dump_vma() ?
>

That sounds reasonable to me.  I'll write the patch later today.  gdb
will still need changes, though, right?

--Andy

>
> Or. We can teach gdb to read and dump its own "vvar" mapping to mimic the
> kernel behaviour, this is the same read-only memory. But this hack doesn't
> look nice, gdb should not know "too much" about the kernel internals.
>
> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 16:29           ` Andy Lutomirski
@ 2015-03-12 16:56             ` Oleg Nesterov
  2015-03-12 17:18               ` Andy Lutomirski
  2015-03-12 17:48               ` Oleg Nesterov
  0 siblings, 2 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 16:56 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> On Thu, Mar 12, 2015 at 7:34 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > What is not clear: do we really want gup() to fail? Or it is not trivial
> > to turn __vvar_page into the "normal" page? (to simplify the discussion,
> > lets ignore hpet mapping for now).
>
> We could presumably fiddle with the vma to allow get_user_pages to
> work on at least the first vvar page.  There are some decently large
> caveats, though:
>
>  - We don't want to COW it.  If someone pokes at that page with
> ptrace, for example, and it gets COWed, everything will stop working
> because the offending process will no longer see updates.  That way
> lies infinite loops.

Of course, but this looks simple... is_cow_mapping() == F so FOLL_FORCE
won't work anyway?

>  - The implementation could be odd.  The vma is either VM_MIXEDMAP or
> VM_PFNMAP, and I don't see any practical way to change that.
>
>  - The HPET and perhaps pvclock stuff.  The HPET probably doesn't have
> a struct page at all, so you can't possibly get_user_pages it.

Yes, this is true. OK, let's not dump it. I'll probably send a patch which
changes vma_dump_size() to check VM_DONTDUMP first...
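
The effect of that reordering can be sketched with a toy model. This is
not kernel code: struct toy_vma, TOY_VM_DONTDUMP and both function names
are made up for illustration, and the coredump_filter handling that the
real vma_dump_size() performs afterwards is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's types and flags; these are
   assumptions for illustration, not the real definitions.  */
#define TOY_VM_DONTDUMP 0x1u

struct toy_vma
{
  unsigned int flags;   /* models vma->vm_flags */
  bool special;         /* models always_dump_vma () returning true */
  size_t len;           /* size of the mapping in bytes */
};

/* Order quoted in the thread: special mappings are dumped whole
   before VM_DONTDUMP is ever consulted.  */
static size_t
dump_size_current (const struct toy_vma *vma)
{
  if (vma->special)
    return vma->len;
  if (vma->flags & TOY_VM_DONTDUMP)
    return 0;
  return vma->len;      /* simplification: no coredump_filter logic */
}

/* Proposed order: VM_DONTDUMP wins, even for special mappings.  */
static size_t
dump_size_proposed (const struct toy_vma *vma)
{
  if (vma->flags & TOY_VM_DONTDUMP)
    return 0;
  if (vma->special)
    return vma->len;
  return vma->len;      /* simplification: no coredump_filter logic */
}
```

In this model a "vvar"-like vma (special and TOY_VM_DONTDUMP) is dumped
in full by the first function and skipped by the second, while a
"vdso"-like vma (special, no TOY_VM_DONTDUMP) is dumped by both.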

But this leads to another question: why do we want to expose this
"vvar" vma at all?

For the moment, forget about compat 32-bit applications running under
64-bit kernel.

Can't we simply add FIX_VVAR_PAGE into fixed_addresses{}, map it into
init_mm via set_fixmap(FIX_VVAR_PAGE, __PAGE_USER) and change __vdso.*
functions to use fix_to_virt() address?

I don't really understand the low-level details, I'd like to understand
if this can work or not. And if it can work, why this is undesirable.

As for 32-bit applications. Yes, this can't work because 32-bit simply
can't access this "high" memory. But you know, it would be very nice to
have the fixmap-like "global" area in init_mm which is also visible to
compat applications. If we had it, uprobes could work without xol vma's.

Oleg.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 16:56             ` Oleg Nesterov
@ 2015-03-12 17:18               ` Andy Lutomirski
  2015-03-12 17:40                 ` Oleg Nesterov
  2015-03-12 17:48               ` Oleg Nesterov
  1 sibling, 1 reply; 46+ messages in thread
From: Andy Lutomirski @ 2015-03-12 17:18 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On Thu, Mar 12, 2015 at 9:54 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Andy Lutomirski wrote:
>>
>> On Thu, Mar 12, 2015 at 7:34 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> >
>> > What is not clear: do we really want gup() to fail? Or it is not trivial
>> > to turn __vvar_page into the "normal" page? (to simplify the discussion,
>> > lets ignore hpet mapping for now).
>>
>> We could presumably fiddle with the vma to allow get_user_pages to
>> work on at least the first vvar page.  There are some decently large
>> caveats, though:
>>
>>  - We don't want to COW it.  If someone pokes at that page with
>> ptrace, for example, and it gets COWed, everything will stop working
>> because the offending process will no longer see updates.  That way
>> lies infinite loops.
>
> Of course, but this looks simple... is_cow_mapping() == F so FOLL_FORCE
> won't work anyway?
>
>>  - The implementation could be odd.  The vma is either VM_MIXEDMAP or
>> VM_PFNMAP, and I don't see any practical way to change that.
>>
>>  - The HPET and perhaps pvclock stuff.  The HPET probably doesn't have
>> a struct page at all, so you can't possibly get_user_pages it.
>
> Yes, this is true. OK, lets not dump it. I'll probably send a patch which
> changes vma_dump_size() to check VM_DONTDUMP first...
>
> But this leads to another question: why do we want to expose this
> "vvar" vma at all?
>
> For the moment, forget about compat 32-bit applications running under
> 64-bit kernel.
>
> Can't we simply add FIX_VVAR_PAGE into fixed_addresses{}, map it into
> init_mm via set_fixmap(FIX_VVAR_PAGE, __PAGE_USER) and change __vdso.*
> functions to use fix_to_virt() address?
>
> I don't really understand the low-level details, I'd like to understand
> if this can work or not. And if it can work, why this is undesirable.
>
> As for 32-bit applications. Yes, this can't work because 32-bit simply
> can't access this "high" memory. But you know, it would be very nice to
> have the fixmap-like "global" area in init_mm which is also visible to
> compat applications. If we had it, uprobes could work without xol vma's.
>

It could work for 32-bit native, but not for 32-bit compat.  Also, I
have grand plans to add per-task vvar overrides for seccomp and such.
And RIP-relative addressing is a bit nicer than absolute :)

It used to work that way, but we changed it in 3.15 IIRC.

On a related note, I'm hoping to rework the mm part pretty heavily:

http://lkml.kernel.org/r/cover.1414629045.git.luto@amacapital.net

--Andy

> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-12 15:02         ` [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Oleg Nesterov
  2015-03-12 15:46           ` Pedro Alves
@ 2015-03-12 17:37           ` Sergio Durigan Junior
  1 sibling, 0 replies; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-12 17:37 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jan Kratochvil, GDB Patches, Pedro Alves

On Thursday, March 12 2015, Oleg Nesterov wrote:

>> Probably gdb doesn't need to dump this vma, but see below.
>
> Probably yes. Note that it has VM_DONTDUMP ("dd" in "VmFlags:" field).

The fact that the region has VM_DONTDUMP is enough for GDB to ignore
it.  IMO, as discussed in the other thread with Andy, the Linux kernel
is bogus in this case and should also be ignoring this.

> However. If (for any reason) you decide to dump this region, gdb can
> look into /proc/self/maps, find its own "vvar" mapping, and simply read
> this memory. Unlike "vdso", "vvar" has the same content for every process.

Yeah, but I don't think this is worth the effort.  As Pedro mentioned,
things can get more complicated when we consider remote scenarios.
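
For what it is worth, locating the mapping locally is the easy part.  A
minimal sketch, assuming the usual /proc/PID/maps line layout; the
helper name maps_line_matches is hypothetical and nothing like it is in
the patch:

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if LINE, taken from /proc/PID/maps, describes a mapping
   whose pathname field equals NAME (e.g. "[vvar]"), 0 otherwise.
   *START and *END receive the parsed address range; they may be
   written even when the name does not match.  */
static int
maps_line_matches (const char *line, const char *name,
                   unsigned long long *start, unsigned long long *end)
{
  char path[256];

  /* Fields: start-end perms offset dev inode [pathname].  The
     pathname is absent for anonymous mappings, in which case sscanf
     converts only the two addresses and returns 2.  */
  if (sscanf (line, "%llx-%llx %*s %*s %*s %*s %255s",
              start, end, path) != 3)
    return 0;

  return strcmp (path, name) == 0;
}
```

A caller would feed it each line read from /proc/self/maps.  That only
finds the mapping on the machine GDB itself runs on, so it does nothing
for the gdbserver case.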

> (just in case, "vdso" is the same too but it is MAYWRITE, so it can have
>  anonymous pages. Say, breakpoints installed by gdb).

Also, [vdso] doesn't have the VM_DONTDUMP flag.  My patch already
dumps it unconditionally.

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:18               ` Andy Lutomirski
@ 2015-03-12 17:40                 ` Oleg Nesterov
  2015-03-12 17:45                   ` Sergio Durigan Junior
  2015-03-12 17:56                   ` Andy Lutomirski
  0 siblings, 2 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 17:40 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> On Thu, Mar 12, 2015 at 9:54 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 03/12, Andy Lutomirski wrote:
> >>
> > As for 32-bit applications. Yes, this can't work because 32-bit simply
> > can't access this "high" memory. But you know, it would be very nice to
> > have the fixmap-like "global" area in init_mm which is also visible to
> > compat applications. If we had it, uprobes could work without xol vma's.
> >
> It could work for 32-bit native, but not for 32-bit compat.

Yes, yes, I meant 32-bit compat apps. Once again, it would be nice if we
had the "low" fixmaps in init_mm. But this is unlikely to be possible...

> On a related note, I'm hoping to rework the mm part pretty heavily:
>
> http://lkml.kernel.org/r/cover.1414629045.git.luto@amacapital.net

OK... not that I really understand this email.

Well. Speaking of vdso. I understand that we are unlikely to be able to do
this, but for uprobes it would be nice to have an anon-inode file behind this mapping,
so that vma_interval_tree_foreach() could work, etc. OK, this is completely
off-topic, please forget.



And I noticed that I didn't read your previous email carefully enough...

> That sounds reasonable to me.  I'll write the patch later today.

Sure, please send a patch if you want to do this.

> gdb will still need changes, though, right?

This is up to gdb developers. To me, it should simply skip this
VM_DONTDUMP vma.

Oleg.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:40                 ` Oleg Nesterov
@ 2015-03-12 17:45                   ` Sergio Durigan Junior
  2015-03-12 18:04                     ` Oleg Nesterov
  2015-03-12 17:56                   ` Andy Lutomirski
  1 sibling, 1 reply; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-12 17:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On Thursday, March 12 2015, Oleg Nesterov wrote:

>> gdb will still need changes, though, right?
>
> This is up to gdb developers. To me, it should simply skip this
> VM_DONTDUMP vma.

If I understood this discussion correctly (and thanks Andy and Oleg for,
*ahem*, dumping all this useful information for us!), GDB does not need
any changes to the Linux kernel in this area.  In fact, my patch
already implements the "ignore VM_DONTDUMP mappings" part, so we're
pretty much covered.

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 16:56             ` Oleg Nesterov
  2015-03-12 17:18               ` Andy Lutomirski
@ 2015-03-12 17:48               ` Oleg Nesterov
  2015-03-12 17:55                 ` Andy Lutomirski
                                   ` (2 more replies)
  1 sibling, 3 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 17:48 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Oleg Nesterov wrote:
>
> Yes, this is true. OK, lets not dump it.

OTOH. We can probably add ->access() into special_mapping_vmops, this
way __access_remote_vm() could work even if gup() fails ?

Jan, Sergio. How much do we want to dump this area? The change above
should be justified.

Oleg.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:48               ` Oleg Nesterov
@ 2015-03-12 17:55                 ` Andy Lutomirski
  2015-03-12 18:16                   ` Oleg Nesterov
  2015-03-12 18:20                 ` Pedro Alves
  2015-03-16 19:03                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
  2 siblings, 1 reply; 46+ messages in thread
From: Andy Lutomirski @ 2015-03-12 17:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On Thu, Mar 12, 2015 at 10:46 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Oleg Nesterov wrote:
>>
>> Yes, this is true. OK, lets not dump it.
>
> OTOH. We can probably add ->access() into special_mapping_vmops, this
> way __access_remote_vm() could work even if gup() fails ?

Let's wait until my special_mapping vmops rework lands to do that.
I'll dust it off and resubmit it.

>
> Jan, Sergio. How much do we want do dump this area ? The change above
> should be justified.
>
> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:40                 ` Oleg Nesterov
  2015-03-12 17:45                   ` Sergio Durigan Junior
@ 2015-03-12 17:56                   ` Andy Lutomirski
  2015-03-12 18:28                     ` Oleg Nesterov
  1 sibling, 1 reply; 46+ messages in thread
From: Andy Lutomirski @ 2015-03-12 17:56 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On Thu, Mar 12, 2015 at 10:39 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Andy Lutomirski wrote:
>>
>> On Thu, Mar 12, 2015 at 9:54 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> > On 03/12, Andy Lutomirski wrote:
>> >>
>> > As for 32-bit applications. Yes, this can't work because 32-bit simply
>> > can't access this "high" memory. But you know, it would be very nice to
>> > have the fixmap-like "global" area in init_mm which is also visible to
>> > compat applications. If we had it, uprobes could work without xol vma's.
>> >
>> It could work for 32-bit native, but not for 32-bit compat.
>
> Yes, yes, I meant 32-bit compat apps. Once again, it would be nice if we
> had the "low" fixmaps in init_mm. But unlikely this is possible...
>
>> On a related note, I'm hoping to rework the mm part pretty heavily:
>>
>> http://lkml.kernel.org/r/cover.1414629045.git.luto@amacapital.net
>
> OK... not that I really understand this email.
>
> Well. Speaking of vdso. I understand that unlikely we can do this, but
> for uprobes it would be nice to have a anon-inode file behind this mapping,
> so that vma_interval_tree_foreach() could work, etc. OK, this is completely
> off-topic, please forget.

Couldn't you do that directly in the uprobes code?  That is, create an
anon_inode file and just map it the old-fashioned way?

--Andy

>
>
>
> And I noticed that I didn't read your previous email carefully enough...
>
>> That sounds reasonable to me.  I'll write the patch later today.
>
> Sure, please send a patch if you want to do this.
>
>> gdb will still need changes, though, right?
>
> This is up to gdb developers. To me, it should simply skip this
> VM_DONTDUMP vma.
>
> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:45                   ` Sergio Durigan Junior
@ 2015-03-12 18:04                     ` Oleg Nesterov
  2015-03-13  4:50                       ` Sergio Durigan Junior
  0 siblings, 1 reply; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 18:04 UTC (permalink / raw)
  To: Sergio Durigan Junior
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On 03/12, Sergio Durigan Junior wrote:
>
> On Thursday, March 12 2015, Oleg Nesterov wrote:
>
> >> gdb will still need changes, though, right?
> >
> > This is up to gdb developers. To me, it should simply skip this
> > VM_DONTDUMP vma.
>
> If I understood this discussion correctly (and thanks Andy and Oleg for,
> *ahem*, dumping all this useful information for us!), GDB will not need
> modifications in the Linux kernel in this area.  In fact, my patch
> already implements the "ignore VM_DONTDUMP mappings" part, so we're
> pretty much covered.

OK, thanks.

And it seems that we all agree that the kernel should not dump this vma
either. Could you confirm that this is fine from the gdb pov, just in case?


However. Even if we do not want it in the coredump, this can confuse gdb
users who might want to read this memory during debugging. So perhaps
we still can add ->access() to "fix" PTRACE_PEEK/access_remote_vm later.

But I see another email from Andy, so let's forget about this for now.

Oleg.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:55                 ` Andy Lutomirski
@ 2015-03-12 18:16                   ` Oleg Nesterov
  2015-03-12 18:23                     ` Sergio Durigan Junior
  0 siblings, 1 reply; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 18:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> On Thu, Mar 12, 2015 at 10:46 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 03/12, Oleg Nesterov wrote:
> >>
> >> Yes, this is true. OK, lets not dump it.
> >
> > OTOH. We can probably add ->access() into special_mapping_vmops, this
> > way __access_remote_vm() could work even if gup() fails ?
>
> Let's wait until my special_mapping vmops rework lands to do that.
> I'll dust it off and resubmit it.

OK. Please CC me. Not that I think I can help, I just want to understand
what you are going to do.

Although currently I do not even read most of the emails I get ;)

Oleg.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:48               ` Oleg Nesterov
  2015-03-12 17:55                 ` Andy Lutomirski
@ 2015-03-12 18:20                 ` Pedro Alves
  2015-03-12 18:26                   ` Andy Lutomirski
  2015-03-16 19:03                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
  2 siblings, 1 reply; 46+ messages in thread
From: Pedro Alves @ 2015-03-12 18:20 UTC (permalink / raw)
  To: Oleg Nesterov, Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, linux-kernel

On 03/12/2015 05:46 PM, Oleg Nesterov wrote:
> On 03/12, Oleg Nesterov wrote:
>>
>> Yes, this is true. OK, lets not dump it.
> 
> OTOH. We can probably add ->access() into special_mapping_vmops, this
> way __access_remote_vm() could work even if gup() fails ?
> 
> Jan, Sergio. How much do we want do dump this area ? The change above
> should be justified.

Memory mappings that weren't touched since they were initially mapped can
be retrieved from the program binary and the shared libraries, even if
the core dump is moved to another machine.  However, in the vvar case,
it sounds like there's nowhere to read it from offline?  In that case,
it could be justified to dump it.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 18:16                   ` Oleg Nesterov
@ 2015-03-12 18:23                     ` Sergio Durigan Junior
  0 siblings, 0 replies; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-12 18:23 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On Thursday, March 12 2015, Oleg Nesterov wrote:

>> Let's wait until my special_mapping vmops rework lands to do that.
>> I'll dust it off and resubmit it.
>
> OK. Please CC me. Not that I think I can help, just I want to understand
> what you are going to do.

If you think your work will impact the core dumping part, please include
me in the Cc as well.

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 18:20                 ` Pedro Alves
@ 2015-03-12 18:26                   ` Andy Lutomirski
  0 siblings, 0 replies; 46+ messages in thread
From: Andy Lutomirski @ 2015-03-12 18:26 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Oleg Nesterov, Jan Kratochvil, Sergio Durigan Junior,
	GDB Patches, linux-kernel

On Thu, Mar 12, 2015 at 11:19 AM, Pedro Alves <palves@redhat.com> wrote:
> On 03/12/2015 05:46 PM, Oleg Nesterov wrote:
>> On 03/12, Oleg Nesterov wrote:
>>>
>>> Yes, this is true. OK, lets not dump it.
>>
>> OTOH. We can probably add ->access() into special_mapping_vmops, this
>> way __access_remote_vm() could work even if gup() fails ?
>>
>> Jan, Sergio. How much do we want do dump this area ? The change above
>> should be justified.
>
> Memory mappings that weren't touched since they were initially mapped can
> be retrieved from the program binary and the shared libraries, even if
> the core dump is moved to another machine.  However, in vvar case,
> sounds like  there's nowhere to read it from offline?  In that case,
> it could be justified to dump it.

This is why we currently dump the vdso text.  On arm64 (the only other
architecture that uses a real vma for vvar data IIRC), we use a more
normal vma and we dump it.  x86 is the odd one out here.

We could just leave the kernel alone.  The data that gets dumped is of
dubious value; it could be slightly helpful when debugging vdso
crashes*, but, of course, dumping it is inherently racy.


* The vdso never crashes :)

--Andy

>
> Thanks,
> Pedro Alves
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:56                   ` Andy Lutomirski
@ 2015-03-12 18:28                     ` Oleg Nesterov
  0 siblings, 0 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-12 18:28 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> On Thu, Mar 12, 2015 at 10:39 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > Well. Speaking of vdso. I understand that unlikely we can do this, but
> > for uprobes it would be nice to have a anon-inode file behind this mapping,
> > so that vma_interval_tree_foreach() could work, etc. OK, this is completely
> > off-topic, please forget.
>
> Couldn't you do that directly in the uprobes code?  That is, create an
> anon_inode file and just map it the old-fashioned way?

This won't help. Uprobes wants this file mmaped by all applications, so
that build_map_info() can find mm's, vma's, etc to install the system-wide
breakpoint. But again, this is off-topic and unlikely to be possible.

Oleg.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-05  3:48 [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Sergio Durigan Junior
  2015-03-05 15:48 ` Jan Kratochvil
@ 2015-03-12 21:39 ` Sergio Durigan Junior
  2015-03-13 19:34   ` Pedro Alves
  2015-03-14  9:40   ` Eli Zaretskii
  2015-03-13 19:37 ` [PATCH] " Pedro Alves
  2 siblings, 2 replies; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-12 21:39 UTC (permalink / raw)
  To: GDB Patches; +Cc: Jan Kratochvil, Pedro Alves, Oleg Nesterov

On Wednesday, March 04 2015, I wrote:

> Hello,
>
> I have been working on this patch for quite some time, with some
> interruptions here and there, but now I think it is ready to be
> submitted and pushed upstream.

Hi,

After our discussion about dumping the [vvar] mapping, I am sending this
v2.  The only difference is that the code no longer checks whether the
mapping name is [vvar].  As explained in the other thread,
the [vvar] mapping contains the "dd" flag (VM_DONTDUMP), and GDB already
honors that.
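
For reference, a minimal sketch of what a "dd" check on a "VmFlags:"
line from /proc/PID/smaps could look like.  This is not the code from
the patch (which adds decode_vmflags in linux-tdep.c); the helper name
vmflags_has and the sample lines mentioned below are assumptions made
for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if LINE is a "VmFlags:" line from /proc/PID/smaps that
   carries the two-letter flag FLAG (e.g. "dd" for VM_DONTDUMP),
   0 otherwise.  Uses strtok, so it is not reentrant.  */
static int
vmflags_has (const char *line, const char *flag)
{
  static const char prefix[] = "VmFlags:";
  char buf[256];
  char *tok;

  if (strncmp (line, prefix, sizeof (prefix) - 1) != 0)
    return 0;

  /* Work on a copy, because strtok modifies its argument.  */
  snprintf (buf, sizeof (buf), "%s", line + sizeof (prefix) - 1);

  for (tok = strtok (buf, " \t\n"); tok != NULL;
       tok = strtok (NULL, " \t\n"))
    if (strcmp (tok, flag) == 0)
      return 1;

  return 0;
}
```

With a line such as "VmFlags: rd mr pf io de dd" the helper reports the
flag as present; with "VmFlags: rd ex mr mw me de" it does not, since
"de" and "dd" are compared as whole tokens.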

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

gdb/ChangeLog:
2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>
	    Jan Kratochvil  <jan.kratochvil@redhat.com>
	    Oleg Nesterov  <oleg@redhat.com>

	PR corefiles/16092
	* common/common-defs.h (enum memory_mapping_state): New enum.
	* defs.h (find_memory_region_ftype): Remove 'int modified'
	parameter, replacing by 'enum memory_mapping_state state'.
	* gcore.c (gcore_create_callback): Likewise.  Change 'if/else'
	statements and improve the logic of deciding when to ignore a
	memory mapping.
	(objfile_find_memory_regions): Passing
	'MEMORY_MAPPING_UNKNOWN_STATE' or 'MEMORY_MAPPING_MODIFIED' when
	needed to 'func' callback, instead of saying the memory mapping
	was modified even without knowing it.
	* gnu-nat.c (gnu_find_memory_regions): Likewise.
	* linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
	New enum identifying the various options of the coredump_filter
	file.
	(struct smaps_vmflags): New struct.
	(use_coredump_filter): New variable.
	(decode_vmflags): New function.
	(mapping_is_anonymous_p): Likewise.
	(dump_mapping_p): Likewise.
	(linux_find_memory_region_ftype): Remove 'int modified' parameter,
	replacing by 'enum memory_mapping_state state'.
	(linux_find_memory_regions_full): New variables
	'coredumpfilter_name', 'coredumpfilterdata', 'pid',
	'filterflags'.  Read /proc/<PID>/smaps file; improve parsing of
	its information.  Implement memory mapping filtering based on its
	contents.
	(linux_find_memory_regions_thunk): Remove 'int modified'
	parameter, replacing by 'enum memory_mapping_state state'.
	(linux_make_mappings_callback): Likewise.
	(find_mapping_size): Likewise.
	(show_use_coredump_filter): New function.
	(_initialize_linux_tdep): New command 'set use-coredump-filter'.
	* procfs.c (find_memory_regions_callback): Passing
	'MEMORY_MAPPING_UNKNOWN_STATE' when needed to 'func' callback,
	instead of saying the memory mapping was modified even without
	knowing it.

gdb/doc/ChangeLog:
2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>

	PR corefiles/16092
	* gdb.texinfo (gcore): Mention new command 'set
	use-coredump-filter'.
	(set use-coredump-filter): Document new command.

gdb/testsuite/ChangeLog:
2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>

	PR corefiles/16092
	* gdb.base/coredump-filter.c: New file.
	* gdb.base/coredump-filter.exp: Likewise.


diff --git a/gdb/common/common-defs.h b/gdb/common/common-defs.h
index 62d9de5..01b05f5 100644
--- a/gdb/common/common-defs.h
+++ b/gdb/common/common-defs.h
@@ -60,4 +60,14 @@
 # define EXTERN_C_POP
 #endif
 
+/* Enum used to describe the state of a memory mapping.  This is used in
+   functions implementing find_memory_region_ftype.  */
+
+enum memory_mapping_state
+  {
+    MEMORY_MAPPING_MODIFIED,
+    MEMORY_MAPPING_UNMODIFIED,
+    MEMORY_MAPPING_UNKNOWN_STATE,
+  };
+
 #endif /* COMMON_DEFS_H */
diff --git a/gdb/defs.h b/gdb/defs.h
index 72512f6..4829b62 100644
--- a/gdb/defs.h
+++ b/gdb/defs.h
@@ -338,7 +338,8 @@ extern void init_source_path (void);
 
 typedef int (*find_memory_region_ftype) (CORE_ADDR addr, unsigned long size,
 					 int read, int write, int exec,
-					 int modified, void *data);
+					 enum memory_mapping_state state,
+					 void *data);
 
 /* * Possible lvalue types.  Like enum language, this should be in
    value.h, but needs to be here for the same reason.  */
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 9e71642..092bc93 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -10952,6 +10952,67 @@ specified, the file name defaults to @file{core.@var{pid}}, where
 
 Note that this command is implemented only for some systems (as of
 this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
+
+On @sc{gnu}/Linux, this command can take into account the value of the
+file @file{/proc/@var{pid}/coredump_filter} when generating the core
+dump (@pxref{set use-coredump-filter}).
+
+@kindex set use-coredump-filter
+@anchor{set use-coredump-filter}
+@item set use-coredump-filter on
+@itemx set use-coredump-filter off
+Enable or disable the use of the file
+@file{/proc/@var{pid}/coredump_filter} when generating core dump
+files.  This file is used by the Linux kernel to decide what types of
+memory mappings will be dumped or ignored when generating a core dump
+file.
+
+To make use of this feature, you have to write in the
+@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
+which is a bit mask representing the memory mapping types.  If a bit
+is set in the bit mask, then the memory mappings of the corresponding
+types will be dumped; otherwise, they will be ignored.  The bits in
+this bit mask have the following meanings:
+
+@table @code
+@item bit 0
+Dump anonymous private mappings.
+@item bit 1
+Dump anonymous shared mappings.
+@item bit 2
+Dump file-backed private mappings.
+@item bit 3
+Dump file-backed shared mappings.
+@item bit 4
+(since Linux 2.6.24)
+Dump ELF headers. (@value{GDBN} does not take this bit into account)
+@item bit 5
+(since Linux 2.6.28)
+Dump private huge pages.
+@item bit 6
+(since Linux 2.6.28)
+Dump shared huge pages.
+@end table
+
+For example, supposing that the @code{pid} of the program being
+debugged is @code{1234}, if you wanted to dump everything except the
+anonymous private and the file-backed shared mappings, you would do:
+
+@smallexample
+$ echo 0x76 > /proc/1234/coredump_filter
+@end smallexample
+
+For more documentation about how to use the @file{coredump_filter}
+file, see the manpage of @code{core(5)}.
+
+By default, this option is @code{on}.  If this option is turned
+@code{off}, @value{GDBN} does not read the @file{coredump_filter}
+file, but uses the same default value as the Linux kernel in order
+to decide which pages will be dumped in the core dump file.  This
+value is currently @code{0x33}, which means that bits @code{0}
+(anonymous private mappings), @code{1} (anonymous shared mappings),
+@code{4} (ELF headers) and @code{5} (private huge pages) are active.
+This will cause these memory mappings to be dumped automatically.
 @end table
 
 @node Character Sets
diff --git a/gdb/gcore.c b/gdb/gcore.c
index 44b9d0c..89d8285 100644
--- a/gdb/gcore.c
+++ b/gdb/gcore.c
@@ -415,27 +415,22 @@ make_output_phdrs (bfd *obfd, asection *osec, void *ignored)
 
 static int
 gcore_create_callback (CORE_ADDR vaddr, unsigned long size, int read,
-		       int write, int exec, int modified, void *data)
+		       int write, int exec, enum memory_mapping_state state,
+		       void *data)
 {
   bfd *obfd = data;
   asection *osec;
   flagword flags = SEC_ALLOC | SEC_HAS_CONTENTS | SEC_LOAD;
 
-  /* If the memory segment has no permissions set, ignore it, otherwise
-     when we later try to access it for read/write, we'll get an error
-     or jam the kernel.  */
-  if (read == 0 && write == 0 && exec == 0 && modified == 0)
-    {
-      if (info_verbose)
-        {
-          fprintf_filtered (gdb_stdout, "Ignore segment, %s bytes at %s\n",
-                            plongest (size), paddress (target_gdbarch (), vaddr));
-        }
-
-      return 0;
-    }
-
-  if (write == 0 && modified == 0 && !solib_keep_data_in_core (vaddr, size))
+  /* If the memory segment has no read permission set, or if it has
+     been marked as unmodified, then we have to generate a segment
+     header for it, but without contents (i.e., FileSiz = 0),
+     otherwise when we later try to access it for read/write, we'll
+     get an error or jam the kernel.  */
+  if (read == 0 || state == MEMORY_MAPPING_UNMODIFIED)
+    flags &= ~(SEC_LOAD | SEC_HAS_CONTENTS);
+  else if (write == 0 && state == MEMORY_MAPPING_UNKNOWN_STATE
+	   && !solib_keep_data_in_core (vaddr, size))
     {
       /* See if this region of memory lies inside a known file on disk.
 	 If so, we can avoid copying its contents by clearing SEC_LOAD.  */
@@ -528,7 +523,8 @@ objfile_find_memory_regions (struct target_ops *self,
 			 1, /* All sections will be readable.  */
 			 (flags & SEC_READONLY) == 0, /* Writable.  */
 			 (flags & SEC_CODE) != 0, /* Executable.  */
-			 1, /* MODIFIED is unknown, pass it as true.  */
+			 MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is
+							 unknown.  */
 			 obfd);
 	  if (ret != 0)
 	    return ret;
@@ -541,7 +537,7 @@ objfile_find_memory_regions (struct target_ops *self,
 	     1, /* Stack section will be readable.  */
 	     1, /* Stack section will be writable.  */
 	     0, /* Stack section will not be executable.  */
-	     1, /* Stack section will be modified.  */
+	     MEMORY_MAPPING_MODIFIED, /* Stack section will be modified.  */
 	     obfd);
 
   /* Make a heap segment.  */
@@ -550,7 +546,7 @@ objfile_find_memory_regions (struct target_ops *self,
 	     1, /* Heap section will be readable.  */
 	     1, /* Heap section will be writable.  */
 	     0, /* Heap section will not be executable.  */
-	     1, /* Heap section will be modified.  */
+	     MEMORY_MAPPING_MODIFIED, /* Heap section will be modified.  */
 	     obfd);
 
   return 0;
diff --git a/gdb/gnu-nat.c b/gdb/gnu-nat.c
index d830773..60612a7 100644
--- a/gdb/gnu-nat.c
+++ b/gdb/gnu-nat.c
@@ -2611,7 +2611,7 @@ gnu_find_memory_regions (struct target_ops *self,
 		     last_protection & VM_PROT_READ,
 		     last_protection & VM_PROT_WRITE,
 		     last_protection & VM_PROT_EXECUTE,
-		     1, /* MODIFIED is unknown, pass it as true.  */
+		     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 		     data);
 	  last_region_address = region_address;
 	  last_region_end = region_address += region_length;
@@ -2625,7 +2625,7 @@ gnu_find_memory_regions (struct target_ops *self,
 	     last_protection & VM_PROT_READ,
 	     last_protection & VM_PROT_WRITE,
 	     last_protection & VM_PROT_EXECUTE,
-	     1, /* MODIFIED is unknown, pass it as true.  */
+	     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 	     data);
 
   return 0;
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index ea0d4cd..ae11a2e 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -35,9 +35,58 @@
 #include "observer.h"
 #include "objfiles.h"
 #include "infcall.h"
+#include "gdbcmd.h"
+#include "gdb_regex.h"
 
 #include <ctype.h>
 
+/* This enum represents the values that the user can choose when
+   informing the Linux kernel about which memory mappings will be
+   dumped in a corefile.  They are described in the file
+   Documentation/filesystems/proc.txt, inside the Linux kernel
+   tree.  */
+
+enum
+  {
+    COREFILTER_ANON_PRIVATE = 1 << 0,
+    COREFILTER_ANON_SHARED = 1 << 1,
+    COREFILTER_MAPPED_PRIVATE = 1 << 2,
+    COREFILTER_MAPPED_SHARED = 1 << 3,
+    COREFILTER_ELF_HEADERS = 1 << 4,
+    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
+    COREFILTER_HUGETLB_SHARED = 1 << 6,
+  };
+
+struct smaps_vmflags
+  {
+    /* Zero if this structure has not been initialized yet.  It
+       probably means that the Linux kernel being used does not emit
+       the "VmFlags:" field on "/proc/PID/smaps".  */
+
+    unsigned int initialized_p : 1;
+
+    /* Memory mapped I/O area (VM_IO, "io").  */
+
+    unsigned int io_page : 1;
+
+    /* Area uses huge TLB pages (VM_HUGETLB, "ht").  */
+
+    unsigned int uses_huge_tlb : 1;
+
+    /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd").  */
+
+    unsigned int exclude_coredump : 1;
+
+    /* Is this a MAP_SHARED mapping (VM_SHARED, "sh").  */
+
+    unsigned int shared_mapping : 1;
+  };
+
+/* Whether to take the /proc/PID/coredump_filter into account when
+   generating a corefile.  */
+
+static int use_coredump_filter = 1;
+
 /* This enum represents the signals' numbers on a generic architecture
    running the Linux kernel.  The definition of "generic" comes from
    the file <include/uapi/asm-generic/signal.h>, from the Linux kernel
@@ -381,6 +430,159 @@ read_mapping (const char *line,
   *filename = p;
 }
 
+/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
+
+   This function was based on the documentation found on
+   <Documentation/filesystems/proc.txt>, on the Linux kernel.
+
+   Linux kernels before commit
+   834f82e2aa9a8ede94b17b656329f850c1471514 do not have this field on
+   smaps.  */
+
+static void
+decode_vmflags (char *p, struct smaps_vmflags *v)
+{
+  char *saveptr;
+  char *s;
+
+  v->initialized_p = 1;
+  p = skip_to_space (p);
+  p = skip_spaces (p);
+
+  for (s = strtok_r (p, " ", &saveptr);
+       s != NULL;
+       s = strtok_r (NULL, " ", &saveptr))
+    {
+      if (strcmp (s, "io") == 0)
+	v->io_page = 1;
+      else if (strcmp (s, "ht") == 0)
+	v->uses_huge_tlb = 1;
+      else if (strcmp (s, "dd") == 0)
+	v->exclude_coredump = 1;
+      else if (strcmp (s, "sh") == 0)
+	v->shared_mapping = 1;
+    }
+}
+
+/* Return 1 if the memory mapping is anonymous, 0 otherwise.
+
+   FILENAME is the name of the file present in the first line of the
+   memory mapping, in the "/proc/PID/smaps" output.  For example, if
+   the first line is:
+
+   7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770   /path/to/file
+
+   Then FILENAME will be "/path/to/file".  */
+
+static int
+mapping_is_anonymous_p (const char *filename)
+{
+  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
+  static int init_regex_p = 0;
+
+  if (!init_regex_p)
+    {
+      struct cleanup *c = make_cleanup (null_cleanup, NULL);
+
+      init_regex_p = 1;
+      compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match /dev/zero "
+			     "filename"));
+      compile_rx_or_error (&shmem_file_regex,
+			   "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$",
+			   _("Could not compile regex to match shmem "
+			     "filenames"));
+      /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the
+	 Linux kernel's 'i_nlink == 0' check, which is responsible for
+	 deciding if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS'
+	 mapping.  In other words, if FILE_DELETED_REGEX matches, it
+	 does not necessarily mean that we are dealing with an
+	 anonymous shared mapping.  However, there is no easy way to
+	 detect this currently, so this is the best approximation we
+	 have.
+
+	 As a result, GDB will dump readonly pages of deleted
+	 executables when using the default value of coredump_filter
+	 (0x33), while the Linux kernel will not dump those pages.
+	 But we can live with that.  */
+      compile_rx_or_error (&file_deleted_regex, " (deleted)$",
+			   _("Could not compile regex to match "
+			     "'<file> (deleted)'"));
+      /* We will never release these regexes, so just discard the
+	 cleanups.  */
+      discard_cleanups (c);
+    }
+
+  if (*filename == '\0'
+      || regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
+      || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
+      || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0)
+    return 1;
+
+  return 0;
+}
+
+/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
+   MAYBE_PRIVATE_P, MAPPING_ANON_P, and FILENAME) should not be dumped,
+   or greater than 0 if it should.  */
+
+static int
+dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
+		int maybe_private_p, int mapping_anon_p, const char *filename)
+{
+  /* Initially, we trust in what we received from outside.  This value
+     may not be very precise (i.e., it was probably gathered from the
+     permission line in the /proc/PID/smaps list, which actually
+     refers to VM_MAYSHARE, and not VM_SHARED), but it is what we have
+     for now.  */
+  int private_p = maybe_private_p;
+
+  /* We always dump vDSO and vsyscall mappings.  */
+  if (strcmp ("[vdso]", filename) == 0
+      || strcmp ("[vsyscall]", filename) == 0)
+    return 1;
+
+  if (v->initialized_p)
+    {
+      /* We never dump I/O mappings.  */
+      if (v->io_page)
+	return 0;
+
+      /* Check if we should exclude this mapping.  */
+      if (v->exclude_coredump)
+	return 0;
+
+      /* Updating our notion of whether this mapping is shared or
+	 private based on a trustworthy value.  */
+      private_p = !v->shared_mapping;
+
+      /* HugeTLB checking.  */
+      if (v->uses_huge_tlb)
+	{
+	  if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE))
+	      || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED)))
+	    return 1;
+
+	  return 0;
+	}
+    }
+
+  if (private_p)
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_PRIVATE) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_PRIVATE) != 0;
+    }
+  else
+    {
+      if (mapping_anon_p)
+	return (filterflags & COREFILTER_ANON_SHARED) != 0;
+      else
+	return (filterflags & COREFILTER_MAPPED_SHARED) != 0;
+    }
+}
+
 /* Implement the "info proc" command.  */
 
 static void
@@ -807,7 +1009,8 @@ linux_core_info_proc (struct gdbarch *gdbarch, const char *args,
 typedef int linux_find_memory_region_ftype (ULONGEST vaddr, ULONGEST size,
 					    ULONGEST offset, ULONGEST inode,
 					    int read, int write,
-					    int exec, int modified,
+					    int exec,
+					    enum memory_mapping_state state,
 					    const char *filename,
 					    void *data);
 
@@ -819,48 +1022,84 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 				void *obfd)
 {
   char mapsfilename[100];
-  char *data;
+  char coredumpfilter_name[100];
+  char *data, *coredumpfilterdata;
+  pid_t pid;
+  /* Default dump behavior of coredump_filter (0x33), according to
+     Documentation/filesystems/proc.txt from the Linux kernel
+     tree.  */
+  unsigned int filterflags = (COREFILTER_ANON_PRIVATE
+			      | COREFILTER_ANON_SHARED
+			      | COREFILTER_ELF_HEADERS
+			      | COREFILTER_HUGETLB_PRIVATE);
 
   /* We need to know the real target PID to access /proc.  */
   if (current_inferior ()->fake_pid_p)
     return 1;
 
-  xsnprintf (mapsfilename, sizeof mapsfilename,
-	     "/proc/%d/smaps", current_inferior ()->pid);
+  pid = current_inferior ()->pid;
+
+  if (use_coredump_filter)
+    {
+      xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name),
+		 "/proc/%d/coredump_filter", pid);
+      coredumpfilterdata = target_fileio_read_stralloc (coredumpfilter_name);
+      if (coredumpfilterdata != NULL)
+	{
+	  sscanf (coredumpfilterdata, "%x", &filterflags);
+	  xfree (coredumpfilterdata);
+	}
+    }
+
+  xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid);
   data = target_fileio_read_stralloc (mapsfilename);
   if (data == NULL)
     {
       /* Older Linux kernels did not support /proc/PID/smaps.  */
-      xsnprintf (mapsfilename, sizeof mapsfilename,
-		 "/proc/%d/maps", current_inferior ()->pid);
+      xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid);
       data = target_fileio_read_stralloc (mapsfilename);
     }
-  if (data)
+
+  if (data != NULL)
     {
       struct cleanup *cleanup = make_cleanup (xfree, data);
-      char *line;
+      char *line, *t;
 
-      line = strtok (data, "\n");
-      while (line)
+      line = strtok_r (data, "\n", &t);
+      while (line != NULL)
 	{
 	  ULONGEST addr, endaddr, offset, inode;
 	  const char *permissions, *device, *filename;
+	  struct smaps_vmflags v;
 	  size_t permissions_len, device_len;
-	  int read, write, exec;
-	  int modified = 0, has_anonymous = 0;
+	  int read, write, exec, private;
+	  enum memory_mapping_state state;
+	  int has_anonymous = 0;
+	  int mapping_anon_p;
 
+	  memset (&v, 0, sizeof (v));
 	  read_mapping (line, &addr, &endaddr, &permissions, &permissions_len,
 			&offset, &device, &device_len, &inode, &filename);
+	  mapping_anon_p = mapping_is_anonymous_p (filename);
 
 	  /* Decode permissions.  */
 	  read = (memchr (permissions, 'r', permissions_len) != 0);
 	  write = (memchr (permissions, 'w', permissions_len) != 0);
 	  exec = (memchr (permissions, 'x', permissions_len) != 0);
+	  /* 'private' here actually means VM_MAYSHARE, and not
+	     VM_SHARED.  In order to know if a mapping is really
+	     private or not, we must check the flag "sh" in the
+	     VmFlags field.  This is done by decode_vmflags.  However,
+	     if we are using an old Linux kernel, we will not have the
+	     VmFlags there.  In this case, there is really no way to
+	     know if we are dealing with VM_SHARED, so we just assume
+	     that VM_MAYSHARE is enough.  */
+	  private = memchr (permissions, 'p', permissions_len) != 0;
 
 	  /* Try to detect if region was modified by parsing smaps counters.  */
-	  for (line = strtok (NULL, "\n");
-	       line && line[0] >= 'A' && line[0] <= 'Z';
-	       line = strtok (NULL, "\n"))
+	  for (line = strtok_r (NULL, "\n", &t);
+	       line != NULL && line[0] >= 'A' && line[0] <= 'Z';
+	       line = strtok_r (NULL, "\n", &t))
 	    {
 	      char keyword[64 + 1];
 
@@ -869,11 +1108,17 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 		  warning (_("Error parsing {s,}maps file '%s'"), mapsfilename);
 		  break;
 		}
+
 	      if (strcmp (keyword, "Anonymous:") == 0)
-		has_anonymous = 1;
-	      if (strcmp (keyword, "Shared_Dirty:") == 0
-		  || strcmp (keyword, "Private_Dirty:") == 0
-		  || strcmp (keyword, "Swap:") == 0
+		{
+		  /* Older Linux kernels did not support the
+		     "Anonymous:" counter.  Check it here.  */
+		  has_anonymous = 1;
+		}
+	      else if (strcmp (keyword, "VmFlags:") == 0)
+		decode_vmflags (line, &v);
+
+	      if (strcmp (keyword, "AnonHugePages:") == 0
 		  || strcmp (keyword, "Anonymous:") == 0)
 		{
 		  unsigned long number;
@@ -884,19 +1129,43 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
 			       mapsfilename);
 		      break;
 		    }
-		  if (number != 0)
-		    modified = 1;
+		  if (number > 0)
+		    {
+		      /* Even if we are dealing with a file-backed
+			 mapping, if it contains anonymous pages we
+			 consider it to be an anonymous mapping,
+			 because this is what the Linux kernel does:
+
+			 // Dump segments that have been written to.
+			 if (vma->anon_vma && FILTER(ANON_PRIVATE))
+			 	goto whole;
+		      */
+		      mapping_anon_p = 1;
+		    }
 		}
 	    }
 
-	  /* Older Linux kernels did not support the "Anonymous:" counter.
-	     If it is missing, we can't be sure - dump all the pages.  */
-	  if (!has_anonymous)
-	    modified = 1;
+	  /* If a mapping should not be dumped we still should create
+	     a segment for it, just without SEC_LOAD (see
+	     gcore_create_callback).  */
+	  if (has_anonymous)
+	    {
+	      if (dump_mapping_p (filterflags, &v, private, mapping_anon_p,
+				  filename))
+		state = MEMORY_MAPPING_MODIFIED;
+	      else
+		state = MEMORY_MAPPING_UNMODIFIED;
+	    }
+	  else
+	    {
+	      /* Older Linux kernels did not support the "Anonymous:" counter.
+		 If it is missing, we can't be sure - dump all the pages.  */
+	      state = MEMORY_MAPPING_UNKNOWN_STATE;
+	    }
 
 	  /* Invoke the callback function to create the corefile segment.  */
 	  func (addr, endaddr - addr, offset, inode,
-		read, write, exec, modified, filename, obfd);
+		read, write, exec, state, filename, obfd);
 	}
 
       do_cleanups (cleanup);
@@ -926,12 +1195,13 @@ struct linux_find_memory_regions_data
 static int
 linux_find_memory_regions_thunk (ULONGEST vaddr, ULONGEST size,
 				 ULONGEST offset, ULONGEST inode,
-				 int read, int write, int exec, int modified,
+				 int read, int write, int exec,
+				 enum memory_mapping_state state,
 				 const char *filename, void *arg)
 {
   struct linux_find_memory_regions_data *data = arg;
 
-  return data->func (vaddr, size, read, write, exec, modified, data->obfd);
+  return data->func (vaddr, size, read, write, exec, state, data->obfd);
 }
 
 /* A variant of linux_find_memory_regions_full that is suitable as the
@@ -1074,7 +1344,8 @@ static linux_find_memory_region_ftype linux_make_mappings_callback;
 static int
 linux_make_mappings_callback (ULONGEST vaddr, ULONGEST size,
 			      ULONGEST offset, ULONGEST inode,
-			      int read, int write, int exec, int modified,
+			      int read, int write, int exec,
+			      enum memory_mapping_state state,
 			      const char *filename, void *data)
 {
   struct linux_make_mappings_data *map_data = data;
@@ -1872,7 +2143,8 @@ linux_gdb_signal_to_target (struct gdbarch *gdbarch,
 
 static int
 find_mapping_size (CORE_ADDR vaddr, unsigned long size,
-		   int read, int write, int exec, int modified,
+		   int read, int write, int exec,
+		   enum memory_mapping_state state,
 		   void *data)
 {
   struct mem_range *range = data;
@@ -1972,6 +2244,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot)
   return retval;
 }
 
+/* Display whether the gcore command is using the
+   /proc/PID/coredump_filter file.  */
+
+static void
+show_use_coredump_filter (struct ui_file *file, int from_tty,
+			  struct cmd_list_element *c, const char *value)
+{
+  fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to generate"
+			    " corefiles is %s.\n"), value);
+}
+
 /* To be called from the various GDB_OSABI_LINUX handlers for the
    various GNU/Linux architectures and machine types.  */
 
@@ -2008,4 +2291,16 @@ _initialize_linux_tdep (void)
   /* Observers used to invalidate the cache when needed.  */
   observer_attach_inferior_exit (invalidate_linux_cache_inf);
   observer_attach_inferior_appeared (invalidate_linux_cache_inf);
+
+  add_setshow_boolean_cmd ("use-coredump-filter", class_files,
+			   &use_coredump_filter, _("\
+Set whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Show whether gcore should consider /proc/PID/coredump_filter."),
+			   _("\
+Use this command to set whether gcore should consider the contents\n\
+of /proc/PID/coredump_filter when generating the corefile.  For more information\n\
+about this file, refer to the manpage of core(5)."),
+			   NULL, show_use_coredump_filter,
+			   &setlist, &showlist);
 }
diff --git a/gdb/procfs.c b/gdb/procfs.c
index b62539f..d074dd3 100644
--- a/gdb/procfs.c
+++ b/gdb/procfs.c
@@ -4967,7 +4967,7 @@ find_memory_regions_callback (struct prmap *map,
 		  (map->pr_mflags & MA_READ) != 0,
 		  (map->pr_mflags & MA_WRITE) != 0,
 		  (map->pr_mflags & MA_EXEC) != 0,
-		  1, /* MODIFIED is unknown, pass it as true.  */
+		  MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
 		  data);
 }
 
diff --git a/gdb/testsuite/gdb.base/coredump-filter.c b/gdb/testsuite/gdb.base/coredump-filter.c
new file mode 100644
index 0000000..192c469
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.c
@@ -0,0 +1,61 @@
+/* Copyright 2015 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <assert.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+
+static void *
+do_mmap (void *addr, size_t size, int prot, int flags, int fd, off_t offset)
+{
+  void *ret = mmap (addr, size, prot, flags, fd, offset);
+
+  assert (ret != MAP_FAILED);
+  return ret;
+}
+
+int
+main (int argc, char *argv[])
+{
+  const size_t size = 10;
+  const int default_prot = PROT_READ | PROT_WRITE;
+  char *private_anon, *shared_anon;
+  char *dont_dump;
+  int i;
+
+  private_anon = do_mmap (NULL, size, default_prot,
+			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (private_anon, 0x11, size);
+
+  shared_anon = do_mmap (NULL, size, default_prot,
+			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+  memset (shared_anon, 0x22, size);
+
+  dont_dump = do_mmap (NULL, size, default_prot,
+		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  memset (dont_dump, 0x55, size);
+  i = madvise (dont_dump, size, MADV_DONTDUMP);
+  assert_perror (errno);
+  assert (i == 0);
+
+  return 0; /* break-here */
+}
diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
new file mode 100644
index 0000000..c7ae91d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/coredump-filter.exp
@@ -0,0 +1,129 @@
+# Copyright 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile debug] } {
+    untested $testfile.exp
+    return -1
+}
+
+if { ![runto_main] } {
+    untested $testfile.exp
+    return -1
+}
+
+gdb_breakpoint [gdb_get_line_number "break-here"]
+gdb_continue_to_breakpoint "break-here" ".* break-here .*"
+
+proc do_save_core { filter_flag core ipid } {
+    verbose -log "writing $filter_flag to /proc/$ipid/coredump_filter"
+    if { [catch {open /proc/$ipid/coredump_filter w} fileid] } {
+	untested "cannot write to /proc/$ipid/coredump_filter"
+	return -1
+    }
+
+    # Set coredump_filter to the value we want
+    puts $fileid $filter_flag
+    close $fileid
+
+    # Generate a corefile
+    gdb_gcore_cmd "$core" "save corefile $core"
+}
+
+proc do_load_and_test_core { core var working_var working_value } {
+    global hex decimal addr
+
+    set core_loaded [gdb_core_cmd "$core" "load $core"]
+    if { $core_loaded == -1 } {
+	fail "loading $core"
+	return
+    }
+
+    # Use 'int' as any variants of 'char' try to read the target bytes.
+    gdb_test "print *(unsigned int *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
+	"printing $var when core is loaded (should not work)"
+    gdb_test "print/x *(unsigned int *) $addr($working_var)" " = $working_value.*" \
+	"print/x *$working_var ( = $working_value)"
+}
+
+set non_private_anon_core [standard_output_file non-private-anon.gcore]
+set non_shared_anon_core [standard_output_file non-shared-anon.gcore]
+set dont_dump_core [standard_output_file dont-dump.gcore]
+
+# We will generate a few corefiles
+#
+# This list is composed of sub-lists, and their elements are (in
+# order):
+#
+# - name of the test
+# - hexadecimal value to be put in the /proc/PID/coredump_filter file
+# - name of the variable that contains the name of the corefile to be
+#   generated (including the initial $).
+# - name of the variable in the C source code that points to the
+#   memory mapping that will NOT be present in the corefile.
+# - name of a variable in the C source code that points to a memory
+#   mapping that WILL be present in the corefile
+# - corresponding value expected for the above variable
+
+set all_corefiles { { "non-Private-Anonymous" "0x7e" \
+			  $non_private_anon_core \
+			  "private_anon" \
+			  "shared_anon" "0x22" }
+    { "non-Shared-Anonymous" "0x7d" \
+	  $non_shared_anon_core "shared_anon" \
+	  "private_anon" "0x11" }
+    { "DoNotDump" "0x33" \
+	  $dont_dump_core "dont_dump" \
+	  "shared_anon" "0x22" } }
+
+set core_supported [gdb_gcore_cmd "$non_private_anon_core" "save a corefile"]
+if { !$core_supported } {
+    untested $testfile.exp
+    return -1
+}
+
+# Getting the inferior's PID
+gdb_test_multiple "info inferiors" "getting inferior pid" {
+    -re "process \($decimal\).*\r\n$gdb_prompt $" {
+	set infpid $expect_out(1,string)
+    }
+}
+
+foreach item $all_corefiles {
+    foreach name [list [lindex $item 3] [lindex $item 4]] {
+	set test "print/x $name"
+	gdb_test_multiple $test $test {
+	    -re " = \($hex\)\r\n$gdb_prompt $" {
+		set addr($name) $expect_out(1,string)
+	    }
+	}
+    }
+}
+
+foreach item $all_corefiles {
+    with_test_prefix "saving corefile for [lindex $item 0]" {
+	do_save_core [lindex $item 1] [subst [lindex $item 2]] $infpid
+    }
+}
+
+clean_restart $testfile
+
+foreach item $all_corefiles {
+    with_test_prefix "loading and testing corefile for [lindex $item 0]" {
+	do_load_and_test_core [subst [lindex $item 2]] [lindex $item 3] \
+	    [lindex $item 4] [lindex $item 5]
+    }
+}

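For readers following the patch, the way the coredump_filter bit mask
selects mappings can be sketched in a few lines of standalone C.  This
is a simplified illustration under stated assumptions, not the patch's
dump_mapping_p: it ignores VmFlags ("io", "dd") and the vDSO special
case, and the names CF_*, parse_coredump_filter and filter_dumps are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Bit positions as listed in core(5).  The CF_* names are made up for
   this sketch.  */
enum
  {
    CF_ANON_PRIVATE = 1 << 0,
    CF_ANON_SHARED = 1 << 1,
    CF_MAPPED_PRIVATE = 1 << 2,
    CF_MAPPED_SHARED = 1 << 3,
    CF_ELF_HEADERS = 1 << 4,
    CF_HUGETLB_PRIVATE = 1 << 5,
    CF_HUGETLB_SHARED = 1 << 6,
  };

/* Parse TEXT, the contents of /proc/PID/coredump_filter (a hexadecimal
   number), into *MASK.  Return 1 on success, 0 on a parse failure.  */
static int
parse_coredump_filter (const char *text, unsigned int *mask)
{
  assert (text != NULL && mask != NULL);
  return sscanf (text, "%x", mask) == 1;
}

/* Return 1 if a mapping with the given properties is selected for
   dumping by MASK, 0 otherwise.  Huge-page mappings are decided by
   bits 5 and 6 only; everything else by bits 0 to 3.  */
static int
filter_dumps (unsigned int mask, int private_p, int anon_p, int hugetlb_p)
{
  if (hugetlb_p)
    return (mask & (private_p ? CF_HUGETLB_PRIVATE : CF_HUGETLB_SHARED)) != 0;
  if (private_p)
    return (mask & (anon_p ? CF_ANON_PRIVATE : CF_MAPPED_PRIVATE)) != 0;
  return (mask & (anon_p ? CF_ANON_SHARED : CF_MAPPED_SHARED)) != 0;
}
```

With the default mask 0x33, anonymous private and anonymous shared
mappings and private huge pages are selected, while file-backed
mappings are not.  With 0x76, the value used in the documentation
example, anonymous private and file-backed shared mappings are the ones
left out.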

* Re: vvar, gup && coredump
  2015-03-12 18:04                     ` Oleg Nesterov
@ 2015-03-13  4:50                       ` Sergio Durigan Junior
  2015-03-13 15:06                         ` Oleg Nesterov
  0 siblings, 1 reply; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-13  4:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On Thursday, March 12 2015, Oleg Nesterov wrote:

>> If I understood this discussion correctly (and thanks Andy and Oleg for,
>> *ahem*, dumping all this useful information for us!), GDB will not need
>> modifications in the Linux kernel in this area.  In fact, my patch
>> already implements the "ignore VM_DONTDUMP mappings" part, so we're
>> pretty much covered.
>
> OK, thanks.
>
> And it seems that we all agree that the kernel should not dump this vma
> too. Could you confirm that this is fine from gdb pov just in case?

Yes, this is what we expect from the GDB side.  This mapping is marked
as "dd", so it does not make sense to dump it.

While I have you guys, would it be possible for the Linux kernel to
include a new flag on VmFlags to uniquely identify an anonymous mapping?
Currently, there is no easy way to do that from userspace.  My patch
implements the following heuristic on GDB:

  if (pathname == "/dev/zero (deleted)"
      || pathname == "/SYSV%08x (deleted)"
      || pathname == "<file> (deleted)"
      || "Anonymous:" field is > 0 kB
      || "AnonHugePages:" field is > 0 kB)
    mapping is anonymous;

However, this can be fragile.  The Linux kernel checks for i_nlink == 0,
but there is no easy way for GDB to check this from userspace (as Jan
mentioned, one could look at /proc/PID/map_files/, but its entries are
only accessible by root).  That's why I think it would be good to provide
this info right away in /proc/PID/smaps...
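
To make the heuristic concrete, here is a minimal standalone sketch of it
(this is not the code from the patch; the function name and its parameters
are made up for illustration).  It assumes the caller has already parsed
the pathname and the "Anonymous:" and "AnonHugePages:" counters out of
/proc/PID/smaps:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of GDB.  Return 1 if a mapping should
   be treated as anonymous according to the heuristic above, 0
   otherwise.  ANONYMOUS_KB and ANON_HUGE_PAGES_KB are the values of
   the "Anonymous:" and "AnonHugePages:" fields, in kB.  */

int
mapping_looks_anonymous_sketch (const char *pathname,
                                unsigned long anonymous_kb,
                                unsigned long anon_huge_pages_kb)
{
  static const char suffix[] = " (deleted)";
  const size_t suffix_len = sizeof (suffix) - 1;
  size_t len;

  assert (pathname != NULL);

  /* Any anonymous pages at all make the mapping count as anonymous.  */
  if (anonymous_kb > 0 || anon_huge_pages_kb > 0)
    return 1;

  /* All three pathname cases listed above end in " (deleted)", so a
     plain suffix check covers them.  */
  len = strlen (pathname);
  return (len >= suffix_len
          && strcmp (pathname + len - suffix_len, suffix) == 0);
}
```

Note that any unlinked file, e.g. "/tmp/foo (deleted)", is classified the
same way as "/SYSV00000000 (deleted)", which is where the fragility comes
from.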

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


* Re: vvar, gup && coredump
  2015-03-13  4:50                       ` Sergio Durigan Junior
@ 2015-03-13 15:06                         ` Oleg Nesterov
  0 siblings, 0 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-13 15:06 UTC (permalink / raw)
  To: Sergio Durigan Junior
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On 03/13, Sergio Durigan Junior wrote:
>
> On Thursday, March 12 2015, Oleg Nesterov wrote:
>
> > And it seems that we all agree that the kernel should not dump this vma
> > too. Could you confirm that this is fine from gdb pov just in case?
>
> Yes, this is what we expect from the GDB side.  This mapping is marked
> as "dd", so it does not make sense to dump it.

OK.

> While I have you guys, would it be possible for the Linux kernel to
> include a new flag on VmFlags to uniquely identify an anonymous mapping?

Note that "anonymous" is not the right term here... I mean it is a bit
confusing. Let's discuss this again on debug-list, then we will see if
gdb needs more info from kernel.

> Currently, there is no easy way to do that from userspace.  My patch
> implements the following heuristic on GDB:
>
>   if (pathname == "/dev/zero (deleted)"
>       || pathname == "/SYSV%08x (deleted)"
>       || pathname == "<file> (deleted)"

And this one, for example, is not an anonymous mapping. But,

>     mapping is anonymous;

I agree, gdb should treat it as anonymous.

> However, this can be fragile.  The Linux kernel checks for i_nlink == 0,

Yes, as we already discussed, I think the kernel should be changed.

It should do something like shmem_mapping() || d_unlinked(), I think.
But this needs another discussion on lkml, and in another thread.

Oleg.


* Re: [PATCH v2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-12 21:39 ` [PATCH v2] " Sergio Durigan Junior
@ 2015-03-13 19:34   ` Pedro Alves
  2015-03-16 23:53     ` Sergio Durigan Junior
  2015-03-14  9:40   ` Eli Zaretskii
  1 sibling, 1 reply; 46+ messages in thread
From: Pedro Alves @ 2015-03-13 19:34 UTC (permalink / raw)
  To: Sergio Durigan Junior, GDB Patches; +Cc: Jan Kratochvil, Oleg Nesterov

On 03/12/2015 09:39 PM, Sergio Durigan Junior wrote:
> gdb/ChangeLog:
> 2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>
> 	    Jan Kratochvil  <jan.kratochvil@redhat.com>
> 	    Oleg Nesterov  <oleg@redhat.com>
> 
> 	PR corefiles/16092
> 	* common/common-defs.h (enum memory_mapping_state): New enum.
> 	* defs.h (find_memory_region_ftype): Remove 'int modified'
> 	parameter, replacing by 'enum memory_mapping_state state'.
> 	* gcore.c (gcore_create_callback): Likewise.  Change 'if/else'
> 	statements and improve the logic of deciding when to ignore a
> 	memory mapping.
> 	(objfile_find_memory_regions): Passing
> 	'MEMORY_MAPPING_UNKNOWN_STATE' or 'MEMORY_MAPPING_MODIFIED' when
> 	needed to 'func' callback, instead of saying the memory mapping
> 	was modified even without knowing it.
> 	* gnu-nat.c (gnu_find_memory_regions): Likewise.
> 	* linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
> 	New enum identifying the various options of the coredump_filter
> 	file.
> 	(struct smaps_vmflags): New struct.
> 	(use_coredump_filter): New variable.
> 	(decode_vmflags): New function.
> 	(mapping_is_anonymous_p): Likewise.
> 	(dump_mapping_p): Likewise.
> 	(linux_find_memory_region_ftype): Remove 'int modified' parameter,
> 	replacing by 'enum memory_mapping_state state'.
> 	(linux_find_memory_regions_full): New variables
> 	'coredumpfilter_name', 'coredumpfilterdata', 'pid',
> 	'filterflags'.  Read /proc/<PID>/smaps file; improve parsing of
> 	its information.  Implement memory mapping filtering based on its
> 	contents.
> 	(linux_find_memory_regions_thunk): Remove 'int modified'
> 	parameter, replacing by 'enum memory_mapping_state state'.
> 	(linux_make_mappings_callback): Likewise.
> 	(find_mapping_size): Likewise.
> 	(show_use_coredump_filter): New function.
> 	(_initialize_linux_tdep): New command 'set use-coredump-filter'.
> 	* procfs.c (find_memory_regions_callback): Passing
> 	'MEMORY_MAPPING_UNKNOWN_STATE' when needed to 'func' callback,
> 	instead of saying the memory mapping was modified even without
> 	knowing it.
> 
> gdb/doc/ChangeLog:
> 2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>
> 
> 	PR corefiles/16092
> 	* gdb.texinfo (gcore): Mention new command 'set
> 	use-coredump-filter'.
> 	(set use-coredump-filter): Document new command.
> 
> gdb/testsuite/ChangeLog:
> 2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>
> 
> 	PR corefiles/16092
> 	* gdb.base/coredump-filter.c: New file.
> 	* gdb.base/coredump-filter.exp: Likewise.
> 
> 
> diff --git a/gdb/common/common-defs.h b/gdb/common/common-defs.h
> index 62d9de5..01b05f5 100644
> --- a/gdb/common/common-defs.h
> +++ b/gdb/common/common-defs.h
> @@ -60,4 +60,14 @@
>  # define EXTERN_C_POP
>  #endif
>  
> +/* Enum used to inform the state of a memory mapping.  This is used in
> +   functions implementing find_memory_region_ftype.  */

Why isn't this enum defined next to find_memory_region_ftype?

> +
> +enum memory_mapping_state
> +  {
> +    MEMORY_MAPPING_MODIFIED,
> +    MEMORY_MAPPING_UNMODIFIED,
> +    MEMORY_MAPPING_UNKNOWN_STATE,
> +  };
> +
>  #endif /* COMMON_DEFS_H */
> diff --git a/gdb/defs.h b/gdb/defs.h
> index 72512f6..4829b62 100644
> --- a/gdb/defs.h
> +++ b/gdb/defs.h
> @@ -338,7 +338,8 @@ extern void init_source_path (void);
>  
>  typedef int (*find_memory_region_ftype) (CORE_ADDR addr, unsigned long size,
>  					 int read, int write, int exec,
> -					 int modified, void *data);
> +					 enum memory_mapping_state state,
> +					 void *data);
>  
>  /* * Possible lvalue types.  Like enum language, this should be in
>     value.h, but needs to be here for the same reason.  */
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 9e71642..092bc93 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -10952,6 +10952,67 @@ specified, the file name defaults to @file{core.@var{pid}}, where
>  
>  Note that this command is implemented only for some systems (as of
>  this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
> +
> +On @sc{gnu}/Linux, this command can take into account the value of the
> +file @file{/proc/@var{pid}/coredump_filter} when generating the core
> +dump (@pxref{set use-coredump-filter}).
> +
> +@kindex set use-coredump-filter
> +@anchor{set use-coredump-filter}
> +@item set use-coredump-filter on
> +@itemx set use-coredump-filter off
> +Enable or disable the use of the file
> +@file{/proc/@var{pid}/coredump_filter} when generating core dump
> +files.  This file is used by the Linux kernel to decide what types of
> +memory mappings will be dumped or ignored when generating a core dump
> +file.
> +
> +To make use of this feature, you have to write in the
> +@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
> +which is a bit mask representing the memory mapping types.  If a bit
> +is set in the bit mask, then the memory mappings of the corresponding
> +types will be dumped; otherwise, they will be ignored.  The bits in
> +this bit mask have the following meanings:
> +
> +@table @code
> +@item bit 0
> +Dump anonymous private mappings.
> +@item bit 1
> +Dump anonymous shared mappings.
> +@item bit 2
> +Dump file-backed private mappings.
> +@item bit 3
> +Dump file-backed shared mappings.
> +@item bit 4
> +(since Linux 2.6.24)
> +Dump ELF headers. (@value{GDBN} does not take this bit into account)
> +@item bit 5
> +(since Linux 2.6.28)
> +Dump private huge pages.
> +@item bit 6
> +(since Linux 2.6.28)
> +Dump shared huge pages.
> +@end table
> +
> +For example, supposing that the @code{pid} of the program being
> +debugging is @code{1234}, if you wanted to dump everything except the
> +anonymous private and the file-backed shared mappings, you would do:
> +
> +@smallexample
> +$ echo 0x76 > /proc/1234/coredump_filter
> +@end smallexample
> +
> +For more documentation about how to use the @file{coredump_filter}
> +file, see the manpage of @code{proc(5)}.
> +
> +By default, this option is @code{on}.  If this option is turned
> +@code{off}, @value{GDBN} will not read the @file{coredump_filter}
> +file, but it uses the same default value as the Linux kernel in order

"will not read (...) uses".  I think the grammar isn't correct
that way.

I think the preferred form is to use the present tense in both cases, thus:

  "(...) turned @code{off}, @value{GDBN} does not read the"

and no "it", in "but uses the same".  Suggest "and" instead of "but":

By default, this option is @code{on}.  If this option is turned
@code{off}, @value{GDBN} does not read the @file{coredump_filter}
file and instead uses the same default value as the Linux kernel in order
...

> +to decide which pages will be dumped in the core dump file.  This
> +value currently is @code{0x33}, which means that the bits @code{0}

"is currently", I think.   Also, "which means that bits" would sound
more natural to me, though take it with a grain of salt.
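
As a side note on the value itself: 0x33 is binary 0110011, so bit 5
(private huge pages) is set in addition to bits 0, 1 and 4.  That agrees
with the COREFILTER_HUGETLB_PRIVATE in the default filterflags in the
linux-tdep.c hunk.  A small standalone sketch for decoding a mask (the
names below are hypothetical and not taken from the patch; the
descriptions follow the bit table in the documentation hunk):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names, for illustration only.  One description per
   coredump_filter bit, indexed by bit number.  */
static const char *const corefilter_bit_desc_sketch[] =
  {
    "anonymous private mappings",
    "anonymous shared mappings",
    "file-backed private mappings",
    "file-backed shared mappings",
    "ELF headers",
    "private huge pages",
    "shared huge pages"
  };

enum { COREFILTER_NBITS_SKETCH = 7 };

/* Return 1 if bit BIT is set in MASK, 0 otherwise.  */

int
corefilter_bit_set_sketch (unsigned int mask, int bit)
{
  assert (bit >= 0 && bit < COREFILTER_NBITS_SKETCH);
  return (mask >> bit) & 1;
}

/* Print one line for each bit that is set in MASK.  */

void
corefilter_describe_sketch (unsigned int mask)
{
  int bit;

  for (bit = 0; bit < COREFILTER_NBITS_SKETCH; bit++)
    if (corefilter_bit_set_sketch (mask, bit))
      printf ("bit %d: %s\n", bit, corefilter_bit_desc_sketch[bit]);
}
```

Calling corefilter_describe_sketch (0x76) lists everything except bits 0
and 3, which is the example given in the documentation hunk.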

> +(anonymous private mappings), @code{1} (anonymous shared mappings) and
> +@code{4} (ELF headers) are active.  This will cause these memory
> +mappings to be dumped automatically.
>  @end table
>  
>  @node Character Sets
> diff --git a/gdb/gcore.c b/gdb/gcore.c
> index 44b9d0c..89d8285 100644
> --- a/gdb/gcore.c
> +++ b/gdb/gcore.c
> @@ -415,27 +415,22 @@ make_output_phdrs (bfd *obfd, asection *osec, void *ignored)
>  
>  static int
>  gcore_create_callback (CORE_ADDR vaddr, unsigned long size, int read,
> -		       int write, int exec, int modified, void *data)
> +		       int write, int exec, enum memory_mapping_state state,
> +		       void *data)
>  {
>    bfd *obfd = data;
>    asection *osec;
>    flagword flags = SEC_ALLOC | SEC_HAS_CONTENTS | SEC_LOAD;
>  
> -  /* If the memory segment has no permissions set, ignore it, otherwise
> -     when we later try to access it for read/write, we'll get an error
> -     or jam the kernel.  */
> -  if (read == 0 && write == 0 && exec == 0 && modified == 0)
> -    {
> -      if (info_verbose)
> -        {
> -          fprintf_filtered (gdb_stdout, "Ignore segment, %s bytes at %s\n",
> -                            plongest (size), paddress (target_gdbarch (), vaddr));
> -        }
> -
> -      return 0;
> -    }
> -
> -  if (write == 0 && modified == 0 && !solib_keep_data_in_core (vaddr, size))
> +  /* If the memory segment has no read permission set, or if it has
> +     been marked as unmodified, then we have to generate a segment
> +     header for it, but without contents (i.e., FileSiz = 0),
> +     otherwise when we later try to access it for read/write, we'll
> +     get an error or jam the kernel.  */
> +  if (read == 0 || state == MEMORY_MAPPING_UNMODIFIED)
> +    flags &= ~(SEC_LOAD | SEC_HAS_CONTENTS);

I'm feeling dense and I'm not understanding this change / comment.  :-/
Why didn't we need to do this before, and we need to do it now?

> +  else if (write == 0 && state == MEMORY_MAPPING_UNKNOWN_STATE
> +	   && !solib_keep_data_in_core (vaddr, size))
>      {
>        /* See if this region of memory lies inside a known file on disk.
>  	 If so, we can avoid copying its contents by clearing SEC_LOAD.  */
> @@ -528,7 +523,8 @@ objfile_find_memory_regions (struct target_ops *self,
>  			 1, /* All sections will be readable.  */
>  			 (flags & SEC_READONLY) == 0, /* Writable.  */
>  			 (flags & SEC_CODE) != 0, /* Executable.  */
> -			 1, /* MODIFIED is unknown, pass it as true.  */
> +			 MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is
> +							 unknown.  */
>  			 obfd);
>  	  if (ret != 0)
>  	    return ret;
> @@ -541,7 +537,7 @@ objfile_find_memory_regions (struct target_ops *self,
>  	     1, /* Stack section will be readable.  */
>  	     1, /* Stack section will be writable.  */
>  	     0, /* Stack section will not be executable.  */
> -	     1, /* Stack section will be modified.  */
> +	     MEMORY_MAPPING_MODIFIED, /* Stack section will be modified.  */
>  	     obfd);
>  
>    /* Make a heap segment.  */
> @@ -550,7 +546,7 @@ objfile_find_memory_regions (struct target_ops *self,
>  	     1, /* Heap section will be readable.  */
>  	     1, /* Heap section will be writable.  */
>  	     0, /* Heap section will not be executable.  */
> -	     1, /* Heap section will be modified.  */
> +	     MEMORY_MAPPING_MODIFIED, /* Heap section will be modified.  */
>  	     obfd);
>  
>    return 0;
> diff --git a/gdb/gnu-nat.c b/gdb/gnu-nat.c
> index d830773..60612a7 100644
> --- a/gdb/gnu-nat.c
> +++ b/gdb/gnu-nat.c
> @@ -2611,7 +2611,7 @@ gnu_find_memory_regions (struct target_ops *self,
>  		     last_protection & VM_PROT_READ,
>  		     last_protection & VM_PROT_WRITE,
>  		     last_protection & VM_PROT_EXECUTE,
> -		     1, /* MODIFIED is unknown, pass it as true.  */
> +		     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
>  		     data);
>  	  last_region_address = region_address;
>  	  last_region_end = region_address += region_length;
> @@ -2625,7 +2625,7 @@ gnu_find_memory_regions (struct target_ops *self,
>  	     last_protection & VM_PROT_READ,
>  	     last_protection & VM_PROT_WRITE,
>  	     last_protection & VM_PROT_EXECUTE,
> -	     1, /* MODIFIED is unknown, pass it as true.  */
> +	     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
>  	     data);
>  
>    return 0;
> diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
> index ea0d4cd..ae11a2e 100644
> --- a/gdb/linux-tdep.c
> +++ b/gdb/linux-tdep.c
> @@ -35,9 +35,58 @@
>  #include "observer.h"
>  #include "objfiles.h"
>  #include "infcall.h"
> +#include "gdbcmd.h"
> +#include "gdb_regex.h"
>  
>  #include <ctype.h>
>  
> +/* This enum represents the values that the user can choose when
> +   informing the Linux kernel about which memory mappings will be
> +   dumped in a corefile.  They are described in the file
> +   Documentation/filesystems/proc.txt, inside the Linux kernel
> +   tree.  */
> +
> +enum
> +  {
> +    COREFILTER_ANON_PRIVATE = 1 << 0,
> +    COREFILTER_ANON_SHARED = 1 << 1,
> +    COREFILTER_MAPPED_PRIVATE = 1 << 2,
> +    COREFILTER_MAPPED_SHARED = 1 << 3,
> +    COREFILTER_ELF_HEADERS = 1 << 4,
> +    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
> +    COREFILTER_HUGETLB_SHARED = 1 << 6,
> +  };
> +
> +struct smaps_vmflags

Missing intro comment.

> +  {
> +    /* Zero if this structure has not been initialized yet.  It
> +       probably means that the Linux kernel being used does not emit
> +       the "VmFlags:" field on "/proc/PID/smaps".  */
> +
> +    unsigned int initialized_p : 1;
> +
> +    /* Memory mapped I/O area (VM_IO, "io").  */
> +
> +    unsigned int io_page : 1;
> +
> +    /* Area uses huge TLB pages (VM_HUGETLB, "ht").  */
> +
> +    unsigned int uses_huge_tlb : 1;
> +
> +    /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd").  */
> +
> +    unsigned int exclude_coredump : 1;
> +
> +    /* Is this a MAP_SHARED mapping (VM_SHARED, "sh").  */
> +
> +    unsigned int shared_mapping : 1;
> +  };
> +
> +/* Whether to take the /proc/PID/coredump_filter into account when
> +   generating a corefile.  */
> +
> +static int use_coredump_filter = 1;
> +
>  /* This enum represents the signals' numbers on a generic architecture
>     running the Linux kernel.  The definition of "generic" comes from
>     the file <include/uapi/asm-generic/signal.h>, from the Linux kernel
> @@ -381,6 +430,159 @@ read_mapping (const char *line,
>    *filename = p;
>  }
>  
> +/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
> +
> +   This function was based on the documentation found on
> +   <Documentation/filesystems/proc.txt>, on the Linux kernel.
> +
> +   Linux kernels before commit
> +   834f82e2aa9a8ede94b17b656329f850c1471514 do not have this field on
> +   smaps.  */
> +
> +static void
> +decode_vmflags (char *p, struct smaps_vmflags *v)

Can 'p' be made const ?

> +{
> +  char *saveptr;
> +  char *s;

Likewise.
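
Note that strtok_r writes NULs into the buffer it scans, so 'p' can only
become const if the tokenizing is done some other way.  A rough standalone
sketch of one possibility (the struct and function names are hypothetical,
and it assumes the line looks like "VmFlags: rd wr mr ..."):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct smaps_vmflags.  */
struct smaps_vmflags_sketch
  {
    unsigned int io_page : 1;
    unsigned int uses_huge_tlb : 1;
    unsigned int exclude_coredump : 1;
    unsigned int shared_mapping : 1;
  };

/* Decode the two-letter flags in LINE without modifying it.  */

void
decode_vmflags_sketch (const char *line, struct smaps_vmflags_sketch *v)
{
  const char *p = line;

  assert (line != NULL && v != NULL);
  memset (v, 0, sizeof (*v));

  /* Skip the leading "VmFlags:" keyword.  */
  p += strcspn (p, " ");

  while (*p != '\0')
    {
      size_t len;

      p += strspn (p, " \n");	/* Skip separators.  */
      len = strcspn (p, " \n");	/* Length of the next token.  */

      if (len == 2)
        {
          if (memcmp (p, "io", 2) == 0)
            v->io_page = 1;
          else if (memcmp (p, "ht", 2) == 0)
            v->uses_huge_tlb = 1;
          else if (memcmp (p, "dd", 2) == 0)
            v->exclude_coredump = 1;
          else if (memcmp (p, "sh", 2) == 0)
            v->shared_mapping = 1;
        }

      p += len;
    }
}
```

This trades strtok_r for strspn/strcspn, so both the parameter and the
scanning pointer can be const.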

> +
> +  v->initialized_p = 1;
> +  p = skip_to_space (p);
> +  p = skip_spaces (p);
> +
> +  for (s = strtok_r (p, " ", &saveptr);
> +       s != NULL;
> +       s = strtok_r (NULL, " ", &saveptr))
> +    {
> +      if (strcmp (s, "io") == 0)
> +	v->io_page = 1;
> +      else if (strcmp (s, "ht") == 0)
> +	v->uses_huge_tlb = 1;
> +      else if (strcmp (s, "dd") == 0)
> +	v->exclude_coredump = 1;
> +      else if (strcmp (s, "sh") == 0)
> +	v->shared_mapping = 1;
> +    }
> +}
> +
> +/* Return 1 if the memory mapping is anonymous, 0 otherwise.
> +
> +   FILENAME is the name of the file present in the first line of the
> +   memory mapping, in the "/proc/PID/smaps" output.  For example, if
> +   the first line is:
> +
> +   7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770   /path/to/file
> +
> +   Then FILENAME will be "/path/to/file".  */
> +
> +static int
> +mapping_is_anonymous_p (const char *filename)

As Oleg mentioned, these aren't really anonymous.  Is there a different
term we can use, or improve the comment?

> +{
> +  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
> +  static int init_regex_p = 0;
> +
> +  if (!init_regex_p)
> +    {
> +      struct cleanup *c = make_cleanup (null_cleanup, NULL);
> +
> +      init_regex_p = 1;
> +      compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?$",
> +			   _("Could not compile regex to match /dev/zero "
> +			     "filename"));
> +      compile_rx_or_error (&shmem_file_regex,
> +			   "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$",
> +			   _("Could not compile regex to match shmem "
> +			     "filenames"));

Could you add some comment about what these regexes above are for?

> +      /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the
> +	 Linux kernel's 'n_link == 0' code, which is responsible to
> +	 decide if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS'
> +	 mapping.  In other words, if FILE_DELETED_REGEX matches, it
> +	 does not necessarily mean that we are dealing with an
> +	 anonymous shared mapping.  However, there is no easy way to
> +	 detect this currently, so this is the best approximation we
> +	 have.
> +
> +	 As a result, GDB will dump readonly pages of deleted
> +	 executables when using the default value of coredump_filter
> +	 (0x33), while the Linux kernel will not dump those pages.
> +	 But we can live with that.  */
> +      compile_rx_or_error (&file_deleted_regex, " (deleted)$",
> +			   _("Could not compile regex to match "
> +			     "'<file> (deleted)'"));
> +      /* We will never release these regexes, so just discard the
> +	 cleanups.  */
> +      discard_cleanups (c);
> +    }
> +

Above, if compile_rx_or_error throws, init_regex_p is left set to the
same value as if no error had been thrown.  It seems that the second
time this function is called, we'll then reach here with invalid
compiled regexes:

> +  if (*filename == '\0'
> +      || regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
> +      || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
> +      || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0)
> +    return 1;

I think it'd be safer if the top did:

if (init_regex_p == -1)
  return 1; // assume anonymous ?

if (init_regex_p == 0)
  {
    init_regex_p = -1; /* assume error */

    compile_rx_or_error ();

    init_regex_p = 1; /* success! */
}
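
For reference, the same three-state idea as a standalone sketch using
plain POSIX regcomp (names are hypothetical).  regcomp reports failure
through its return value rather than by throwing like
compile_rx_or_error, so the "assume error first" step is not strictly
needed here, but it is kept to mirror the outline above:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

static regex_t file_deleted_regex_sketch;

/* 0: not compiled yet; 1: compiled fine; -1: compilation failed.  */
static int init_regex_state_sketch = 0;

/* Return 1 if FILENAME ends in " (deleted)", or if the regex could
   not be compiled (in which case we assume anonymous).  */

int
filename_deleted_p_sketch (const char *filename)
{
  assert (filename != NULL);

  if (init_regex_state_sketch == 0)
    {
      init_regex_state_sketch = -1;	/* Assume error.  */

      if (regcomp (&file_deleted_regex_sketch, " (deleted)$",
                   REG_NOSUB) == 0)
        init_regex_state_sketch = 1;	/* Success!  */
    }

  if (init_regex_state_sketch == -1)
    return 1;	/* No usable regex; assume anonymous.  */

  return regexec (&file_deleted_regex_sketch, filename, 0, NULL, 0) == 0;
}
```

The point is only that a failed first attempt is remembered, so later
calls never hand an uninitialized regex_t to regexec.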


> +
> +  return 0;
> +}
> +
> +/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
> +   MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or
> +   greater than 0 if it should.  */
> +
> +static int
> +dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
> +		int maybe_private_p, int mapping_anon_p, const char *filename)
> +{
> +  /* Initially, we trust in what we received from outside.  This value
> +     may not be very precise (i.e., it was probably gathered from the
> +     permission line in the /proc/PID/smaps list, which actually
> +     refers to VM_MAYSHARE, and not VM_SHARED), but it is what we have
> +     for now.  */
> +  int private_p = maybe_private_p;
> +
> +  /* We always dump vDSO and vsyscall mappings.  */

Add comment on why this is special cased?

In the v1 patch intro you said:

 "now also respects the MADV_DONTDUMP flag and does not dump the memory
 mapping marked as so, and won't try to dump "[vsyscall]" or "[vdso]"
 mappings as before (just like the Linux kernel)."

Was that incorrect then?

> +  if (strcmp ("[vdso]", filename) == 0
> +      || strcmp ("[vsyscall]", filename) == 0)
> +    return 1;
> +
> +  if (v->initialized_p)
> +    {
> +      /* We never dump I/O mappings.  */
> +      if (v->io_page)
> +	return 0;
> +
> +      /* Check if we should exclude this mapping.  */
> +      if (v->exclude_coredump)
> +	return 0;
> +
> +      /* Updating our notion of whether this mapping is shared or

s/Updating/Update/

> +	 private based on a trustworthy value.  */
> +      private_p = !v->shared_mapping;
> +
> +      /* HugeTLB checking.  */
> +      if (v->uses_huge_tlb)
> +	{
> +	  if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE))
> +	      || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED)))
> +	    return 1;
> +
> +	  return 0;
> +	}
> +    }
> +
> +  if (private_p)
> +    {
> +      if (mapping_anon_p)
> +	return (filterflags & COREFILTER_ANON_PRIVATE) != 0;
> +      else
> +	return (filterflags & COREFILTER_MAPPED_PRIVATE) != 0;
> +    }
> +  else
> +    {
> +      if (mapping_anon_p)
> +	return (filterflags & COREFILTER_ANON_SHARED) != 0;
> +      else
> +	return (filterflags & COREFILTER_MAPPED_SHARED) != 0;
> +    }
> +}
> +
>  /* Implement the "info proc" command.  */
>  
>  static void
> @@ -807,7 +1009,8 @@ linux_core_info_proc (struct gdbarch *gdbarch, const char *args,
>  typedef int linux_find_memory_region_ftype (ULONGEST vaddr, ULONGEST size,
>  					    ULONGEST offset, ULONGEST inode,
>  					    int read, int write,
> -					    int exec, int modified,
> +					    int exec,
> +					    enum memory_mapping_state state,
>  					    const char *filename,
>  					    void *data);
>  
> @@ -819,48 +1022,84 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
>  				void *obfd)
>  {
>    char mapsfilename[100];
> -  char *data;
> +  char coredumpfilter_name[100];
> +  char *data, *coredumpfilterdata;
> +  pid_t pid;
> +  /* Default dump behavior of coredump_filter (0x33), according to
> +     Documentation/filesystems/proc.txt from the Linux kernel
> +     tree.  */
> +  unsigned int filterflags = (COREFILTER_ANON_PRIVATE
> +			      | COREFILTER_ANON_SHARED
> +			      | COREFILTER_ELF_HEADERS
> +			      | COREFILTER_HUGETLB_PRIVATE);
>  
>    /* We need to know the real target PID to access /proc.  */
>    if (current_inferior ()->fake_pid_p)
>      return 1;
>  
> -  xsnprintf (mapsfilename, sizeof mapsfilename,
> -	     "/proc/%d/smaps", current_inferior ()->pid);
> +  pid = current_inferior ()->pid;
> +
> +  if (use_coredump_filter)
> +    {
> +      xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name),
> +		 "/proc/%d/coredump_filter", pid);
> +      coredumpfilterdata = target_fileio_read_stralloc (coredumpfilter_name);
> +      if (coredumpfilterdata != NULL)
> +	{
> +	  sscanf (coredumpfilterdata, "%x", &filterflags);
> +	  xfree (coredumpfilterdata);
> +	}
> +    }
> +
> +  xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid);
>    data = target_fileio_read_stralloc (mapsfilename);
>    if (data == NULL)
>      {
>        /* Older Linux kernels did not support /proc/PID/smaps.  */
> -      xsnprintf (mapsfilename, sizeof mapsfilename,
> -		 "/proc/%d/maps", current_inferior ()->pid);
> +      xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid);
>        data = target_fileio_read_stralloc (mapsfilename);
>      }
> -  if (data)
> +
> +  if (data != NULL)
>      {
>        struct cleanup *cleanup = make_cleanup (xfree, data);
> -      char *line;
> +      char *line, *t;
>  
> -      line = strtok (data, "\n");
> -      while (line)
> +      line = strtok_r (data, "\n", &t);
> +      while (line != NULL)
>  	{
>  	  ULONGEST addr, endaddr, offset, inode;
>  	  const char *permissions, *device, *filename;
> +	  struct smaps_vmflags v;
>  	  size_t permissions_len, device_len;
> -	  int read, write, exec;
> -	  int modified = 0, has_anonymous = 0;
> +	  int read, write, exec, private;
> +	  enum memory_mapping_state state;
> +	  int has_anonymous = 0;
> +	  int mapping_anon_p;
>  
> +	  memset (&v, 0, sizeof (v));
>  	  read_mapping (line, &addr, &endaddr, &permissions, &permissions_len,
>  			&offset, &device, &device_len, &inode, &filename);
> +	  mapping_anon_p = mapping_is_anonymous_p (filename);
>  
>  	  /* Decode permissions.  */
>  	  read = (memchr (permissions, 'r', permissions_len) != 0);
>  	  write = (memchr (permissions, 'w', permissions_len) != 0);
>  	  exec = (memchr (permissions, 'x', permissions_len) != 0);
> +	  /* 'private' here actually means VM_MAYSHARE, and not
> +	     VM_SHARED.  In order to know if a mapping is really
> +	     private or not, we must check the flag "sh" in the
> +	     VmFlags field.  This is done by decode_vmflags.  However,
> +	     if we are using an old Linux kernel, we will not have the

It's best to avoid "old", "new", etc.  New will get old soon too.
Do we have some version string/number to put here instead?
Likewise other places.

> +	     VmFlags there.  In this case, there is really no way to
> +	     know if we are dealing with VM_SHARED, so we just assume
> +	     that VM_MAYSHARE is enough.  */
> +	  private = memchr (permissions, 'p', permissions_len) != 0;
>  
>  	  /* Try to detect if region was modified by parsing smaps counters.  */
> -	  for (line = strtok (NULL, "\n");
> -	       line && line[0] >= 'A' && line[0] <= 'Z';
> -	       line = strtok (NULL, "\n"))
> +	  for (line = strtok_r (NULL, "\n", &t);
> +	       line != NULL && line[0] >= 'A' && line[0] <= 'Z';
> +	       line = strtok_r (NULL, "\n", &t))
>  	    {
>  	      char keyword[64 + 1];
>  
> @@ -869,11 +1108,17 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
>  		  warning (_("Error parsing {s,}maps file '%s'"), mapsfilename);
>  		  break;
>  		}
> +
>  	      if (strcmp (keyword, "Anonymous:") == 0)
> -		has_anonymous = 1;
> -	      if (strcmp (keyword, "Shared_Dirty:") == 0
> -		  || strcmp (keyword, "Private_Dirty:") == 0
> -		  || strcmp (keyword, "Swap:") == 0
> +		{
> +		  /* Older Linux kernels did not support the
> +		     "Anonymous:" counter.  Check it here.  */
> +		  has_anonymous = 1;
> +		}
> +	      else if (strcmp (keyword, "VmFlags:") == 0)
> +		decode_vmflags (line, &v);
> +
> +	      if (strcmp (keyword, "AnonHugePages:") == 0
>  		  || strcmp (keyword, "Anonymous:") == 0)
>  		{
>  		  unsigned long number;
> @@ -884,19 +1129,43 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
>  			       mapsfilename);
>  		      break;
>  		    }
> -		  if (number != 0)
> -		    modified = 1;
> +		  if (number > 0)
> +		    {
> +		      /* Even if we are dealing with a file-backed
> +			 mapping, if it contains anonymous pages we
> +			 consider it to be an anonymous mapping,
> +			 because this is what the Linux kernel does:
> +
> +			 // Dump segments that have been written to.
> +			 if (vma->anon_vma && FILTER(ANON_PRIVATE))
> +			 	goto whole;
> +		      */
> +		      mapping_anon_p = 1;
> +		    }
>  		}
>  	    }
>  
> -	  /* Older Linux kernels did not support the "Anonymous:" counter.
> -	     If it is missing, we can't be sure - dump all the pages.  */
> -	  if (!has_anonymous)
> -	    modified = 1;
> +	  /* If a mapping should not be dumped we still should create
> +	     a segment for it, just without SEC_LOAD (see
> +	     gcore_create_callback).  */

I saw gcore_create_callback, but I ended up still clueless.  :-)

> +	  if (has_anonymous)
> +	    {
> +	      if (dump_mapping_p (filterflags, &v, private, mapping_anon_p,
> +				  filename))
> +		state = MEMORY_MAPPING_MODIFIED;
> +	      else
> +		state = MEMORY_MAPPING_UNMODIFIED;
> +	    }
> +	  else
> +	    {
> +	      /* Older Linux kernels did not support the "Anonymous:" counter.
> +		 If it is missing, we can't be sure - dump all the pages.  */
> +	      state = MEMORY_MAPPING_UNKNOWN_STATE;
> +	    }
>  
>  	  /* Invoke the callback function to create the corefile segment.  */
>  	  func (addr, endaddr - addr, offset, inode,
> -		read, write, exec, modified, filename, obfd);
> +		read, write, exec, state, filename, obfd);
>  	}
>  
>        do_cleanups (cleanup);
> @@ -926,12 +1195,13 @@ struct linux_find_memory_regions_data
>  static int
>  linux_find_memory_regions_thunk (ULONGEST vaddr, ULONGEST size,
>  				 ULONGEST offset, ULONGEST inode,
> -				 int read, int write, int exec, int modified,
> +				 int read, int write, int exec,
> +				 enum memory_mapping_state state,

read/write, etc. are also state.  "enum memory_mapping_state state"
doesn't really indicate immediately what it's about.  How about
using "modified_state" for the variable name?  (here and elsewhere).
I think the end result will be more readable.

>  				 const char *filename, void *arg)
>  {
>    struct linux_find_memory_regions_data *data = arg;
>  
> -  return data->func (vaddr, size, read, write, exec, modified, data->obfd);
> +  return data->func (vaddr, size, read, write, exec, state, data->obfd);
>  }
>  
>  /* A variant of linux_find_memory_regions_full that is suitable as the
> @@ -1074,7 +1344,8 @@ static linux_find_memory_region_ftype linux_make_mappings_callback;
>  static int
>  linux_make_mappings_callback (ULONGEST vaddr, ULONGEST size,
>  			      ULONGEST offset, ULONGEST inode,
> -			      int read, int write, int exec, int modified,
> +			      int read, int write, int exec,
> +			      enum memory_mapping_state state,
>  			      const char *filename, void *data)
>  {
>    struct linux_make_mappings_data *map_data = data;
> @@ -1872,7 +2143,8 @@ linux_gdb_signal_to_target (struct gdbarch *gdbarch,
>  
>  static int
>  find_mapping_size (CORE_ADDR vaddr, unsigned long size,
> -		   int read, int write, int exec, int modified,
> +		   int read, int write, int exec,
> +		   enum memory_mapping_state state,
>  		   void *data)
>  {
>    struct mem_range *range = data;
> @@ -1972,6 +2244,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot)
>    return retval;
>  }
>  
> +/* Display whether the gcore command is using the
> +   /proc/PID/coredump_filter file.  */
> +
> +static void
> +show_use_coredump_filter (struct ui_file *file, int from_tty,
> +			  struct cmd_list_element *c, const char *value)
> +{
> +  fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to generate"
> +			    " corefiles is %s.\n"), value);
> +}
> +
>  /* To be called from the various GDB_OSABI_LINUX handlers for the
>     various GNU/Linux architectures and machine types.  */
>  
> @@ -2008,4 +2291,16 @@ _initialize_linux_tdep (void)
>    /* Observers used to invalidate the cache when needed.  */
>    observer_attach_inferior_exit (invalidate_linux_cache_inf);
>    observer_attach_inferior_appeared (invalidate_linux_cache_inf);
> +
> +  add_setshow_boolean_cmd ("use-coredump-filter", class_files,
> +			   &use_coredump_filter, _("\
> +Set whether gcore should consider /proc/PID/coredump_filter."),
> +			   _("\
> +Show whether gcore should consider /proc/PID/coredump_filter."),
> +			   _("\
> +Use this command to set whether gcore should consider the contents\n\
> +of /proc/PID/coredump_filter when generating the corefile.  For more information\n\
> +about this file, refer to the manpage of core(5)."),
> +			   NULL, show_use_coredump_filter,
> +			   &setlist, &showlist);
>  }
> diff --git a/gdb/procfs.c b/gdb/procfs.c
> index b62539f..d074dd3 100644
> --- a/gdb/procfs.c
> +++ b/gdb/procfs.c
> @@ -4967,7 +4967,7 @@ find_memory_regions_callback (struct prmap *map,
>  		  (map->pr_mflags & MA_READ) != 0,
>  		  (map->pr_mflags & MA_WRITE) != 0,
>  		  (map->pr_mflags & MA_EXEC) != 0,
> -		  1, /* MODIFIED is unknown, pass it as true.  */
> +		  MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
>  		  data);
>  }
>  
> diff --git a/gdb/testsuite/gdb.base/coredump-filter.c b/gdb/testsuite/gdb.base/coredump-filter.c
> new file mode 100644
> index 0000000..192c469
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/coredump-filter.c
> @@ -0,0 +1,61 @@
> +/* Copyright 2015 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#define _GNU_SOURCE
> +#include <stdlib.h>
> +#include <assert.h>
> +#include <unistd.h>
> +#include <stdio.h>
> +#include <sys/mman.h>
> +#include <errno.h>
> +#include <string.h>
> +
> +static void *
> +do_mmap (void *addr, size_t size, int prot, int flags, int fd, off_t offset)
> +{
> +  void *ret = mmap (addr, size, prot, flags, fd, offset);
> +
> +  assert (ret != MAP_FAILED);
> +  return ret;
> +}
> +
> +int
> +main (int argc, char *argv[])
> +{
> +  const size_t size = 10;
> +  const int default_prot = PROT_READ | PROT_WRITE;
> +  char *private_anon, *shared_anon;
> +  char *dont_dump;
> +  int i;
> +
> +  private_anon = do_mmap (NULL, size, default_prot,
> +			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +  memset (private_anon, 0x11, size);
> +
> +  shared_anon = do_mmap (NULL, size, default_prot,
> +			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> +  memset (shared_anon, 0x22, size);
> +
> +  dont_dump = do_mmap (NULL, size, default_prot,
> +		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +  memset (dont_dump, 0x55, size);
> +  i = madvise (dont_dump, size, MADV_DONTDUMP);
> +  assert_perror (errno);
> +  assert (i == 0);
> +
> +  return 0; /* break-here */
> +}
> diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
> new file mode 100644
> index 0000000..c7ae91d
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/coredump-filter.exp
> @@ -0,0 +1,129 @@
> +# Copyright 2015 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +standard_testfile
> +
> +if { [prepare_for_testing "failed to prepare" $testfile $srcfile debug] } {
> +    untested $testfile.exp
> +    return -1
> +}
> +
> +if { ![runto_main] } {
> +    untested $testfile.exp
> +    return -1
> +}
> +
> +gdb_breakpoint [gdb_get_line_number "break-here"]
> +gdb_continue_to_breakpoint "break-here" ".* break-here .*"
> +
> +proc do_save_core { filter_flag core ipid } {
> +    verbose -log "writing $filter_flag to /proc/$ipid/coredump_filter"
> +    if { [catch {open /proc/$ipid/coredump_filter w} fileid] } {

This is opening the /proc file on the build machine, but it
should be the file on the target machine.  Can you use
"remote_file target" for this?

If not, perhaps something around:

 remote_exec target "echo $filter_flag > /proc/$ipid/coredump_filter"

?

> +	untested $testfile.exp
> +	return -1
> +    }
> +
> +    # Set coredump_filter to the value we want
> +    puts $fileid $filter_flag
> +    close $fileid
> +
> +    # Generate a corefile
> +    gdb_gcore_cmd "$core" "save corefile $core"
> +}
> +
> +proc do_load_and_test_core { core var working_var working_value } {
> +    global hex decimal addr
> +
> +    set core_loaded [gdb_core_cmd "$core" "load $core"]
> +    if { $core_loaded == -1 } {
> +	fail "loading $core"
> +	return
> +    }
> +
> +    # Use 'int' as any variants of 'char' try to read the target bytes.

I don't understand this comment.

> +    gdb_test "print *(unsigned int *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
> +	"printing $var when core is loaded (should not work)"
> +    gdb_test "print/x *(unsigned int *) $addr($working_var)" " = $working_value.*" \
> +	"print/x *$working_var ( = $working_value)"
> +}
> +
> +set non_private_anon_core [standard_output_file non-private-anon.gcore]
> +set non_shared_anon_core [standard_output_file non-shared-anon.gcore]
> +set dont_dump_core [standard_output_file dont-dump.gcore]
> +
> +# We will generate a few corefiles

Missing period.

> +#
> +# This list is composed by sub-lists, and their elements are (in
> +# order):
> +#
> +# - name of the test
> +# - hexadecimal value to be put in the /proc/PID/coredump_filter file
> +# - name of the variable that contains the name of the corefile to be
> +#   generated (including the initial $).
> +# - name of the variable in the C source code that points to the
> +#   memory mapping that will NOT be present in the corefile.
> +# - name of a variable in the C source code that points to a memory
> +#   mapping that WILL be present in the corefile
> +# - corresponding value expected for the above variable
> +
> +set all_corefiles { { "non-Private-Anonymous" "0x7e" \
> +			  $non_private_anon_core \
> +			  "private_anon" \
> +			  "shared_anon" "0x22" }
> +    { "non-Shared-Anonymous" "0x7d" \
> +	  $non_shared_anon_core "shared_anon" \
> +	  "private_anon" "0x11" }
> +    { "DoNotDump" "0x33" \
> +	  $dont_dump_core "dont_dump" \
> +	  "shared_anon" "0x22" } }
> +
> +set core_supported [gdb_gcore_cmd "$non_private_anon_core" "save a corefile"]
> +if { !$core_supported } {
> +    untested $testfile.exp

https://sourceware.org/gdb/wiki/GDBTestcaseCookbook#A.22untested.22_calls

> +    return -1
> +}
> +
> +# Getting the inferior's PID
> +gdb_test_multiple "info inferiors" "getting inferior pid" {
> +    -re "process \($decimal\).*\r\n$gdb_prompt $" {
> +	set infpid $expect_out(1,string)
> +    }
> +}

Don't leave infpid undefined on gdb_test_multiple failure.
Set it upfront:

  set infpid ""
  gdb_test_multiple "info inferiors" "getting inferior pid" {
    ...


> +
> +foreach item $all_corefiles {
> +    foreach name [list [lindex $item 3] [lindex $item 4]] {
> +	set test "print/x $name"
> +	gdb_test_multiple $test $test {
> +	    -re " = \($hex\)\r\n$gdb_prompt $" {
> +		set addr($name) $expect_out(1,string)

I'm probably being dense, but I can't see where is addr
ever used?

> +	    }
> +	}
> +    }
> +}
> +
> +foreach item $all_corefiles {
> +    with_test_prefix "saving corefile for [lindex $item 0]" {
> +	do_save_core [lindex $item 1] [subst [lindex $item 2]] $infpid
> +    }
> +}
> +
> +clean_restart $testfile
> +
> +foreach item $all_corefiles {
> +    with_test_prefix "loading and testing corefile for [lindex $item 0]" {
> +	do_load_and_test_core [subst [lindex $item 2]] [lindex $item 3] \
> +	    [lindex $item 4] [lindex $item 5]
> +    }
> +}

Thanks,
Pedro Alves

* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-05  3:48 [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Sergio Durigan Junior
  2015-03-05 15:48 ` Jan Kratochvil
  2015-03-12 21:39 ` [PATCH v2] " Sergio Durigan Junior
@ 2015-03-13 19:37 ` Pedro Alves
  2015-03-13 19:48   ` Pedro Alves
  2 siblings, 1 reply; 46+ messages in thread
From: Pedro Alves @ 2015-03-13 19:37 UTC (permalink / raw)
  To: Sergio Durigan Junior, GDB Patches; +Cc: Jan Kratochvil, Oleg Nesterov

On 03/05/2015 03:48 AM, Sergio Durigan Junior wrote:
> In a nutshell, what the new code is doing is:
> 
> - If the mapping is associated to a file whose name ends with "
>   (deleted)", or if the file is "/dev/zero", or if it is "/SYSV%08x"
>   (shared memory), or if there is no file associated with it, or if the
>   AnonHugePages: or the Anonymous: fields in the /proc/PID/smaps have
>   contents, then GDB considers this mapping to be anonymous.  Otherwise,
>   GDB considers this mapping to be a file-backed mapping (because there
>   will be a file associated with it).
> 
>   It is worth mentioning that, from all those checks described above,
>   the most fragile is the one to see if the file name ends with "
>   (deleted)".  This does not necessarily mean that the mapping is
>   anonymous, because the deleted file associated with the mapping may
>   have been a hard link to another file, for example.  The Linux kernel
>   checks to see if "i_nlink == 0", but GDB cannot easily do this check.
>   Therefore, we made a compromise here, and we assume that if the file
>   name ends with " (deleted)", then the mapping is indeed anonymous.
>   FWIW, this is something the Linux kernel could do better: expose this
>   information in a more direct way.
> 
> - If we see the flag "sh" in the VmFlags: field (in /proc/PID/smaps),
>   then certainly the memory mapping is shared (VM_SHARED).  If we have
>   access to the VmFlags, and we don't see the "sh" there, then certainly
>   the mapping is private.  However, older Linux kernels do not have the
>   VmFlags field; in that case, we use another heuristic: if we see 'p'
>   in the permission flags, then we assume that the mapping is private,
>   even though the presence of the 's' flag there would mean VM_MAYSHARE,
>   which means the mapping could still be private.  This should work OK
>   enough, however.

I missed seeing a git commit log in v2, but looking here, I think
it'd be good to move paragraphs to the code instead, to a general
overview section, even.
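
As a rough illustration of what such an overview would describe, the
two heuristics quoted above can be sketched in C as follows.  This is
not the code from the patch: struct mapping_info and every function
name here are hypothetical, and the fields are assumed to have been
filled in already from one /proc/PID/smaps entry:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of one /proc/PID/smaps entry.  */
struct mapping_info
{
  const char *filename;     /* NULL or "" if no file is associated.  */
  const char *permissions;  /* E.g. "rw-p" or "rw-s".  */
  bool has_vmflags;         /* Was a VmFlags: line present?  */
  bool vmflags_has_sh;      /* Did VmFlags: contain "sh"?  */
  bool has_anon_pages;      /* Anonymous: or AnonHugePages: non-zero.  */
};

/* Return true if S ends with SUFFIX.  */
static bool
ends_with (const char *s, const char *suffix)
{
  size_t slen = strlen (s);
  size_t suffixlen = strlen (suffix);

  return slen >= suffixlen && strcmp (s + slen - suffixlen, suffix) == 0;
}

/* Return true if NAME looks like "/SYSV%08x" (SysV shared memory).  */
static bool
looks_like_sysv_shm (const char *name)
{
  return (strncmp (name, "/SYSV", 5) == 0
	  && strspn (name + 5, "0123456789abcdef") >= 8);
}

/* First heuristic: anonymous versus file-backed.  */
static bool
mapping_is_anonymous (const struct mapping_info *m)
{
  if (m->filename == NULL || m->filename[0] == '\0')
    return true;
  if (ends_with (m->filename, " (deleted)")
      || strcmp (m->filename, "/dev/zero") == 0
      || looks_like_sysv_shm (m->filename))
    return true;
  return m->has_anon_pages;
}

/* Second heuristic: private versus shared.  Trust VmFlags when it is
   available; otherwise fall back to 'p' in the permission flags.  */
static bool
mapping_is_private (const struct mapping_info *m)
{
  if (m->has_vmflags)
    return !m->vmflags_has_sh;
  return m->permissions != NULL && strchr (m->permissions, 'p') != NULL;
}
```

The fallback in mapping_is_private is the part that is only a guess
on kernels without VmFlags, which is why it is kept separate from the
VmFlags check.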

Thanks,
Pedro Alves

* Re: [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-13 19:37 ` [PATCH] " Pedro Alves
@ 2015-03-13 19:48   ` Pedro Alves
  0 siblings, 0 replies; 46+ messages in thread
From: Pedro Alves @ 2015-03-13 19:48 UTC (permalink / raw)
  To: Sergio Durigan Junior, GDB Patches; +Cc: Jan Kratochvil, Oleg Nesterov

On 03/13/2015 07:37 PM, Pedro Alves wrote:
> I missed seeing a git commit log in v2, but looking here, I think
> it'd be good to move paragraphs to the code instead, to a general
> overview section, even.

Sorry, missed a word: TBC, I meant, "move these paragraphs", as in,
probably that whole chunk.

Thanks,
Pedro Alves

* Re: [PATCH v2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-12 21:39 ` [PATCH v2] " Sergio Durigan Junior
  2015-03-13 19:34   ` Pedro Alves
@ 2015-03-14  9:40   ` Eli Zaretskii
  2015-03-16  2:42     ` Sergio Durigan Junior
  1 sibling, 1 reply; 46+ messages in thread
From: Eli Zaretskii @ 2015-03-14  9:40 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: gdb-patches, jan.kratochvil, palves, oleg

> From: Sergio Durigan Junior <sergiodj@redhat.com>
> Cc: Jan Kratochvil <jan.kratochvil@redhat.com>,        Pedro Alves <palves@redhat.com>, Oleg Nesterov <oleg@redhat.com>
> Date: Thu, 12 Mar 2015 17:39:39 -0400
> 
> +On @sc{gnu}/Linux, this command can take into account the value of the
> +file @file{/proc/@var{pid}/coredump_filter} when generating the core
> +dump (@pxref{set use-coredump-filter}).

You never explain what @var{pid} is, until you get to the example.  I
think we should tell that earlier.

> +To make use of this feature, you have to write in the
> +@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
> +which is a bit mask representing the memory mapping types.  If a bit
> +is set in the bit mask, then the memory mappings of the corresponding
> +types will be dumped; otherwise, they will be ignored.  The bits in
> +this bit mask have the following meanings:
> +
> +@table @code
> +@item bit 0
> +Dump anonymous private mappings.
> +@item bit 1
> +Dump anonymous shared mappings.
> +@item bit 2
> +Dump file-backed private mappings.
> +@item bit 3
> +Dump file-backed shared mappings.
> +@item bit 4
> +(since Linux 2.6.24)
> +Dump ELF headers. (@value{GDBN} does not take this bit into account)
> +@item bit 5
> +(since Linux 2.6.28)
> +Dump private huge pages.
> +@item bit 6
> +(since Linux 2.6.28)
> +Dump shared huge pages.
> +@end table
> +
> +For example, supposing that the @code{pid} of the program being
> +debugged is @code{1234}, if you wanted to dump everything except the
> +anonymous private and the file-backed shared mappings, you would do:
> +
> +@smallexample
> +$ echo 0x76 > /proc/1234/coredump_filter
> +@end smallexample
> +
> +For more documentation about how to use the @file{coredump_filter}
> +file, see the manpage of @code{proc(5)}.

I don't think we should repeat all that information here, we should
just refer to the man page you cite, possibly also telling what
section of that page to look in.
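
For reference, the bit arithmetic behind the quoted 0x76 example can
be checked with a short C sketch.  The enumerator and function names
below are illustrative assumptions, not identifiers taken from the
patch or from the kernel; only the bit positions come from the table
quoted above:

```c
#include <stdio.h>

/* Illustrative names for the coredump_filter bits listed above.  */
enum filter_bit
{
  FILTER_ANON_PRIVATE = 1 << 0,
  FILTER_ANON_SHARED = 1 << 1,
  FILTER_FILE_PRIVATE = 1 << 2,
  FILTER_FILE_SHARED = 1 << 3,
  FILTER_ELF_HEADERS = 1 << 4,
  FILTER_HUGETLB_PRIVATE = 1 << 5,
  FILTER_HUGETLB_SHARED = 1 << 6,
  FILTER_ALL = (1 << 7) - 1,
};

/* Return a mask that dumps every mapping type except EXCLUDED.  */
static unsigned int
filter_excluding (unsigned int excluded)
{
  return FILTER_ALL & ~excluded;
}

/* Format MASK the way it would be written to coredump_filter.
   Returns the number of characters written, as snprintf does.  */
static int
format_filter (char *buf, size_t buflen, unsigned int mask)
{
  return snprintf (buf, buflen, "0x%x", mask);
}
```

Excluding FILTER_ANON_PRIVATE | FILTER_FILE_SHARED clears bits 0 and
3 of 0x7f, which gives the 0x76 used in the example.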

Thanks.

* Re: [PATCH v2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-14  9:40   ` Eli Zaretskii
@ 2015-03-16  2:42     ` Sergio Durigan Junior
  0 siblings, 0 replies; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-16  2:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches, jan.kratochvil, palves, oleg

Thanks for the review, Eli.

On Saturday, March 14 2015, Eli Zaretskii wrote:

>> From: Sergio Durigan Junior <sergiodj@redhat.com>
>> Cc: Jan Kratochvil <jan.kratochvil@redhat.com>,        Pedro Alves <palves@redhat.com>, Oleg Nesterov <oleg@redhat.com>
>> Date: Thu, 12 Mar 2015 17:39:39 -0400
>> 
>> +On @sc{gnu}/Linux, this command can take into account the value of the
>> +file @file{/proc/@var{pid}/coredump_filter} when generating the core
>> +dump (@pxref{set use-coredump-filter}).
>
> You never explain what @var{pid} is, until you get to the example.  I
> think we should tell that earlier.

Fixed.

>> +To make use of this feature, you have to write in the
>> +@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
>> +which is a bit mask representing the memory mapping types.  If a bit
>> +is set in the bit mask, then the memory mappings of the corresponding
>> +types will be dumped; otherwise, they will be ignored.  The bits in
>> +this bit mask have the following meanings:
>> +
>> +@table @code
>> +@item bit 0
>> +Dump anonymous private mappings.
>> +@item bit 1
>> +Dump anonymous shared mappings.
>> +@item bit 2
>> +Dump file-backed private mappings.
>> +@item bit 3
>> +Dump file-backed shared mappings.
>> +@item bit 4
>> +(since Linux 2.6.24)
>> +Dump ELF headers. (@value{GDBN} does not take this bit into account)
>> +@item bit 5
>> +(since Linux 2.6.28)
>> +Dump private huge pages.
>> +@item bit 6
>> +(since Linux 2.6.28)
>> +Dump shared huge pages.
>> +@end table
>> +
>> +For example, supposing that the @code{pid} of the program being
>> +debugged is @code{1234}, if you wanted to dump everything except the
>> +anonymous private and the file-backed shared mappings, you would do:
>> +
>> +@smallexample
>> +$ echo 0x76 > /proc/1234/coredump_filter
>> +@end smallexample
>> +
>> +For more documentation about how to use the @file{coredump_filter}
>> +file, see the manpage of @code{proc(5)}.
>
> I don't think we should repeat all that information here, we should
> just refer to the man page you cite, possibly also telling what
> section of that page to look in.

My very first personal version of the patch did not have this table
here, and was just mentioning the man page, as you proposed.  However,
because man pages are not the official GNU documentation format, I
decided to include a "bit" more info here.

Either way, I don't have a strong preference.  I will mention the man page.

I am not sending a fixed version of the patch in this message because I
am still working on Pedro's comments.  When I reply to his e-mail, the
attached patch will include your suggestions as well.

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

* install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-12 17:48               ` Oleg Nesterov
  2015-03-12 17:55                 ` Andy Lutomirski
  2015-03-12 18:20                 ` Pedro Alves
@ 2015-03-16 19:03                 ` Oleg Nesterov
  2015-03-16 19:20                   ` Andy Lutomirski
  2015-03-16 19:40                   ` Pedro Alves
  2 siblings, 2 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-16 19:03 UTC (permalink / raw)
  To: Andy Lutomirski, Hugh Dickins, Linus Torvalds
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Oleg Nesterov wrote:
>
> OTOH. We can probably add ->access() into special_mapping_vmops, this
> way __access_remote_vm() could work even if gup() fails ?

So I tried to think how special_mapping_vmops->access() can work, it
needs to rely on ->vm_pgoff.

But afaics this logic is just broken. Let's even forget about vvar vma
which uses remap_file_pages(). Let's look at "[vdso]" which uses the
"normal" pages.

The comment in special_mapping_fault() says

	 * special mappings have no vm_file, and in that case, the mm
	 * uses vm_pgoff internally.

Yes. But afaics mm/ doesn't do this correctly. So

	 * do not copy this code into drivers!

looks like a good recommendation ;)

I think that this logic is wrong even if ARRAY_SIZE(pages) == 1, but I am
not sure. But since the vdso uses 2 pages, it is trivial to show that this logic
is wrong. To verify, I changed show_map_vma() to expose pgoff even if !file,
but this test-case can show the problem too:

	#include <stdio.h>
	#include <unistd.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <assert.h>

	void *find_vdso_vaddr(void)
	{
		FILE *perl;
		char buf[32] = {};

		perl = popen("perl -e 'open STDIN,qq|/proc/@{[getppid]}/maps|;"
				"/^(.*?)-.*vdso/ && print hex $1 while <>'", "r");
		fread(buf, sizeof(buf), 1, perl);
		pclose(perl);

		return (void *)atol(buf);
	}

	#define PAGE_SIZE	4096

	int main(void)
	{
		void *vdso = find_vdso_vaddr();
		assert(vdso);

		// of course they should differ, and they do so far
		printf("vdso pages differ: %d\n",
			!!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));

		// split into 2 vma's
		assert(mprotect(vdso, PAGE_SIZE, PROT_READ) == 0);

		// force another fault on the next check
		assert(madvise(vdso, 2 * PAGE_SIZE, MADV_DONTNEED) == 0);

		// now they no longer differ, the 2nd vm_pgoff is wrong
		printf("vdso pages differ: %d\n",
			!!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));

		return 0;
	}

output:

	vdso pages differ: 1
	vdso pages differ: 0

And not only is "split_vma" wrong, I think that "move_vma" is not right either.
Note this check in copy_vma(),

	/*
	 * If anonymous vma has not yet been faulted, update new pgoff
	 * to match new location, to increase its chance of merging.
	 */
	if (unlikely(!vma->vm_file && !vma->anon_vma)) {
		pgoff = addr >> PAGE_SHIFT;
		faulted_in_anon_vma = false;
	}

I can easily misread this code. But it doesn't look right either. If vdso was cow'ed
(breakpoint installed by gdb) and sys_mremap()'ed, then the new pgoff will be wrong
too after, say, MADV_DONTNEED.

Or I am totally confused?

Oleg.

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-16 19:03                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
@ 2015-03-16 19:20                   ` Andy Lutomirski
  2015-03-16 19:46                     ` Oleg Nesterov
  2015-03-16 19:40                   ` Pedro Alves
  1 sibling, 1 reply; 46+ messages in thread
From: Andy Lutomirski @ 2015-03-16 19:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

[cc: linux-mm]

On Mon, Mar 16, 2015 at 12:01 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Oleg Nesterov wrote:
>>
>> OTOH. We can probably add ->access() into special_mapping_vmops, this
>> way __access_remote_vm() could work even if gup() fails ?
>
> So I tried to think how special_mapping_vmops->access() can work, it
> needs to rely on ->vm_pgoff.
>
> But afaics this logic is just broken. Let's even forget about vvar vma
> which uses remap_file_pages(). Let's look at "[vdso]" which uses the
> "normal" pages.
>
> The comment in special_mapping_fault() says
>
>          * special mappings have no vm_file, and in that case, the mm
>          * uses vm_pgoff internally.
>
> Yes. But afaics mm/ doesn't do this correctly. So
>
>          * do not copy this code into drivers!
>
> looks like a good recommendation ;)
>
> I think that this logic is wrong even if ARRAY_SIZE(pages) == 1, but I am
> not sure. But since the vdso uses 2 pages, it is trivial to show that this logic
> is wrong. To verify, I changed show_map_vma() to expose pgoff even if !file,
> but this test-case can show the problem too:
>
>         #include <stdio.h>
>         #include <unistd.h>
>         #include <stdlib.h>
>         #include <string.h>
>         #include <sys/mman.h>
>         #include <assert.h>
>
>         void *find_vdso_vaddr(void)
>         {
>                 FILE *perl;
>                 char buf[32] = {};
>
>                 perl = popen("perl -e 'open STDIN,qq|/proc/@{[getppid]}/maps|;"
>                                 "/^(.*?)-.*vdso/ && print hex $1 while <>'", "r");
>                 fread(buf, sizeof(buf), 1, perl);
>                 pclose(perl);
>
>                 return (void *)atol(buf);
>         }
>
>         #define PAGE_SIZE       4096
>
>         int main(void)
>         {
>                 void *vdso = find_vdso_vaddr();
>                 assert(vdso);
>
>                 // of course they should differ, and they do so far
>                 printf("vdso pages differ: %d\n",
>                         !!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));
>
>                 // split into 2 vma's
>                 assert(mprotect(vdso, PAGE_SIZE, PROT_READ) == 0);
>
>                 // force another fault on the next check
>                 assert(madvise(vdso, 2 * PAGE_SIZE, MADV_DONTNEED) == 0);

I really hope this doesn't do anything (or fails) on the vvar page,
which is a pfnmap.

>
>                 // now they no longer differ, the 2nd vm_pgoff is wrong
>                 printf("vdso pages differ: %d\n",
>                         !!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));
>
>                 return 0;
>         }
>
> output:
>
>         vdso pages differ: 1
>         vdso pages differ: 0
>
> And not only is "split_vma" wrong, I think that "move_vma" is not right either.
> Note this check in copy_vma(),
>
>         /*
>          * If anonymous vma has not yet been faulted, update new pgoff
>          * to match new location, to increase its chance of merging.
>          */
>         if (unlikely(!vma->vm_file && !vma->anon_vma)) {
>                 pgoff = addr >> PAGE_SHIFT;
>                 faulted_in_anon_vma = false;
>         }
>
> I can easily misread this code. But it doesn't look right either. If vdso was cow'ed
> (breakpoint installed by gdb) and sys_mremap()'ed, then the new pgoff will be wrong
> too after, say, MADV_DONTNEED.
>
> Or I am totally confused?

Ick, you're probably right.  For what it's worth, the vdso *seems* to
be okay (on 64-bit only, and only if you don't poke at it too hard) if
you mremap it in one piece.  CRIU does that.

What does the mm code do with vm_pgoff for vmas with no vm_file?  I'm
mystified.  There's this comment:

 * The way we recognize COWed pages within VM_PFNMAP mappings is through the
 * rules set up by "remap_pfn_range()": the vma will have the VM_PFNMAP bit
 * set, and the vm_pgoff will point to the first PFN mapped: thus every special
 * mapping will always honor the rule
 *
 *    pfn_of_page == vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT)

Is that referring to special mappings in the install_special_mapping
sense or to something else?  FWIW, the vdso isn't a VM_PFNMAP at all.
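
To make the arithmetic in that comment concrete, here is a user-space
toy model.  It is not kernel code and does not reproduce any kernel
code path; struct toy_vma and both functions are made up for this
example.  It only shows that a lookup of the form
"pgoff + ((addr - start) >> PAGE_SHIFT)" keeps resolving to the same
page across a split only if the upper half carries the right offset:

```c
#define TOY_PAGE_SHIFT 12

/* Toy stand-in for the three vma fields the rule talks about.  */
struct toy_vma
{
  unsigned long vm_start;
  unsigned long vm_end;
  unsigned long vm_pgoff;
};

/* Index of the backing page for ADDR, following the quoted rule.  */
static unsigned long
toy_page_index (const struct toy_vma *vma, unsigned long addr)
{
  return vma->vm_pgoff + ((addr - vma->vm_start) >> TOY_PAGE_SHIFT);
}

/* Split VMA at ADDR and return the upper half.  If ADJUST_PGOFF is
   zero, the upper half keeps the old offset, which models a lookup
   that has lost track of where the half started.  */
static struct toy_vma
toy_split (struct toy_vma *vma, unsigned long addr, int adjust_pgoff)
{
  struct toy_vma upper = *vma;

  upper.vm_start = addr;
  if (adjust_pgoff)
    upper.vm_pgoff += (addr - vma->vm_start) >> TOY_PAGE_SHIFT;
  vma->vm_end = addr;
  return upper;
}
```

With a two-page toy_vma starting at 0x1000, the second page has index
1 before the split.  After toy_split (&vma, 0x2000, 0) the same
address resolves to index 0, i.e. both addresses map to the first
page, which is similar in effect to the "vdso pages differ: 0" output
of the test case.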

--Andy

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-16 19:03                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
  2015-03-16 19:20                   ` Andy Lutomirski
@ 2015-03-16 19:40                   ` Pedro Alves
  1 sibling, 0 replies; 46+ messages in thread
From: Pedro Alves @ 2015-03-16 19:40 UTC (permalink / raw)
  To: Oleg Nesterov, Andy Lutomirski, Hugh Dickins, Linus Torvalds
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, linux-kernel

Thanks for looking over all this, guys.  Really appreciated.

On 03/16/2015 07:01 PM, Oleg Nesterov wrote:
> is wrong. To verify, I changed show_map_vma() to expose pgoff even if !file,
> but this test-case can show the problem too:

Might be good to add tests like this to selftests/ once all
this is sorted.

Pedro Alves

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-16 19:20                   ` Andy Lutomirski
@ 2015-03-16 19:46                     ` Oleg Nesterov
  2015-03-17 13:45                       ` Oleg Nesterov
  0 siblings, 1 reply; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-16 19:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/16, Andy Lutomirski wrote:
>
> Ick, you're probably right.  For what it's worth, the vdso *seems* to
> be okay (on 64-bit only, and only if you don't poke at it too hard) if
> you mremap it in one piece.  CRIU does that.

I need to run away till tomorrow, but looking at this code, even the "one piece"
case doesn't look right if it was cow'ed. I'll verify tomorrow.

Oleg.

* Re: [PATCH v2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-13 19:34   ` Pedro Alves
@ 2015-03-16 23:53     ` Sergio Durigan Junior
  2015-03-18 19:10       ` Pedro Alves
  0 siblings, 1 reply; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-16 23:53 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches, Jan Kratochvil, Oleg Nesterov

On Friday, March 13 2015, Pedro Alves wrote:

Thanks for the review, and sorry for taking a bit of time to respond.  I
had to dig deeper into this patch to properly answer some of your
questions.

> On 03/12/2015 09:39 PM, Sergio Durigan Junior wrote:
>> gdb/ChangeLog:
>> 2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>
>> 	    Jan Kratochvil  <jan.kratochvil@redhat.com>
>> 	    Oleg Nesterov  <oleg@redhat.com>
>> 
>> 	PR corefiles/16092
>> 	* common/common-defs.h (enum memory_mapping_state): New enum.
>> 	* defs.h (find_memory_region_ftype): Remove 'int modified'
>> 	parameter, replacing by 'enum memory_mapping_state state'.
>> 	* gcore.c (gcore_create_callback): Likewise.  Change 'if/else'
>> 	statements and improve the logic of deciding when to ignore a
>> 	memory mapping.
>> 	(objfile_find_memory_regions): Passing
>> 	'MEMORY_MAPPING_UNKNOWN_STATE' or 'MEMORY_MAPPING_MODIFIED' when
>> 	needed to 'func' callback, instead of saying the memory mapping
>> 	was modified even without knowing it.
>> 	* gnu-nat.c (gnu_find_memory_regions): Likewise.
>> 	* linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
>> 	New enum identifying the various options of the coredump_filter
>> 	file.
>> 	(struct smaps_vmflags): New struct.
>> 	(use_coredump_filter): New variable.
>> 	(decode_vmflags): New function.
>> 	(mapping_is_anonymous_p): Likewise.
>> 	(dump_mapping_p): Likewise.
>> 	(linux_find_memory_region_ftype): Remove 'int modified' parameter,
>> 	replacing by 'enum memory_mapping_state state'.
>> 	(linux_find_memory_regions_full): New variables
>> 	'coredumpfilter_name', 'coredumpfilterdata', 'pid',
>> 	'filterflags'.  Read /proc/<PID>/smaps file; improve parsing of
>> 	its information.  Implement memory mapping filtering based on its
>> 	contents.
>> 	(linux_find_memory_regions_thunk): Remove 'int modified'
>> 	parameter, replacing by 'enum memory_mapping_state state'.
>> 	(linux_make_mappings_callback): Likewise.
>> 	(find_mapping_size): Likewise.
>> 	(show_use_coredump_filter): New function.
>> 	(_initialize_linux_tdep): New command 'set use-coredump-filter'.
>> 	* procfs.c (find_memory_regions_callback): Passing
>> 	'MEMORY_MAPPING_UNKNOWN_STATE' when needed to 'func' callback,
>> 	instead of saying the memory mapping was modified even without
>> 	knowing it.
>> 
>> gdb/doc/ChangeLog:
>> 2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>
>> 
>> 	PR corefiles/16092
>> 	* gdb.texinfo (gcore): Mention new command 'set
>> 	use-coredump-filter'.
>> 	(set use-coredump-filter): Document new command.
>> 
>> gdb/testsuite/ChangeLog:
>> 2015-03-12  Sergio Durigan Junior  <sergiodj@redhat.com>
>> 
>> 	PR corefiles/16092
>> 	* gdb.base/coredump-filter.c: New file.
>> 	* gdb.base/coredump-filter.exp: Likewise.
>> 
>> 
>> diff --git a/gdb/common/common-defs.h b/gdb/common/common-defs.h
>> index 62d9de5..01b05f5 100644
>> --- a/gdb/common/common-defs.h
>> +++ b/gdb/common/common-defs.h
>> @@ -60,4 +60,14 @@
>>  # define EXTERN_C_POP
>>  #endif
>>  
>> +/* Enum used to inform the state of a memory mapping.  This is used in
>> +   functions implementing find_memory_region_ftype.  */
>
> Why isn't this enum defined next to find_memory_region_ftype?

Bad judgement, sorry.  I thought it would make sense to put this into
common/ because it could be used by gdbserver.  But then again, it is
not (yet).  Moved to defs.h.

>> +
>> +enum memory_mapping_state
>> +  {
>> +    MEMORY_MAPPING_MODIFIED,
>> +    MEMORY_MAPPING_UNMODIFIED,
>> +    MEMORY_MAPPING_UNKNOWN_STATE,
>> +  };
>> +
>>  #endif /* COMMON_DEFS_H */
>> diff --git a/gdb/defs.h b/gdb/defs.h
>> index 72512f6..4829b62 100644
>> --- a/gdb/defs.h
>> +++ b/gdb/defs.h
>> @@ -338,7 +338,8 @@ extern void init_source_path (void);
>>  
>>  typedef int (*find_memory_region_ftype) (CORE_ADDR addr, unsigned long size,
>>  					 int read, int write, int exec,
>> -					 int modified, void *data);
>> +					 enum memory_mapping_state state,
>> +					 void *data);
>>  
>>  /* * Possible lvalue types.  Like enum language, this should be in
>>     value.h, but needs to be here for the same reason.  */
>> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
>> index 9e71642..092bc93 100644
>> --- a/gdb/doc/gdb.texinfo
>> +++ b/gdb/doc/gdb.texinfo
>> @@ -10952,6 +10952,67 @@ specified, the file name defaults to @file{core.@var{pid}}, where
>>  
>>  Note that this command is implemented only for some systems (as of
>>  this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
>> +
>> +On @sc{gnu}/Linux, this command can take into account the value of the
>> +file @file{/proc/@var{pid}/coredump_filter} when generating the core
>> +dump (@pxref{set use-coredump-filter}).
>> +
>> +@kindex set use-coredump-filter
>> +@anchor{set use-coredump-filter}
>> +@item set use-coredump-filter on
>> +@itemx set use-coredump-filter off
>> +Enable or disable the use of the file
>> +@file{/proc/@var{pid}/coredump_filter} when generating core dump
>> +files.  This file is used by the Linux kernel to decide what types of
>> +memory mappings will be dumped or ignored when generating a core dump
>> +file.
>> +
>> +To make use of this feature, you have to write in the
>> +@file{/proc/@var{pid}/coredump_filter} file a value, in hexadecimal,
>> +which is a bit mask representing the memory mapping types.  If a bit
>> +is set in the bit mask, then the memory mappings of the corresponding
>> +types will be dumped; otherwise, they will be ignored.  The bits in
>> +this bit mask have the following meanings:
>> +
>> +@table @code
>> +@item bit 0
>> +Dump anonymous private mappings.
>> +@item bit 1
>> +Dump anonymous shared mappings.
>> +@item bit 2
>> +Dump file-backed private mappings.
>> +@item bit 3
>> +Dump file-backed shared mappings.
>> +@item bit 4
>> +(since Linux 2.6.24)
>> +Dump ELF headers. (@value{GDBN} does not take this bit into account)
>> +@item bit 5
>> +(since Linux 2.6.28)
>> +Dump private huge pages.
>> +@item bit 6
>> +(since Linux 2.6.28)
>> +Dump shared huge pages.
>> +@end table
>> +
>> +For example, supposing that the @code{pid} of the program being
>> +debugging is @code{1234}, if you wanted to dump everything except the
>> +anonymous private and the file-backed shared mappings, you would do:
>> +
>> +@smallexample
>> +$ echo 0x76 > /proc/1234/coredump_filter
>> +@end smallexample
>> +
>> +For more documentation about how to use the @file{coredump_filter}
>> +file, see the manpage of @code{proc(5)}.
>> +
>> +By default, this option is @code{on}.  If this option is turned
>> +@code{off}, @value{GDBN} will not read the @file{coredump_filter}
>> +file, but it uses the same default value as the Linux kernel in order
>
> "will not read (...) uses".  I think the grammar isn't correct
> that way.
>
> I think preferred is to use present in both cases, thus:
>
>   "(...) turned @code{off}, @value{GDBN} does not read the"
>
> and no "it", in "but uses the same".  Suggest "and" instead of "but":
>
> By default, this option is @code{on}.  If this option is turned
> @code{off}, @value{GDBN} does not read the @file{coredump_filter}
> file and instead uses the same default value as the Linux kernel in order
> ...

Fixed.

>> +to decide which pages will be dumped in the core dump file.  This
>> +value currently is @code{0x33}, which means that the bits @code{0}
>
> "is currently", I think.   Also, "which means that bits" would sound
> more natural to me, though take it with a grain of salt.

Fixed.

>> +(anonymous private mappings), @code{1} (anonymous shared mappings) and
>> +@code{4} (ELF headers) are active.  This will cause these memory
>> +mappings to be dumped automatically.
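
A side note on the mask itself: 0x33 is binary 0110011, so bit 5
(private huge pages) is set in addition to bits 0, 1 and 4, which is
also what the default 'filterflags' in linux-tdep.c enables.  Below is
a minimal sketch of how such a value can be parsed and tested.  It is
not the code from the patch; 'parse_coredump_filter' and
'filter_bit_set_p' are names made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Bit values as listed in core(5); the names mirror the enum that the
   patch adds to linux-tdep.c.  */
enum
  {
    COREFILTER_ANON_PRIVATE = 1 << 0,
    COREFILTER_ANON_SHARED = 1 << 1,
    COREFILTER_MAPPED_PRIVATE = 1 << 2,
    COREFILTER_MAPPED_SHARED = 1 << 3,
    COREFILTER_ELF_HEADERS = 1 << 4,
    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
    COREFILTER_HUGETLB_SHARED = 1 << 6,
  };

/* Hypothetical helper: parse TEXT (the hexadecimal contents of a
   coredump_filter file) into *MASK.  Return 1 on success, 0 if no
   hexadecimal number could be read.  */
static int
parse_coredump_filter (const char *text, unsigned int *mask)
{
  assert (text != NULL && mask != NULL);
  return sscanf (text, "%x", mask) == 1;
}

/* Hypothetical helper: return 1 if BIT is enabled in MASK.  */
static int
filter_bit_set_p (unsigned int mask, unsigned int bit)
{
  return (mask & bit) != 0;
}
```

With the value from the documentation example, 0x76, the same helpers
report bits 1, 2, 4, 5 and 6 as set and bits 0 and 3 as clear, i.e.,
everything except the anonymous private and the file-backed shared
mappings.
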
>>  @end table
>>  
>>  @node Character Sets
>> diff --git a/gdb/gcore.c b/gdb/gcore.c
>> index 44b9d0c..89d8285 100644
>> --- a/gdb/gcore.c
>> +++ b/gdb/gcore.c
>> @@ -415,27 +415,22 @@ make_output_phdrs (bfd *obfd, asection *osec, void *ignored)
>>  
>>  static int
>>  gcore_create_callback (CORE_ADDR vaddr, unsigned long size, int read,
>> -		       int write, int exec, int modified, void *data)
>> +		       int write, int exec, enum memory_mapping_state state,
>> +		       void *data)
>>  {
>>    bfd *obfd = data;
>>    asection *osec;
>>    flagword flags = SEC_ALLOC | SEC_HAS_CONTENTS | SEC_LOAD;
>>  
>> -  /* If the memory segment has no permissions set, ignore it, otherwise
>> -     when we later try to access it for read/write, we'll get an error
>> -     or jam the kernel.  */
>> -  if (read == 0 && write == 0 && exec == 0 && modified == 0)
>> -    {
>> -      if (info_verbose)
>> -        {
>> -          fprintf_filtered (gdb_stdout, "Ignore segment, %s bytes at %s\n",
>> -                            plongest (size), paddress (target_gdbarch (), vaddr));
>> -        }
>> -
>> -      return 0;
>> -    }
>> -
>> -  if (write == 0 && modified == 0 && !solib_keep_data_in_core (vaddr, size))
>> +  /* If the memory segment has no read permission set, or if it has
>> +     been marked as unmodified, then we have to generate a segment
>> +     header for it, but without contents (i.e., FileSiz = 0),
>> +     otherwise when we later try to access it for read/write, we'll
>> +     get an error or jam the kernel.  */
>> +  if (read == 0 || state == MEMORY_MAPPING_UNMODIFIED)
>> +    flags &= ~(SEC_LOAD | SEC_HAS_CONTENTS);
>
> I'm feeling dense and I'm not understanding this change / comment.  :-/
> Why didn't we need to do this before, and we need to do it now?

I'm glad you asked, because that made me review the decision to hack
this part of the code, which led me to the conclusion that it needed to
be fixed/improved.

First of all, I would like to apologize for not having split this
patch into smaller hunks; I am now fairly convinced that this would have
helped in the review process.

This specific part of the code started to be changed when I wanted to
simplify the original condition to ignore a memory mapping:

  if (read == 0 && write == 0 && exec == 0 && modified == 0)

The simplification was initially:

  if (read == 0)

The rationale is that if the mapping cannot be read, then this is
enough for ignoring it.  However, after some discussions with Jan, he
suggested that instead of returning immediately from the function, GDB
should actually mark the memory region as '~(SEC_LOAD |
SEC_HAS_CONTENTS)', which basically means that it would create a segment
header for the mapping in the corefile, but mark its FileSiz as 0 (IOW,
when this corefile is opened, GDB would not try to read the contents of
this memory region).

I found this to be a good idea, and implemented this part.
Additionally, when I introduced 'enum memory_mapping_state' and tweaked
the code to provide exactly whether the mapping was MODIFIED, UNMODIFIED
or UNKNOWN, I also decided to generate a segment header for the mappings
that were marked as UNMODIFIED (in the patch, this state actually means
that the mapping should not be dumped at all, either because it was
marked as VM_DONTDUMP or because the user chose to ignore it via the
coredump_filter mechanism).  This was a bad call, because the purpose of
this patch is to bring GDB closer to the Linux kernel when it comes to
generating coredump files, and the Linux kernel does *not* generate
anything (not even a segment header) for the mappings that should be
ignored.

To summarize: I decided to change this part of the code, and make GDB
actually ignore (i.e., return 0) mappings marked as UNMODIFIED.  After
all, as explained above, this is what Linux does.
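
To make the intended behaviour concrete, here is a rough sketch of the
decision described above.  It is not the actual gcore_create_callback:
the 'sketch_*' names are invented for this message, and it leaves out
the existing check that clears SEC_LOAD for read-only regions backed by
a known file on disk.

```c
#include <assert.h>

/* Same three states as the enum introduced by the patch.  */
enum memory_mapping_state
  {
    MEMORY_MAPPING_MODIFIED,
    MEMORY_MAPPING_UNMODIFIED,
    MEMORY_MAPPING_UNKNOWN_STATE,
  };

/* Hypothetical names, used only in this sketch.  */
enum sketch_segment_action
  {
    /* Emit nothing at all, not even a segment header.  */
    SKETCH_SEGMENT_SKIP,
    /* Emit a segment header with FileSiz = 0, i.e., clear
       SEC_LOAD and SEC_HAS_CONTENTS.  */
    SKETCH_SEGMENT_HEADER_ONLY,
    /* Emit a segment header and the contents of the mapping.  */
    SKETCH_SEGMENT_WITH_CONTENTS,
  };

/* Decide what to do with a mapping, given whether it is readable
   (READ is 0 or 1) and its STATE.  */
static enum sketch_segment_action
sketch_segment_action_for (int read, enum memory_mapping_state state)
{
  assert (read == 0 || read == 1);

  /* Mappings that should not be dumped (VM_DONTDUMP, or filtered out
     via coredump_filter) are ignored, like the Linux kernel does.  */
  if (state == MEMORY_MAPPING_UNMODIFIED)
    return SKETCH_SEGMENT_SKIP;

  /* Unreadable mappings get a header, but we never try to read
     their contents.  */
  if (read == 0)
    return SKETCH_SEGMENT_HEADER_ONLY;

  return SKETCH_SEGMENT_WITH_CONTENTS;
}
```

The order matters: the UNMODIFIED check comes first, so an unreadable
mapping that is also filtered out produces nothing at all.
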

>> +  else if (write == 0 && state == MEMORY_MAPPING_UNKNOWN_STATE
>> +	   && !solib_keep_data_in_core (vaddr, size))
>>      {
>>        /* See if this region of memory lies inside a known file on disk.
>>  	 If so, we can avoid copying its contents by clearing SEC_LOAD.  */
>> @@ -528,7 +523,8 @@ objfile_find_memory_regions (struct target_ops *self,
>>  			 1, /* All sections will be readable.  */
>>  			 (flags & SEC_READONLY) == 0, /* Writable.  */
>>  			 (flags & SEC_CODE) != 0, /* Executable.  */
>> -			 1, /* MODIFIED is unknown, pass it as true.  */
>> +			 MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is
>> +							 unknown.  */
>>  			 obfd);
>>  	  if (ret != 0)
>>  	    return ret;
>> @@ -541,7 +537,7 @@ objfile_find_memory_regions (struct target_ops *self,
>>  	     1, /* Stack section will be readable.  */
>>  	     1, /* Stack section will be writable.  */
>>  	     0, /* Stack section will not be executable.  */
>> -	     1, /* Stack section will be modified.  */
>> +	     MEMORY_MAPPING_MODIFIED, /* Stack section will be modified.  */
>>  	     obfd);
>>  
>>    /* Make a heap segment.  */
>> @@ -550,7 +546,7 @@ objfile_find_memory_regions (struct target_ops *self,
>>  	     1, /* Heap section will be readable.  */
>>  	     1, /* Heap section will be writable.  */
>>  	     0, /* Heap section will not be executable.  */
>> -	     1, /* Heap section will be modified.  */
>> +	     MEMORY_MAPPING_MODIFIED, /* Heap section will be modified.  */
>>  	     obfd);
>>  
>>    return 0;
>> diff --git a/gdb/gnu-nat.c b/gdb/gnu-nat.c
>> index d830773..60612a7 100644
>> --- a/gdb/gnu-nat.c
>> +++ b/gdb/gnu-nat.c
>> @@ -2611,7 +2611,7 @@ gnu_find_memory_regions (struct target_ops *self,
>>  		     last_protection & VM_PROT_READ,
>>  		     last_protection & VM_PROT_WRITE,
>>  		     last_protection & VM_PROT_EXECUTE,
>> -		     1, /* MODIFIED is unknown, pass it as true.  */
>> +		     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
>>  		     data);
>>  	  last_region_address = region_address;
>>  	  last_region_end = region_address += region_length;
>> @@ -2625,7 +2625,7 @@ gnu_find_memory_regions (struct target_ops *self,
>>  	     last_protection & VM_PROT_READ,
>>  	     last_protection & VM_PROT_WRITE,
>>  	     last_protection & VM_PROT_EXECUTE,
>> -	     1, /* MODIFIED is unknown, pass it as true.  */
>> +	     MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
>>  	     data);
>>  
>>    return 0;
>> diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
>> index ea0d4cd..ae11a2e 100644
>> --- a/gdb/linux-tdep.c
>> +++ b/gdb/linux-tdep.c
>> @@ -35,9 +35,58 @@
>>  #include "observer.h"
>>  #include "objfiles.h"
>>  #include "infcall.h"
>> +#include "gdbcmd.h"
>> +#include "gdb_regex.h"
>>  
>>  #include <ctype.h>
>>  
>> +/* This enum represents the values that the user can choose when
>> +   informing the Linux kernel about which memory mappings will be
>> +   dumped in a corefile.  They are described in the file
>> +   Documentation/filesystems/proc.txt, inside the Linux kernel
>> +   tree.  */
>> +
>> +enum
>> +  {
>> +    COREFILTER_ANON_PRIVATE = 1 << 0,
>> +    COREFILTER_ANON_SHARED = 1 << 1,
>> +    COREFILTER_MAPPED_PRIVATE = 1 << 2,
>> +    COREFILTER_MAPPED_SHARED = 1 << 3,
>> +    COREFILTER_ELF_HEADERS = 1 << 4,
>> +    COREFILTER_HUGETLB_PRIVATE = 1 << 5,
>> +    COREFILTER_HUGETLB_SHARED = 1 << 6,
>> +  };
>> +
>> +struct smaps_vmflags
>
> Missing intro comment.

Fixed.

>> +  {
>> +    /* Zero if this structure has not been initialized yet.  It
>> +       probably means that the Linux kernel being used does not emit
>> +       the "VmFlags:" field on "/proc/PID/smaps".  */
>> +
>> +    unsigned int initialized_p : 1;
>> +
>> +    /* Memory mapped I/O area (VM_IO, "io").  */
>> +
>> +    unsigned int io_page : 1;
>> +
>> +    /* Area uses huge TLB pages (VM_HUGETLB, "ht").  */
>> +
>> +    unsigned int uses_huge_tlb : 1;
>> +
>> +    /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd").  */
>> +
>> +    unsigned int exclude_coredump : 1;
>> +
>> +    /* Is this a MAP_SHARED mapping (VM_SHARED, "sh").  */
>> +
>> +    unsigned int shared_mapping : 1;
>> +  };
>> +
>> +/* Whether to take the /proc/PID/coredump_filter into account when
>> +   generating a corefile.  */
>> +
>> +static int use_coredump_filter = 1;
>> +
>>  /* This enum represents the signals' numbers on a generic architecture
>>     running the Linux kernel.  The definition of "generic" comes from
>>     the file <include/uapi/asm-generic/signal.h>, from the Linux kernel
>> @@ -381,6 +430,159 @@ read_mapping (const char *line,
>>    *filename = p;
>>  }
>>  
>> +/* Helper function to decode the "VmFlags" field in /proc/PID/smaps.
>> +
>> +   This function was based on the documentation found on
>> +   <Documentation/filesystems/proc.txt>, on the Linux kernel.
>> +
>> +   Linux kernels before commit
>> +   834f82e2aa9a8ede94b17b656329f850c1471514 do not have this field on
>> +   smaps.  */
>> +
>> +static void
>> +decode_vmflags (char *p, struct smaps_vmflags *v)
>
> Can 'p' be made const ?

strtok_r expects char * (it writes into the buffer), so not really.

>> +{
>> +  char *saveptr;
>> +  char *s;
>
> Likewise.

Likewise for saveptr.  But I've made 's' const.
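
For reference, a small standalone sketch of why the buffer cannot be
const while the token pointer can.  This is an illustration and not the
function from the patch: 'sketch_decode_vmflags' and 'struct
sketch_vmflags' are made-up names, and the input is assumed to be only
the text that follows "VmFlags:".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for 'struct smaps_vmflags'.  */
struct sketch_vmflags
  {
    unsigned int io_page : 1;
    unsigned int uses_huge_tlb : 1;
    unsigned int exclude_coredump : 1;
    unsigned int shared_mapping : 1;
  };

/* Decode FLAGS, e.g. "rd wr sh dd", into *V.  FLAGS is modified in
   place: strtok_r overwrites each delimiter it consumes with '\0'.  */
static void
sketch_decode_vmflags (char *flags, struct sketch_vmflags *v)
{
  char *saveptr;
  const char *s;

  assert (flags != NULL && v != NULL);
  memset (v, 0, sizeof (*v));

  for (s = strtok_r (flags, " \n", &saveptr);
       s != NULL;
       s = strtok_r (NULL, " \n", &saveptr))
    {
      if (strcmp (s, "io") == 0)
        v->io_page = 1;
      else if (strcmp (s, "ht") == 0)
        v->uses_huge_tlb = 1;
      else if (strcmp (s, "dd") == 0)
        v->exclude_coredump = 1;
      else if (strcmp (s, "sh") == 0)
        v->shared_mapping = 1;
    }
}
```

The tokens are only compared, never written through, which is why 's'
can be 'const char *' even though the buffer itself has to be writable.
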

>> +
>> +  v->initialized_p = 1;
>> +  p = skip_to_space (p);
>> +  p = skip_spaces (p);
>> +
>> +  for (s = strtok_r (p, " ", &saveptr);
>> +       s != NULL;
>> +       s = strtok_r (NULL, " ", &saveptr))
>> +    {
>> +      if (strcmp (s, "io") == 0)
>> +	v->io_page = 1;
>> +      else if (strcmp (s, "ht") == 0)
>> +	v->uses_huge_tlb = 1;
>> +      else if (strcmp (s, "dd") == 0)
>> +	v->exclude_coredump = 1;
>> +      else if (strcmp (s, "sh") == 0)
>> +	v->shared_mapping = 1;
>> +    }
>> +}
>> +
>> +/* Return 1 if the memory mapping is anonymous, 0 otherwise.
>> +
>> +   FILENAME is the name of the file present in the first line of the
>> +   memory mapping, in the "/proc/PID/smaps" output.  For example, if
>> +   the first line is:
>> +
>> +   7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770   /path/to/file
>> +
>> +   Then FILENAME will be "/path/to/file".  */
>> +
>> +static int
>> +mapping_is_anonymous_p (const char *filename)
>
> As Oleg mentioned, these aren't really anonymous.  Is there a different
> term we can use, or improve the comment?

Well, for GDB those *are* anonymous mappings (in the sense of
MAP_ANONYMOUS).  In fact, I am not sure why Oleg said these mappings are
not anonymous; they are not file-backed either.  Maybe he means that we
shouldn't say only "anonymous" without specifying whether the mapping is
shared or private...  Anyway, I will have to discuss with him before I
can (a) understand why we can't say "anonymous" here, and (b) suggest a
better term, if there is one.

>> +{
>> +  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
>> +  static int init_regex_p = 0;
>> +
>> +  if (!init_regex_p)
>> +    {
>> +      struct cleanup *c = make_cleanup (null_cleanup, NULL);
>> +
>> +      init_regex_p = 1;
>> +      compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?$",
>> +			   _("Could not compile regex to match /dev/zero "
>> +			     "filename"));
>> +      compile_rx_or_error (&shmem_file_regex,
>> +			   "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$",
>> +			   _("Could not compile regex to match shmem "
>> +			     "filenames"));
>
> Could you add some comment about what these regexes above are for?

Done.

>> +      /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the
>> +	 Linux kernel's 'n_link == 0' code, which is responsible to
>> +	 decide if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS'
>> +	 mapping.  In other words, if FILE_DELETED_REGEX matches, it
>> +	 does not necessarily mean that we are dealing with an
>> +	 anonymous shared mapping.  However, there is no easy way to
>> +	 detect this currently, so this is the best approximation we
>> +	 have.
>> +
>> +	 As a result, GDB will dump readonly pages of deleted
>> +	 executables when using the default value of coredump_filter
>> +	 (0x33), while the Linux kernel will not dump those pages.
>> +	 But we can live with that.  */
>> +      compile_rx_or_error (&file_deleted_regex, " (deleted)$",
>> +			   _("Could not compile regex to match "
>> +			     "'<file> (deleted)'"));
>> +      /* We will never release these regexes, so just discard the
>> +	 cleanups.  */
>> +      discard_cleanups (c);
>> +    }
>> +
>
> Above, on error, init_regex_p is left set to the same value as
> if no error was thrown.  Seems like then the second time
> this function is called we'll reach here with invalid compiled
> regexes:
>
>> +  if (*filename == '\0'
>> +      || regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
>> +      || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
>> +      || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0)
>> +    return 1;
>
> I think it'd be safer if the top did:
>
> if (init_regex_p == -1)
>   return 1; // assume anonymous ?
>
> if (init_regex_p == 0)
>   {
>     init_regex_p = -1; /* assume error */
>
>     compile_rx_or_error ();
>
>     init_regex_p = 1; /* success! */
> }

I improved this code.  Instead of just assuming that the mapping is
anonymous when the regexes could not be compiled, I decided to look
(using strstr) for the string "(deleted)" in the filename.  I think
this is the least we can do to try to guess what kind of mapping we're
dealing with.
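
Roughly, the idea looks like the sketch below.  It is a standalone
approximation rather than the GDB code: it calls POSIX regcomp directly
instead of compile_rx_or_error, uses extended regex syntax, and the
function name is invented for this example.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Return 1 if FILENAME (taken from a /proc/PID/smaps header line)
   looks like an anonymous mapping, 0 otherwise.  */
static int
sketch_mapping_is_anonymous_p (const char *filename)
{
  static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex;
  /* 0: compilation not attempted yet; 1: regexes are usable;
     -1: compilation failed, use the strstr fallback.  */
  static int init_regex_p = 0;

  assert (filename != NULL);

  if (init_regex_p == 0)
    {
      const int cflags = REG_EXTENDED | REG_NOSUB;

      /* Assume failure until every regex has been compiled.  */
      init_regex_p = -1;
      if (regcomp (&dev_zero_regex,
                   "^/dev/zero( \\(deleted\\))?$", cflags) == 0
          && regcomp (&shmem_file_regex,
                      "^/?SYSV[0-9a-fA-F]{8}( \\(deleted\\))?$", cflags) == 0
          && regcomp (&file_deleted_regex,
                      " \\(deleted\\)$", cflags) == 0)
        init_regex_p = 1;
    }

  /* A mapping without a filename is always treated as anonymous.  */
  if (*filename == '\0')
    return 1;

  /* Fallback: the regexes are not available, so guess from the
     "(deleted)" marker alone.  */
  if (init_regex_p == -1)
    return strstr (filename, "(deleted)") != NULL;

  return (regexec (&dev_zero_regex, filename, 0, NULL, 0) == 0
          || regexec (&shmem_file_regex, filename, 0, NULL, 0) == 0
          || regexec (&file_deleted_regex, filename, 0, NULL, 0) == 0);
}
```

Setting the flag to -1 before compiling means that a failure leaves it
in the "error" state, so a later call never touches a half-initialized
regex_t.
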

>> +
>> +  return 0;
>> +}
>> +
>> +/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V,
>> +   MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or
>> +   greater than 0 if it should.  */
>> +
>> +static int
>> +dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v,
>> +		int maybe_private_p, int mapping_anon_p, const char *filename)
>> +{
>> +  /* Initially, we trust in what we received from outside.  This value
>> +     may not be very precise (i.e., it was probably gathered from the
>> +     permission line in the /proc/PID/smaps list, which actually
>> +     refers to VM_MAYSHARE, and not VM_SHARED), but it is what we have
>> +     for now.  */
>> +  int private_p = maybe_private_p;
>> +
>> +  /* We always dump vDSO and vsyscall mappings.  */
>
> Add comment on why this is special cased?
>
> In the v1 patch intro you said:
>
>  "now also respects the MADV_DONTDUMP flag and does not dump the memory
>  mapping marked as so, and won't try to dump "[vsyscall]" or "[vdso]"
>  mappings as before (just like the Linux kernel)."
>
> Was that incorrect then?

Yeah, this was a thinko; Jan corrected it right after I sent the
message.  What I meant was that GDB *will* dump those mappings.

>> +  if (strcmp ("[vdso]", filename) == 0
>> +      || strcmp ("[vsyscall]", filename) == 0)
>> +    return 1;
>> +
>> +  if (v->initialized_p)
>> +    {
>> +      /* We never dump I/O mappings.  */
>> +      if (v->io_page)
>> +	return 0;
>> +
>> +      /* Check if we should exclude this mapping.  */
>> +      if (v->exclude_coredump)
>> +	return 0;
>> +
>> +      /* Updating our notion of whether this mapping is shared or
>
> s/Updating/Update/

Fixed.

>> +	 private based on a trustworthy value.  */
>> +      private_p = !v->shared_mapping;
>> +
>> +      /* HugeTLB checking.  */
>> +      if (v->uses_huge_tlb)
>> +	{
>> +	  if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE))
>> +	      || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED)))
>> +	    return 1;
>> +
>> +	  return 0;
>> +	}
>> +    }
>> +
>> +  if (private_p)
>> +    {
>> +      if (mapping_anon_p)
>> +	return (filterflags & COREFILTER_ANON_PRIVATE) != 0;
>> +      else
>> +	return (filterflags & COREFILTER_MAPPED_PRIVATE) != 0;
>> +    }
>> +  else
>> +    {
>> +      if (mapping_anon_p)
>> +	return (filterflags & COREFILTER_ANON_SHARED) != 0;
>> +      else
>> +	return (filterflags & COREFILTER_MAPPED_SHARED) != 0;
>> +    }
>> +}
>> +
>>  /* Implement the "info proc" command.  */
>>  
>>  static void
>> @@ -807,7 +1009,8 @@ linux_core_info_proc (struct gdbarch *gdbarch, const char *args,
>>  typedef int linux_find_memory_region_ftype (ULONGEST vaddr, ULONGEST size,
>>  					    ULONGEST offset, ULONGEST inode,
>>  					    int read, int write,
>> -					    int exec, int modified,
>> +					    int exec,
>> +					    enum memory_mapping_state state,
>>  					    const char *filename,
>>  					    void *data);
>>  
>> @@ -819,48 +1022,84 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
>>  				void *obfd)
>>  {
>>    char mapsfilename[100];
>> -  char *data;
>> +  char coredumpfilter_name[100];
>> +  char *data, *coredumpfilterdata;
>> +  pid_t pid;
>> +  /* Default dump behavior of coredump_filter (0x33), according to
>> +     Documentation/filesystems/proc.txt from the Linux kernel
>> +     tree.  */
>> +  unsigned int filterflags = (COREFILTER_ANON_PRIVATE
>> +			      | COREFILTER_ANON_SHARED
>> +			      | COREFILTER_ELF_HEADERS
>> +			      | COREFILTER_HUGETLB_PRIVATE);
>>  
>>    /* We need to know the real target PID to access /proc.  */
>>    if (current_inferior ()->fake_pid_p)
>>      return 1;
>>  
>> -  xsnprintf (mapsfilename, sizeof mapsfilename,
>> -	     "/proc/%d/smaps", current_inferior ()->pid);
>> +  pid = current_inferior ()->pid;
>> +
>> +  if (use_coredump_filter)
>> +    {
>> +      xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name),
>> +		 "/proc/%d/coredump_filter", pid);
>> +      coredumpfilterdata = target_fileio_read_stralloc (coredumpfilter_name);
>> +      if (coredumpfilterdata != NULL)
>> +	{
>> +	  sscanf (coredumpfilterdata, "%x", &filterflags);
>> +	  xfree (coredumpfilterdata);
>> +	}
>> +    }
>> +
>> +  xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid);
>>    data = target_fileio_read_stralloc (mapsfilename);
>>    if (data == NULL)
>>      {
>>        /* Older Linux kernels did not support /proc/PID/smaps.  */
>> -      xsnprintf (mapsfilename, sizeof mapsfilename,
>> -		 "/proc/%d/maps", current_inferior ()->pid);
>> +      xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid);
>>        data = target_fileio_read_stralloc (mapsfilename);
>>      }
>> -  if (data)
>> +
>> +  if (data != NULL)
>>      {
>>        struct cleanup *cleanup = make_cleanup (xfree, data);
>> -      char *line;
>> +      char *line, *t;
>>  
>> -      line = strtok (data, "\n");
>> -      while (line)
>> +      line = strtok_r (data, "\n", &t);
>> +      while (line != NULL)
>>  	{
>>  	  ULONGEST addr, endaddr, offset, inode;
>>  	  const char *permissions, *device, *filename;
>> +	  struct smaps_vmflags v;
>>  	  size_t permissions_len, device_len;
>> -	  int read, write, exec;
>> -	  int modified = 0, has_anonymous = 0;
>> +	  int read, write, exec, private;
>> +	  enum memory_mapping_state state;
>> +	  int has_anonymous = 0;
>> +	  int mapping_anon_p;
>>  
>> +	  memset (&v, 0, sizeof (v));
>>  	  read_mapping (line, &addr, &endaddr, &permissions, &permissions_len,
>>  			&offset, &device, &device_len, &inode, &filename);
>> +	  mapping_anon_p = mapping_is_anonymous_p (filename);
>>  
>>  	  /* Decode permissions.  */
>>  	  read = (memchr (permissions, 'r', permissions_len) != 0);
>>  	  write = (memchr (permissions, 'w', permissions_len) != 0);
>>  	  exec = (memchr (permissions, 'x', permissions_len) != 0);
>> +	  /* 'private' here actually means VM_MAYSHARE, and not
>> +	     VM_SHARED.  In order to know if a mapping is really
>> +	     private or not, we must check the flag "sh" in the
>> +	     VmFlags field.  This is done by decode_vmflags.  However,
>> +	     if we are using an old Linux kernel, we will not have the
>
> It's best to avoid "old", "new", etc.  New will get old soon too.
> Do we have some version string/number to put here instead?
> Likewise other places.

Yeah.  Fixed.

>> +	     VmFlags there.  In this case, there is really no way to
>> +	     know if we are dealing with VM_SHARED, so we just assume
>> +	     that VM_MAYSHARE is enough.  */
>> +	  private = memchr (permissions, 'p', permissions_len) != 0;
>>  
>>  	  /* Try to detect if region was modified by parsing smaps counters.  */
>> -	  for (line = strtok (NULL, "\n");
>> -	       line && line[0] >= 'A' && line[0] <= 'Z';
>> -	       line = strtok (NULL, "\n"))
>> +	  for (line = strtok_r (NULL, "\n", &t);
>> +	       line != NULL && line[0] >= 'A' && line[0] <= 'Z';
>> +	       line = strtok_r (NULL, "\n", &t))
>>  	    {
>>  	      char keyword[64 + 1];
>>  
>> @@ -869,11 +1108,17 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
>>  		  warning (_("Error parsing {s,}maps file '%s'"), mapsfilename);
>>  		  break;
>>  		}
>> +
>>  	      if (strcmp (keyword, "Anonymous:") == 0)
>> -		has_anonymous = 1;
>> -	      if (strcmp (keyword, "Shared_Dirty:") == 0
>> -		  || strcmp (keyword, "Private_Dirty:") == 0
>> -		  || strcmp (keyword, "Swap:") == 0
>> +		{
>> +		  /* Older Linux kernels did not support the
>> +		     "Anonymous:" counter.  Check it here.  */
>> +		  has_anonymous = 1;
>> +		}
>> +	      else if (strcmp (keyword, "VmFlags:") == 0)
>> +		decode_vmflags (line, &v);
>> +
>> +	      if (strcmp (keyword, "AnonHugePages:") == 0
>>  		  || strcmp (keyword, "Anonymous:") == 0)
>>  		{
>>  		  unsigned long number;
>> @@ -884,19 +1129,43 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch,
>>  			       mapsfilename);
>>  		      break;
>>  		    }
>> -		  if (number != 0)
>> -		    modified = 1;
>> +		  if (number > 0)
>> +		    {
>> +		      /* Even if we are dealing with a file-backed
>> +			 mapping, if it contains anonymous pages we
>> +			 consider it to be an anonymous mapping,
>> +			 because this is what the Linux kernel does:
>> +
>> +			 // Dump segments that have been written to.
>> +			 if (vma->anon_vma && FILTER(ANON_PRIVATE))
>> +			 	goto whole;
>> +		      */
>> +		      mapping_anon_p = 1;
>> +		    }
>>  		}
>>  	    }
>>  
>> -	  /* Older Linux kernels did not support the "Anonymous:" counter.
>> -	     If it is missing, we can't be sure - dump all the pages.  */
>> -	  if (!has_anonymous)
>> -	    modified = 1;
>> +	  /* If a mapping should not be dumped we still should create
>> +	     a segment for it, just without SEC_LOAD (see
>> +	     gcore_create_callback).  */
>
> I saw gcore_create_callback, but I ended up still clueless.  :-)

I hope my explanation above helped you with this.  I will also try to
improve the comments in the code.

>> +	  if (has_anonymous)
>> +	    {
>> +	      if (dump_mapping_p (filterflags, &v, private, mapping_anon_p,
>> +				  filename))
>> +		state = MEMORY_MAPPING_MODIFIED;
>> +	      else
>> +		state = MEMORY_MAPPING_UNMODIFIED;
>> +	    }
>> +	  else
>> +	    {
>> +	      /* Older Linux kernels did not support the "Anonymous:" counter.
>> +		 If it is missing, we can't be sure - dump all the pages.  */
>> +	      state = MEMORY_MAPPING_UNKNOWN_STATE;
>> +	    }
>>  
>>  	  /* Invoke the callback function to create the corefile segment.  */
>>  	  func (addr, endaddr - addr, offset, inode,
>> -		read, write, exec, modified, filename, obfd);
>> +		read, write, exec, state, filename, obfd);
>>  	}
>>  
>>        do_cleanups (cleanup);
>> @@ -926,12 +1195,13 @@ struct linux_find_memory_regions_data
>>  static int
>>  linux_find_memory_regions_thunk (ULONGEST vaddr, ULONGEST size,
>>  				 ULONGEST offset, ULONGEST inode,
>> -				 int read, int write, int exec, int modified,
>> +				 int read, int write, int exec,
>> +				 enum memory_mapping_state state,
>
> read/write, etc. are also state.  "enum memory_mapping_state state"
> doesn't really indicate immediately what it's about.  How about
> using "modified_state" for the variable name?  (here and elsewhere).
> I think the end result will be more readable.

Right, fixed.

>>  				 const char *filename, void *arg)
>>  {
>>    struct linux_find_memory_regions_data *data = arg;
>>  
>> -  return data->func (vaddr, size, read, write, exec, modified, data->obfd);
>> +  return data->func (vaddr, size, read, write, exec, state, data->obfd);
>>  }
>>  
>>  /* A variant of linux_find_memory_regions_full that is suitable as the
>> @@ -1074,7 +1344,8 @@ static linux_find_memory_region_ftype linux_make_mappings_callback;
>>  static int
>>  linux_make_mappings_callback (ULONGEST vaddr, ULONGEST size,
>>  			      ULONGEST offset, ULONGEST inode,
>> -			      int read, int write, int exec, int modified,
>> +			      int read, int write, int exec,
>> +			      enum memory_mapping_state state,
>>  			      const char *filename, void *data)
>>  {
>>    struct linux_make_mappings_data *map_data = data;
>> @@ -1872,7 +2143,8 @@ linux_gdb_signal_to_target (struct gdbarch *gdbarch,
>>  
>>  static int
>>  find_mapping_size (CORE_ADDR vaddr, unsigned long size,
>> -		   int read, int write, int exec, int modified,
>> +		   int read, int write, int exec,
>> +		   enum memory_mapping_state state,
>>  		   void *data)
>>  {
>>    struct mem_range *range = data;
>> @@ -1972,6 +2244,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot)
>>    return retval;
>>  }
>>  
>> +/* Display whether the gcore command is using the
>> +   /proc/PID/coredump_filter file.  */
>> +
>> +static void
>> +show_use_coredump_filter (struct ui_file *file, int from_tty,
>> +			  struct cmd_list_element *c, const char *value)
>> +{
>> +  fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to generate"
>> +			    " corefiles is %s.\n"), value);
>> +}
>> +
>>  /* To be called from the various GDB_OSABI_LINUX handlers for the
>>     various GNU/Linux architectures and machine types.  */
>>  
>> @@ -2008,4 +2291,16 @@ _initialize_linux_tdep (void)
>>    /* Observers used to invalidate the cache when needed.  */
>>    observer_attach_inferior_exit (invalidate_linux_cache_inf);
>>    observer_attach_inferior_appeared (invalidate_linux_cache_inf);
>> +
>> +  add_setshow_boolean_cmd ("use-coredump-filter", class_files,
>> +			   &use_coredump_filter, _("\
>> +Set whether gcore should consider /proc/PID/coredump_filter."),
>> +			   _("\
>> +Show whether gcore should consider /proc/PID/coredump_filter."),
>> +			   _("\
>> +Use this command to set whether gcore should consider the contents\n\
>> +of /proc/PID/coredump_filter when generating the corefile.  For more information\n\
>> +about this file, refer to the manpage of core(5)."),
>> +			   NULL, show_use_coredump_filter,
>> +			   &setlist, &showlist);
>>  }
>> diff --git a/gdb/procfs.c b/gdb/procfs.c
>> index b62539f..d074dd3 100644
>> --- a/gdb/procfs.c
>> +++ b/gdb/procfs.c
>> @@ -4967,7 +4967,7 @@ find_memory_regions_callback (struct prmap *map,
>>  		  (map->pr_mflags & MA_READ) != 0,
>>  		  (map->pr_mflags & MA_WRITE) != 0,
>>  		  (map->pr_mflags & MA_EXEC) != 0,
>> -		  1, /* MODIFIED is unknown, pass it as true.  */
>> +		  MEMORY_MAPPING_UNKNOWN_STATE, /* MODIFIED is unknown.  */
>>  		  data);
>>  }
>>  
>> diff --git a/gdb/testsuite/gdb.base/coredump-filter.c b/gdb/testsuite/gdb.base/coredump-filter.c
>> new file mode 100644
>> index 0000000..192c469
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.base/coredump-filter.c
>> @@ -0,0 +1,61 @@
>> +/* Copyright 2015 Free Software Foundation, Inc.
>> +
>> +   This file is part of GDB.
>> +
>> +   This program is free software; you can redistribute it and/or modify
>> +   it under the terms of the GNU General Public License as published by
>> +   the Free Software Foundation; either version 3 of the License, or
>> +   (at your option) any later version.
>> +
>> +   This program is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +   GNU General Public License for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
>> +
>> +#define _GNU_SOURCE
>> +#include <stdlib.h>
>> +#include <assert.h>
>> +#include <unistd.h>
>> +#include <stdio.h>
>> +#include <sys/mman.h>
>> +#include <errno.h>
>> +#include <string.h>
>> +
>> +static void *
>> +do_mmap (void *addr, size_t size, int prot, int flags, int fd, off_t offset)
>> +{
>> +  void *ret = mmap (addr, size, prot, flags, fd, offset);
>> +
>> +  assert (ret != NULL);
>> +  return ret;
>> +}
>> +
>> +int
>> +main (int argc, char *argv[])
>> +{
>> +  const size_t size = 10;
>> +  const int default_prot = PROT_READ | PROT_WRITE;
>> +  char *private_anon, *shared_anon;
>> +  char *dont_dump;
>> +  int i;
>> +
>> +  private_anon = do_mmap (NULL, size, default_prot,
>> +			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>> +  memset (private_anon, 0x11, size);
>> +
>> +  shared_anon = do_mmap (NULL, size, default_prot,
>> +			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
>> +  memset (shared_anon, 0x22, size);
>> +
>> +  dont_dump = do_mmap (NULL, size, default_prot,
>> +		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>> +  memset (dont_dump, 0x55, size);
>> +  i = madvise (dont_dump, size, MADV_DONTDUMP);
>> +  assert_perror (errno);
>> +  assert (i == 0);
>> +
>> +  return 0; /* break-here */
>> +}
>> diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
>> new file mode 100644
>> index 0000000..c7ae91d
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.base/coredump-filter.exp
>> @@ -0,0 +1,129 @@
>> +# Copyright 2015 Free Software Foundation, Inc.
>> +
>> +# This program is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 3 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> +
>> +standard_testfile
>> +
>> +if { [prepare_for_testing "failed to prepare" $testfile $srcfile debug] } {
>> +    untested $testfile.exp
>> +    return -1
>> +}
>> +
>> +if { ![runto_main] } {
>> +    untested $testfile.exp
>> +    return -1
>> +}
>> +
>> +gdb_breakpoint [gdb_get_line_number "break-here"]
>> +gdb_continue_to_breakpoint "break-here" ".* break-here .*"
>> +
>> +proc do_save_core { filter_flag core ipid } {
>> +    verbose -log "writing $filter_flag to /proc/$ipid/coredump_filter"
>> +    if { [catch {open /proc/$ipid/coredump_filter w} fileid] } {
>
> This is opening the /proc file on the build machine, but it
> should be the file on the target machine.  Can you use
> "remote_file target" for this?
>
> If not, perhaps something around:
>
>  remote_exec target "echo $filter_flag > /proc/$ipid/coredump_filter"
>
> ?

Fixed.  I chose to use the second method.

>> +	untested $testfile.exp
>> +	return -1
>> +    }
>> +
>> +    # Set coredump_filter to the value we want
>> +    puts $fileid $filter_flag
>> +    close $fileid
>> +
>> +    # Generate a corefile
>> +    gdb_gcore_cmd "$core" "save corefile $core"
>> +}
>> +
>> +proc do_load_and_test_core { core var working_var working_value } {
>> +    global hex decimal addr
>> +
>> +    set core_loaded [gdb_core_cmd "$core" "load $core"]
>> +    if { $core_loaded == -1 } {
>> +	fail "loading $core"
>> +	return
>> +    }
>> +
>> +    # Use 'int' as any variants of 'char' try to read the target bytes.
>
> I don't understand this comment.

This is not really needed, but I can expand on the comment.

Jan suggested that I use the "*(unsigned int *)" form to access the
variable's contents because it is clearer this way that I am interested
in the contents, rather than in the address itself.  It is just to make the
code clearer, but I will explain better.

>> +    gdb_test "print *(unsigned int *) $addr($var)" "\(\\\$$decimal = <error: \)?Cannot access memory at address $hex\(>\)?" \
>> +	"printing $var when core is loaded (should not work)"
>> +    gdb_test "print/x *(unsigned int *) $addr($working_var)" " = $working_value.*" \
>> +	"print/x *$working_var ( = $working_value)"
>> +}
>> +
>> +set non_private_anon_core [standard_output_file non-private-anon.gcore]
>> +set non_shared_anon_core [standard_output_file non-shared-anon.gcore]
>> +set dont_dump_core [standard_output_file dont-dump.gcore]
>> +
>> +# We will generate a few corefiles
>
> Missing period.

Fixed.

>> +#
>> +# This list is composed by sub-lists, and their elements are (in
>> +# order):
>> +#
>> +# - name of the test
>> +# - hexadecimal value to be put in the /proc/PID/coredump_filter file
>> +# - name of the variable that contains the name of the corefile to be
>> +#   generated (including the initial $).
>> +# - name of the variable in the C source code that points to the
>> +#   memory mapping that will NOT be present in the corefile.
>> +# - name of a variable in the C source code that points to a memory
>> +#   mapping that WILL be present in the corefile
>> +# - corresponding value expected for the above variable
>> +
>> +set all_corefiles { { "non-Private-Anonymous" "0x7e" \
>> +			  $non_private_anon_core \
>> +			  "private_anon" \
>> +			  "shared_anon" "0x22" }
>> +    { "non-Shared-Anonymous" "0x7d" \
>> +	  $non_shared_anon_core "shared_anon" \
>> +	  "private_anon" "0x11" }
>> +    { "DoNotDump" "0x33" \
>> +	  $dont_dump_core "dont_dump" \
>> +	  "shared_anon" "0x22" } }
>> +
>> +set core_supported [gdb_gcore_cmd "$non_private_anon_core" "save a corefile"]
>> +if { !$core_supported } {
>> +    untested $testfile.exp
>
> https://sourceware.org/gdb/wiki/GDBTestcaseCookbook#A.22untested.22_calls

Fixed.

>> +    return -1
>> +}
>> +
>> +# Getting the inferior's PID
>> +gdb_test_multiple "info inferiors" "getting inferior pid" {
>> +    -re "process \($decimal\).*\r\n$gdb_prompt $" {
>> +	set infpid $expect_out(1,string)
>> +    }
>> +}
>
> Don't leave infpid undefined on gdb_test_multiple failure.
> Set it upfront:
>
>   set infpid ""
>   gdb_test_multiple "info inferiors" "getting inferior pid" {
>     ...

Fixed.

>
>> +
>> +foreach item $all_corefiles {
>> +    foreach name [list [lindex $item 3] [lindex $item 4]] {
>> +	set test "print/x $name"
>> +	gdb_test_multiple $test $test {
>> +	    -re " = \($hex\)\r\n$gdb_prompt $" {
>> +		set addr($name) $expect_out(1,string)
>
> I'm probably being dense, but I can't see where is addr
> ever used?

'addr' is an associative array that maps variable names to addresses.
This is needed because sometimes (depending on which pages we
dump/ignore), the stack is not dumped and GDB would not be able to find
the variable names; therefore, this dependency is removed here.

>> +	    }
>> +	}
>> +    }
>> +}
>> +
>> +foreach item $all_corefiles {
>> +    with_test_prefix "saving corefile for [lindex $item 0]" {
>> +	do_save_core [lindex $item 1] [subst [lindex $item 2]] $infpid
>> +    }
>> +}
>> +
>> +clean_restart $testfile
>> +
>> +foreach item $all_corefiles {
>> +    with_test_prefix "loading and testing corefile for [lindex $item 0]" {
>> +	do_load_and_test_core [subst [lindex $item 2]] [lindex $item 3] \
>> +	    [lindex $item 4] [lindex $item 5]
>> +    }
>> +}
>
> Thanks,
> Pedro Alves


I will not send the full patch again because I intend to split it into
minor, more logical patches.  I should be able to send it later
today/tomorrow.

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-16 19:46                     ` Oleg Nesterov
@ 2015-03-17 13:45                       ` Oleg Nesterov
  2015-03-18  1:45                         ` Andy Lutomirski
  0 siblings, 1 reply; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-17 13:45 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/16, Oleg Nesterov wrote:
>
> On 03/16, Andy Lutomirski wrote:
> >
> > Ick, you're probably right.  For what it's worth, the vdso *seems* to
> > be okay (on 64-bit only, and only if you don't poke at it too hard) if
> > you mremap it in one piece.  CRIU does that.
>
> I need to run away till tomorrow, but looking at this code even if "one piece"
> case doesn't look right if it was cow'ed. I'll verify tomorrow.

And I am still not sure this all is 100% correct, but I got lost in this code.
Probably this is fine...

But at least the bug exposed by the test-case looks clear:

	do_linear_fault:

		vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
				+ vma->vm_pgoff;
		...

		special_mapping_fault:

			pgoff = vmf->pgoff - vma->vm_pgoff;


So special_mapping_fault() can only work if this mapping starts from the
first page in ->pages[].

So perhaps we need _something like_ the (wrong/incomplete) patch below...

Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
simply mmap the anon_inode file...

Oleg.

--- x/mm/mmap.c
+++ x/mm/mmap.c
@@ -2832,6 +2832,8 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
 	return 0;
 }
 
+bool is_special_vma(struct vm_area_struct *vma);
+
 /*
  * Copy the vma structure to a new location in the same mm,
  * prior to moving page table entries, to effect an mremap move.
@@ -2851,7 +2853,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	 * If anonymous vma has not yet been faulted, update new pgoff
 	 * to match new location, to increase its chance of merging.
 	 */
-	if (unlikely(!vma->vm_file && !vma->anon_vma)) {
+	if (unlikely(!vma->vm_file && !is_special_vma(vma) && !vma->anon_vma)) {
 		pgoff = addr >> PAGE_SHIFT;
 		faulted_in_anon_vma = false;
 	}
@@ -2953,6 +2955,11 @@ static const struct vm_operations_struct legacy_special_mapping_vmops = {
 	.fault = special_mapping_fault,
 };
 
+bool is_special_vma(struct vm_area_struct *vma)
+{
+	return vma->vm_ops == &special_mapping_vmops;
+}
+
 static int special_mapping_fault(struct vm_area_struct *vma,
 				struct vm_fault *vmf)
 {
@@ -2965,7 +2972,7 @@ static int special_mapping_fault(struct vm_area_struct *vma,
 	 * We are allowed to do this because we are the mm; do not copy
 	 * this code into drivers!
 	 */
-	pgoff = vmf->pgoff - vma->vm_pgoff;
+	pgoff = vmf->pgoff;
 
 	if (vma->vm_ops == &legacy_special_mapping_vmops)
 		pages = vma->vm_private_data;
@@ -3014,6 +3021,7 @@ static struct vm_area_struct *__install_special_mapping(
 	if (ret)
 		goto out;
 
+	vma->vm_pgoff = 0;
 	mm->total_vm += len >> PAGE_SHIFT;
 
 	perf_event_mmap(vma);
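
The offset arithmetic discussed above can be reproduced in userspace.  What
follows is only a sketch under stated assumptions: struct toy_vma, the helper
names and split_upper() are hypothetical stand-ins, not kernel code, and the
split is assumed to advance vm_pgoff of the upper part by the number of pages
skipped, as happens for ordinary mappings.  It shows that subtracting
vma->vm_pgoff yields an index relative to vm_start, which only matches the
index into ->pages[] while the mapping still starts at the first page.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Simplified stand-in for struct vm_area_struct; not the kernel type.  */
struct toy_vma
{
  unsigned long vm_start;	/* first address covered by the mapping */
  unsigned long vm_pgoff;	/* offset of vm_start, in pages */
};

/* Mirrors the do_linear_fault computation of vmf->pgoff.  */
static unsigned long
fault_pgoff (const struct toy_vma *vma, unsigned long address)
{
  assert (address >= vma->vm_start);
  return (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
	 + vma->vm_pgoff;
}

/* Current special_mapping_fault: the result is relative to vm_start.  */
static unsigned long
pages_index_current (const struct toy_vma *vma, unsigned long address)
{
  return fault_pgoff (vma, address) - vma->vm_pgoff;
}

/* With the proposed change: vm_pgoff starts at 0 when the mapping is
   installed, and vmf->pgoff is used as the index directly.  */
static unsigned long
pages_index_patched (const struct toy_vma *vma, unsigned long address)
{
  return fault_pgoff (vma, address);
}

/* Model a split at ADDR and return the upper part (assumption: vm_pgoff
   is advanced by the number of pages skipped).  */
static struct toy_vma
split_upper (const struct toy_vma *vma, unsigned long addr)
{
  struct toy_vma upper;

  assert (addr >= vma->vm_start);
  upper.vm_start = addr;
  upper.vm_pgoff = vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT);
  return upper;
}
```

For a mapping installed at 0x70000000 and split at 0x70001000, a fault on the
first page of the upper part gives index 0 with the current code, although
that page is ->pages[1]; with vm_pgoff initialised to 0 the same fault gives
index 1.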


* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-17 13:45                       ` Oleg Nesterov
@ 2015-03-18  1:45                         ` Andy Lutomirski
  2015-03-18 18:08                           ` Oleg Nesterov
  0 siblings, 1 reply; 46+ messages in thread
From: Andy Lutomirski @ 2015-03-18  1:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On Tue, Mar 17, 2015 at 6:43 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/16, Oleg Nesterov wrote:
>>
>> On 03/16, Andy Lutomirski wrote:
>> >
>> > Ick, you're probably right.  For what it's worth, the vdso *seems* to
>> > be okay (on 64-bit only, and only if you don't poke at it too hard) if
>> > you mremap it in one piece.  CRIU does that.
>>
>> I need to run away till tomorrow, but looking at this code even if "one piece"
>> case doesn't look right if it was cow'ed. I'll verify tomorrow.
>
> And I am still not sure this all is 100% correct, but I got lost in this code.
> Probably this is fine...
>
> But at least the bug exposed by the test-case looks clear:
>
>         do_linear_fault:
>
>                 vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
>                                 + vma->vm_pgoff;
>                 ...
>
>                 special_mapping_fault:
>
>                         pgoff = vmf->pgoff - vma->vm_pgoff;
>
>
> So special_mapping_fault() can only work if this mapping starts from the
> first page in ->pages[].
>
> So perhaps we need _something like_ the (wrong/incomplete) patch below...
>
> Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
> simply mmap the anon_inode file...

That's slightly tricky, I think, because it could start showing up in
/proc/PID/map_files or whatever it's called, and I don't think we want
that.  I also don't want to commit to all special mappings everywhere
being semantically identical (there are already two kinds on both x86
and arm64, and I'd eventually like to have them vary per-process as
well).  None of that precludes using non-null vm_file, but it's a
complication.

Your patch does look like a considerable improvement, though.  Let me
see if I can find some time to fold it in with the rest of my special
mapping rework over the next few days.

--Andy

>
> Oleg.
>
> --- x/mm/mmap.c
> +++ x/mm/mmap.c
> @@ -2832,6 +2832,8 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
>         return 0;
>  }
>
> +bool is_special_vma(struct vm_area_struct *vma);
> +
>  /*
>   * Copy the vma structure to a new location in the same mm,
>   * prior to moving page table entries, to effect an mremap move.
> @@ -2851,7 +2853,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>          * If anonymous vma has not yet been faulted, update new pgoff
>          * to match new location, to increase its chance of merging.
>          */
> -       if (unlikely(!vma->vm_file && !vma->anon_vma)) {
> +       if (unlikely(!vma->vm_file && !is_special_vma(vma) && !vma->anon_vma)) {
>                 pgoff = addr >> PAGE_SHIFT;
>                 faulted_in_anon_vma = false;
>         }
> @@ -2953,6 +2955,11 @@ static const struct vm_operations_struct legacy_special_mapping_vmops = {
>         .fault = special_mapping_fault,
>  };
>
> +bool is_special_vma(struct vm_area_struct *vma)
> +{
> +       return vma->vm_ops == &special_mapping_vmops;
> +}
> +
>  static int special_mapping_fault(struct vm_area_struct *vma,
>                                 struct vm_fault *vmf)
>  {
> @@ -2965,7 +2972,7 @@ static int special_mapping_fault(struct vm_area_struct *vma,
>          * We are allowed to do this because we are the mm; do not copy
>          * this code into drivers!
>          */
> -       pgoff = vmf->pgoff - vma->vm_pgoff;
> +       pgoff = vmf->pgoff;
>
>         if (vma->vm_ops == &legacy_special_mapping_vmops)
>                 pages = vma->vm_private_data;
> @@ -3014,6 +3021,7 @@ static struct vm_area_struct *__install_special_mapping(
>         if (ret)
>                 goto out;
>
> +       vma->vm_pgoff = 0;
>         mm->total_vm += len >> PAGE_SHIFT;
>
>         perf_event_mmap(vma);
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-18  1:45                         ` Andy Lutomirski
@ 2015-03-18 18:08                           ` Oleg Nesterov
  0 siblings, 0 replies; 46+ messages in thread
From: Oleg Nesterov @ 2015-03-18 18:08 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/17, Andy Lutomirski wrote:
>
> On Tue, Mar 17, 2015 at 6:43 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > But at least the bug exposed by the test-case looks clear:
> >
> >         do_linear_fault:
> >
> >                 vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
> >                                 + vma->vm_pgoff;
> >                 ...
> >
> >                 special_mapping_fault:
> >
> >                         pgoff = vmf->pgoff - vma->vm_pgoff;
> >
> >
> > So special_mapping_fault() can only work if this mapping starts from the
> > first page in ->pages[].
> >
> > So perhaps we need _something like_ the (wrong/incomplete) patch below...
> >
> > Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
> > simply mmap the anon_inode file...
>
> That's slightly tricky, I think, because it could start showing up in
> /proc/PID/map_files or whatever it's called, and I don't think we want
> that.

Hmm. To me this looks like an improvement. And again, with this change
uprobe-in-vdso can work.

OK, this is off-topic right now, let's forget this for the moment.

> Your patch does look like a considerable improvement, though.  Let me
> see if I can find some time to fold it in with the rest of my special
> mapping rework over the next few days.

I'll try to recheck... Perhaps I'll send this (changed) patch for review.
This is a bugfix, even if the bug is minor.

And note that with this change vvar->access() becomes trivial. I think it
makes sense to fix "gup() fails in vvar" too. Gdb developers have enough
other problems with the poor kernel interfaces ;)

Oleg.


* Re: [PATCH v2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-16 23:53     ` Sergio Durigan Junior
@ 2015-03-18 19:10       ` Pedro Alves
  2015-03-18 19:39         ` Sergio Durigan Junior
  0 siblings, 1 reply; 46+ messages in thread
From: Pedro Alves @ 2015-03-18 19:10 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches, Jan Kratochvil, Oleg Nesterov

On 03/16/2015 11:53 PM, Sergio Durigan Junior wrote:

> To summarize: I decided to change this part of the code, and make GDB
> actually ignore (i.e., return 0) mappings marked as UNMODIFIED.  After
> all, as explained above, this is what Linux does.

I can't tell whether we'll still need the UNKNOWN state after that,
but offhand, if a mapping isn't supposed to be dumped, why not
just skip calling the callback (gcore_create_callback)?

> 
>>
>>> +
>>> +foreach item $all_corefiles {
>>> +    foreach name [list [lindex $item 3] [lindex $item 4]] {
>>> +	set test "print/x $name"
>>> +	gdb_test_multiple $test $test {
>>> +	    -re " = \($hex\)\r\n$gdb_prompt $" {
>>> +		set addr($name) $expect_out(1,string)
>>
>> I'm probably being dense, but I can't see where is addr
>> ever used?
> 
> 'addr' is an associative array that maps variable names to addresses.
> This is needed because sometimes (depending on which pages we
> dump/ignore), the stack is not dumped and GDB would not be able to find
> the variable names; therefore, this dependency is removed here.

Ah, it's used in do_load_and_test_core.  Somehow missed that the first
time.

> I will not send the full patch again because I intend to split it into
> minor, more logical patches.  I should be able to send it later
> today/tomorrow.

Thanks,
Pedro Alves


* Re: [PATCH v2] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902)
  2015-03-18 19:10       ` Pedro Alves
@ 2015-03-18 19:39         ` Sergio Durigan Junior
  0 siblings, 0 replies; 46+ messages in thread
From: Sergio Durigan Junior @ 2015-03-18 19:39 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches, Jan Kratochvil, Oleg Nesterov

On Wednesday, March 18 2015, Pedro Alves wrote:

> On 03/16/2015 11:53 PM, Sergio Durigan Junior wrote:
>
>> To summarize: I decided to change this part of the code, and make GDB
>> actually ignore (i.e., return 0) mappings marked as UNMODIFIED.  After
>> all, as explained above, this is what Linux does.
>
> I can't tell whether we'll still need the UNKNOWN state after that,
> but offhand, if a mapping isn't supposed to be dumped, why not
> just skip calling the callback (gcore_create_callback)?

That is another option.  Initially we would call gcore_create_callback
because at least the segment header should be created in the corefile;
however, as I mentioned before, Linux doesn't do that (therefore GDB
shouldn't either).
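
To illustrate the "skip the callback" option, here is a minimal sketch.  It
is not the code from the patch: struct toy_mapping, should_dump,
walk_mappings and count_region are hypothetical names, and the decision is
deliberately simplified.  It only looks at bits 0-3 of coredump_filter as
listed in core(5) plus MADV_DONTDUMP, and ignores ELF headers, huge pages and
file-backed private mappings that have been modified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values of the first four coredump_filter bits, per core(5).  */
#define FILTER_ANON_PRIVATE (1u << 0)
#define FILTER_ANON_SHARED  (1u << 1)
#define FILTER_FILE_PRIVATE (1u << 2)
#define FILTER_FILE_SHARED  (1u << 3)

/* Hypothetical, simplified description of one memory mapping.  */
struct toy_mapping
{
  bool anonymous;	/* not backed by a file */
  bool shared;		/* MAP_SHARED rather than MAP_PRIVATE */
  bool dontdump;	/* marked with MADV_DONTDUMP */
};

typedef void (*region_callback) (const struct toy_mapping *map, void *data);

/* Return true if MAP should be written to the corefile under FILTER.  */
static bool
should_dump (const struct toy_mapping *map, unsigned int filter)
{
  unsigned int bit;

  /* MADV_DONTDUMP wins regardless of the filter value.  */
  if (map->dontdump)
    return false;

  if (map->anonymous)
    bit = map->shared ? FILTER_ANON_SHARED : FILTER_ANON_PRIVATE;
  else
    bit = map->shared ? FILTER_FILE_SHARED : FILTER_FILE_PRIVATE;

  return (filter & bit) != 0;
}

/* Call CALLBACK only for the mappings that should be dumped; the
   others are skipped entirely, so nothing is created for them.  */
static void
walk_mappings (const struct toy_mapping *maps, size_t count,
	       unsigned int filter, region_callback callback, void *data)
{
  size_t i;

  assert (callback != NULL);
  for (i = 0; i < count; i++)
    if (should_dump (&maps[i], filter))
      callback (&maps[i], data);
}

/* Example callback: count how many mappings were passed on.  */
static void
count_region (const struct toy_mapping *map, void *data)
{
  (void) map;
  ++*(size_t *) data;
}
```

With the three mappings created by coredump-filter.c (private anonymous,
shared anonymous, and a private anonymous one marked MADV_DONTDUMP), the
filter values used by the testcase select one mapping for 0x7e, one for 0x7d
and two for 0x33.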

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


Thread overview: 46+ messages
2015-03-05  3:48 [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Sergio Durigan Junior
2015-03-05 15:48 ` Jan Kratochvil
2015-03-05 20:53   ` Sergio Durigan Junior
2015-03-05 20:57     ` Jan Kratochvil
2015-03-11 20:02       ` Oleg Nesterov
2015-03-12 11:31         ` Sergio Durigan Junior
2015-03-12 14:36         ` vvar, gup && coredump Oleg Nesterov
2015-03-12 16:29           ` Andy Lutomirski
2015-03-12 16:56             ` Oleg Nesterov
2015-03-12 17:18               ` Andy Lutomirski
2015-03-12 17:40                 ` Oleg Nesterov
2015-03-12 17:45                   ` Sergio Durigan Junior
2015-03-12 18:04                     ` Oleg Nesterov
2015-03-13  4:50                       ` Sergio Durigan Junior
2015-03-13 15:06                         ` Oleg Nesterov
2015-03-12 17:56                   ` Andy Lutomirski
2015-03-12 18:28                     ` Oleg Nesterov
2015-03-12 17:48               ` Oleg Nesterov
2015-03-12 17:55                 ` Andy Lutomirski
2015-03-12 18:16                   ` Oleg Nesterov
2015-03-12 18:23                     ` Sergio Durigan Junior
2015-03-12 18:20                 ` Pedro Alves
2015-03-12 18:26                   ` Andy Lutomirski
2015-03-16 19:03                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
2015-03-16 19:20                   ` Andy Lutomirski
2015-03-16 19:46                     ` Oleg Nesterov
2015-03-17 13:45                       ` Oleg Nesterov
2015-03-18  1:45                         ` Andy Lutomirski
2015-03-18 18:08                           ` Oleg Nesterov
2015-03-16 19:40                   ` Pedro Alves
2015-03-12 15:02         ` [PATCH] Improve corefile generation by using /proc/PID/coredump_filter (PR corefile/16902) Oleg Nesterov
2015-03-12 15:46           ` Pedro Alves
2015-03-12 15:57             ` Jan Kratochvil
2015-03-12 16:19               ` Pedro Alves
2015-03-12 16:07             ` Oleg Nesterov
2015-03-12 16:28               ` Pedro Alves
2015-03-12 17:37           ` Sergio Durigan Junior
2015-03-12 21:39 ` [PATCH v2] " Sergio Durigan Junior
2015-03-13 19:34   ` Pedro Alves
2015-03-16 23:53     ` Sergio Durigan Junior
2015-03-18 19:10       ` Pedro Alves
2015-03-18 19:39         ` Sergio Durigan Junior
2015-03-14  9:40   ` Eli Zaretskii
2015-03-16  2:42     ` Sergio Durigan Junior
2015-03-13 19:37 ` [PATCH] " Pedro Alves
2015-03-13 19:48   ` Pedro Alves
