public inbox for gdb-patches@sourceware.org
* [PATCH 0/3] Use target_read_code in skip_prologue
@ 2013-11-29 14:27 Yao Qi
  2013-11-29 14:27 ` [PATCH 2/3] skip_prologue (amd64) Yao Qi
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Yao Qi @ 2013-11-29 14:27 UTC (permalink / raw)
  To: gdb-patches

GDB is able to cache memory accesses requested through target_read_code,
so target_read_code is more efficient than the general target_read_memory.
This patch series uses target_read_code and its variants in the
skip_prologue-related code, which improves performance when doing
'b foo' (where foo is a function) in remote debugging.

Currently, when we set a breakpoint on function f1, GDB fetches code
from f1 to determine the start of the function body (that is, to skip
the prologue).  On x86 this results in many read requests to the
target, as shown below:

(gdb) b f1                                                                                                                                                   
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,e#9c...Packet received: 5589e5b8000000005dc35589e583
Sending packet: $m80483c4,1#68...Packet received: 55                                                                                                         
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                                         
Sending packet: $m80483c5,1#69...Packet received: 89                                                                               
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                                         
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                        
Sending packet: $m80483c6,5#6e...Packet received: e5b8000000                                                                                                 
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c6,5#6e...Packet received: e5b8000000
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                     
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                                         
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                                         
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                                         
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                                         
Sending packet: $m80483c5,1#69...Packet received: 89                                                                                                         
Sending packet: $m80483c6,1#6a...Packet received: e5                                                                                                         
Sending packet: $m80483c7,1#6b...Packet received: b8                                                                                                         
Sending packet: $m80483c7,1#6b...Packet received: b8                                                                                                         
Sending packet: $m80483c7,1#6b...Packet received: b8
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c4,e#9c...Packet received: 5589e5b8000000005dc35589e583
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c6,5#6e...Packet received: e5b8000000
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c6,5#6e...Packet received: e5b8000000
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c5,1#69...Packet received: 89
Sending packet: $m80483c6,1#6a...Packet received: e5
Sending packet: $m80483c7,1#6b...Packet received: b8
Sending packet: $m80483c7,1#6b...Packet received: b8
Sending packet: $m80483c7,1#6b...Packet received: b8
Sending packet: $m80483c4,1#68...Packet received: 55
Sending packet: $m80483c7,1#6b...Packet received: b8

These addresses are identical or close to each other, so caching should
be useful here.  With this patch applied, the number of RSP 'm' packets
is reduced:

(gdb) b f1
Sending packet: $m80483c0,40#97...Packet received: ffd0c9c35589e5b8000000005dc35589e583ec188b450c8945f88b45108945fc8b45148945f08b45188945f48b45f80345088945ecdb45ecdc45f0d97dea0fb7
Sending packet: $m80483c7,1#6b...Packet received: b8
 
Note that once we use target_read_code in breakpoint.c, the last 'm'
packet will disappear too.
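
To see why the packet count drops, here is a minimal, self-contained
model of the read-through caching idea.  It is an illustration only,
not GDB's actual dcache implementation; the block size and the helper
names (target_fetch, cached_read_code) are made up for the example.

/* Toy read-through code cache: one block fetch satisfies many small
   nearby reads, so a prologue analyzer that reads opcode bytes one at
   a time no longer costs one target round trip per byte.  For
   simplicity this toy assumes a read never spans two blocks.  */

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 64

static uint8_t target_mem[256];   /* Stand-in for inferior memory.  */
static int round_trips;           /* Counts 'm'-packet-like requests.  */

/* Pretend to read LEN bytes at ADDR from the target (one packet).  */

static void
target_fetch (uint32_t addr, uint8_t *buf, size_t len)
{
  round_trips++;
  memcpy (buf, target_mem + addr, len);
}

static struct
{
  uint32_t base;
  int valid;
  uint8_t data[BLOCK_SIZE];
} cache;

/* Read LEN bytes at ADDR, going to the target only on a cache miss.  */

static void
cached_read_code (uint32_t addr, uint8_t *buf, size_t len)
{
  if (!cache.valid
      || addr < cache.base
      || addr + len > cache.base + BLOCK_SIZE)
    {
      cache.base = addr & ~(uint32_t) (BLOCK_SIZE - 1);
      target_fetch (cache.base, cache.data, BLOCK_SIZE);
      cache.valid = 1;
    }
  memcpy (buf, cache.data + (addr - cache.base), len);
}

int
main (void)
{
  uint8_t op;
  uint32_t pc;

  /* A prologue analyzer typically reads single opcode bytes at nearby
     addresses, as in the packet log above.  */
  for (pc = 0x40; pc < 0x48; pc++)
    cached_read_code (pc, &op, 1);

  printf ("round trips: %d\n", round_trips);   /* Prints 1, not 8.  */
  return 0;
}

GDB's real cache is invalidated when the inferior is resumed, which is
what the rest of this thread is about.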

The changes are quite specific to each port, because skip_prologue is
port specific.

Patch #3 adds a perf test case.  We ran it on x86-linux and got the
results below.

                          Base           Patched
skip-prologue cpu_time 1  1.77           0.08
skip-prologue cpu_time 2  2.66           0.09
skip-prologue cpu_time 3  2.09           0.1
skip-prologue wall_time 1 3.79094099998  0.138980865479
skip-prologue wall_time 2 5.69234204292  0.132542133331
skip-prologue wall_time 3 4.39887309074  0.157964944839
skip-prologue vmsize 1    16336          16336
skip-prologue vmsize 2    16336          16336
skip-prologue vmsize 3    16336          16336

Yao Qi (3):
  Use target_read_code in skip_prologue (i386)
  skip_prologue (amd64)
  Perf test case: skip-prologue

 gdb/amd64-tdep.c                         |   10 ++--
 gdb/corefile.c                           |   32 +++++++++++++
 gdb/gdbcore.h                            |   17 +++++++
 gdb/i386-tdep.c                          |   71 +++++++++++++++---------------
 gdb/testsuite/gdb.perf/skip-prologue.c   |   48 ++++++++++++++++++++
 gdb/testsuite/gdb.perf/skip-prologue.exp |   59 +++++++++++++++++++++++++
 gdb/testsuite/gdb.perf/skip-prologue.py  |   39 ++++++++++++++++
 7 files changed, 236 insertions(+), 40 deletions(-)
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.c
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.exp
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.py

-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 2/3] skip_prologue (amd64)
  2013-11-29 14:27 [PATCH 0/3] Use target_read_code in skip_prologue Yao Qi
@ 2013-11-29 14:27 ` Yao Qi
  2013-11-29 14:38   ` Mark Kettenis
  2013-11-29 14:36 ` [PATCH 1/3] Use target_read_code in skip_prologue (i386) Yao Qi
  2013-11-29 14:38 ` [PATCH 3/3] Perf test case: skip-prologue Yao Qi
  2 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-11-29 14:27 UTC (permalink / raw)
  To: gdb-patches

gdb:

2013-11-29  Yao Qi  <yao@codesourcery.com>

	* amd64-tdep.c (amd64_analyze_stack_align): Call
	target_read_code instead of target_read_memory.
	(amd64_analyze_prologue): Call read_code_unsigned_integer
	instead of read_memory_unsigned_integer.  Call read_code
	instead of read_memory.
	(amd64_skip_xmm_prologue): Likewise.
---
 gdb/amd64-tdep.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index 1bb72ca..19968fc 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -1761,7 +1761,7 @@ amd64_analyze_stack_align (CORE_ADDR pc, CORE_ADDR current_pc,
   int reg, r;
   int offset, offset_and;
 
-  if (target_read_memory (pc, buf, sizeof buf))
+  if (target_read_code (pc, buf, sizeof buf))
     return pc;
 
   /* Check caller-saved saved register.  The first instruction has
@@ -2104,7 +2104,7 @@ amd64_analyze_prologue (struct gdbarch *gdbarch,
   else
     pc = amd64_analyze_stack_align (pc, current_pc, cache);
 
-  op = read_memory_unsigned_integer (pc, 1, byte_order);
+  op = read_code_unsigned_integer (pc, 1, byte_order);
 
   if (op == 0x55)		/* pushq %rbp */
     {
@@ -2117,7 +2117,7 @@ amd64_analyze_prologue (struct gdbarch *gdbarch,
       if (current_pc <= pc + 1)
         return current_pc;
 
-      read_memory (pc + 1, buf, 3);
+      read_code (pc + 1, buf, 3);
 
       /* Check for `movq %rsp, %rbp'.  */
       if (memcmp (buf, mov_rsp_rbp_1, 3) == 0
@@ -2185,7 +2185,7 @@ amd64_skip_xmm_prologue (CORE_ADDR pc, CORE_ADDR start_pc)
     return pc;
 
   /* START_PC can be from overlayed memory, ignored here.  */
-  if (target_read_memory (next_sal.pc - 4, buf, sizeof (buf)) != 0)
+  if (target_read_code (next_sal.pc - 4, buf, sizeof (buf)) != 0)
     return pc;
 
   /* test %al,%al */
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 1/3] Use target_read_code in skip_prologue (i386)
  2013-11-29 14:27 [PATCH 0/3] Use target_read_code in skip_prologue Yao Qi
  2013-11-29 14:27 ` [PATCH 2/3] skip_prologue (amd64) Yao Qi
@ 2013-11-29 14:36 ` Yao Qi
  2013-11-30 11:43   ` Pedro Alves
  2013-11-29 14:38 ` [PATCH 3/3] Perf test case: skip-prologue Yao Qi
  2 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-11-29 14:36 UTC (permalink / raw)
  To: gdb-patches

This patch uses target_read_code and its variants to read target
memory in the functions related to i386_skip_prologue.

gdb:

2013-11-29  Yao Qi  <yao@codesourcery.com>

	* corefile.c (read_code): New function.
	(read_code_integer): New function.
	(read_code_unsigned_integer): New function.
	* gdbcore.h (read_code): Declare
	(read_code_integer): Declare.
	(read_code_unsigned_integer): Declare.
	* i386-tdep.c (i386_follow_jump): Call target_read_code instead
	of target_read_memory.  Call read_code_unsigned_integer instead
	of read_memory_unsigned_integer.
	(i386_analyze_struct_return): Likewise.
	(i386_skip_probe): Likewise.
	(i386_analyze_stack_align): Likewise.
	(i386_match_pattern): Likewise.
	(i386_skip_noop): Likewise.
	(i386_analyze_frame_setup): Likewise.
	(i386_analyze_register_saves): Likewise.
	(i386_skip_prologue): Likewise.
	(i386_skip_main_prologue): Likewise.
	(i386_frame_cache_1): Likewise.
---
 gdb/corefile.c  |   32 +++++++++++++++++++++++++
 gdb/gdbcore.h   |   17 +++++++++++++
 gdb/i386-tdep.c |   69 ++++++++++++++++++++++++++++---------------------------
 3 files changed, 84 insertions(+), 34 deletions(-)

diff --git a/gdb/corefile.c b/gdb/corefile.c
index 878ab3b..d821fdd 100644
--- a/gdb/corefile.c
+++ b/gdb/corefile.c
@@ -276,6 +276,18 @@ read_stack (CORE_ADDR memaddr, gdb_byte *myaddr, ssize_t len)
     memory_error (status, memaddr);
 }
 
+/* Same as target_read_code, but report an error if can't read.  */
+
+void
+read_code (CORE_ADDR memaddr, gdb_byte *myaddr, ssize_t len)
+{
+  int status;
+
+  status = target_read_code (memaddr, myaddr, len);
+  if (status != 0)
+    memory_error (status, memaddr);
+}
+
 /* Argument / return result struct for use with
    do_captured_read_memory_integer().  MEMADDR and LEN are filled in
    by gdb_read_memory_integer().  RESULT is the contents that were
@@ -354,6 +366,26 @@ read_memory_unsigned_integer (CORE_ADDR memaddr, int len,
   return extract_unsigned_integer (buf, len, byte_order);
 }
 
+LONGEST
+read_code_integer (CORE_ADDR memaddr, int len,
+		   enum bfd_endian byte_order)
+{
+  gdb_byte buf[sizeof (LONGEST)];
+
+  read_code (memaddr, buf, len);
+  return extract_signed_integer (buf, len, byte_order);
+}
+
+ULONGEST
+read_code_unsigned_integer (CORE_ADDR memaddr, int len,
+			    enum bfd_endian byte_order)
+{
+  gdb_byte buf[sizeof (ULONGEST)];
+
+  read_code (memaddr, buf, len);
+  return extract_unsigned_integer (buf, len, byte_order);
+}
+
 void
 read_memory_string (CORE_ADDR memaddr, char *buffer, int max_len)
 {
diff --git a/gdb/gdbcore.h b/gdb/gdbcore.h
index 41bcf1f..8e0847b 100644
--- a/gdb/gdbcore.h
+++ b/gdb/gdbcore.h
@@ -57,6 +57,10 @@ extern void read_memory (CORE_ADDR memaddr, gdb_byte *myaddr, ssize_t len);
 
 extern void read_stack (CORE_ADDR memaddr, gdb_byte *myaddr, ssize_t len);
 
+/* Like target_read_code, but report an error if can't read.  */
+
+extern void read_code (CORE_ADDR memaddr, gdb_byte *myaddr, ssize_t len);
+
 /* Read an integer from debugged memory, given address and number of
    bytes.  */
 
@@ -73,6 +77,19 @@ extern ULONGEST read_memory_unsigned_integer (CORE_ADDR memaddr,
 					      int len,
 					      enum bfd_endian byte_order);
 
+/* Read an integer from debugged code memory, given address and
+   number of bytes.  */
+
+extern LONGEST read_code_integer (CORE_ADDR memaddr, int len,
+				  enum bfd_endian byte_order);
+
+/* Read an unsigned integer from debugged code memory, given address
+   and number of bytes.  */
+
+extern ULONGEST read_code_unsigned_integer (CORE_ADDR memaddr,
+					    int len,
+					    enum bfd_endian byte_order);
+
 /* Read a null-terminated string from the debuggee's memory, given
    address, a buffer into which to place the string, and the maximum
    available space.  */
diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index 597d672..a1a4453 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -931,13 +931,14 @@ i386_follow_jump (struct gdbarch *gdbarch, CORE_ADDR pc)
   long delta = 0;
   int data16 = 0;
 
-  if (target_read_memory (pc, &op, 1))
+  if (target_read_code (pc, &op, 1))
     return pc;
 
   if (op == 0x66)
     {
       data16 = 1;
-      op = read_memory_unsigned_integer (pc + 1, 1, byte_order);
+
+      op = read_code_unsigned_integer (pc + 1, 1, byte_order);
     }
 
   switch (op)
@@ -999,13 +1000,13 @@ i386_analyze_struct_return (CORE_ADDR pc, CORE_ADDR current_pc,
   if (current_pc <= pc)
     return pc;
 
-  if (target_read_memory (pc, &op, 1))
+  if (target_read_code (pc, &op, 1))
     return pc;
 
   if (op != 0x58)		/* popl %eax */
     return pc;
 
-  if (target_read_memory (pc + 1, buf, 4))
+  if (target_read_code (pc + 1, buf, 4))
     return pc;
 
   if (memcmp (buf, proto1, 3) != 0 && memcmp (buf, proto2, 4) != 0)
@@ -1046,7 +1047,7 @@ i386_skip_probe (CORE_ADDR pc)
   gdb_byte buf[8];
   gdb_byte op;
 
-  if (target_read_memory (pc, &op, 1))
+  if (target_read_code (pc, &op, 1))
     return pc;
 
   if (op == 0x68 || op == 0x6a)
@@ -1116,7 +1117,7 @@ i386_analyze_stack_align (CORE_ADDR pc, CORE_ADDR current_pc,
     I386_EDI_REGNUM		/* %edi */
   };
 
-  if (target_read_memory (pc, buf, sizeof buf))
+  if (target_read_code (pc, buf, sizeof buf))
     return pc;
 
   /* Check caller-saved saved register.  The first instruction has
@@ -1205,7 +1206,7 @@ i386_match_pattern (CORE_ADDR pc, struct i386_insn pattern)
 {
   gdb_byte op;
 
-  if (target_read_memory (pc, &op, 1))
+  if (target_read_code (pc, &op, 1))
     return 0;
 
   if ((op & pattern.mask[0]) == pattern.insn[0])
@@ -1217,7 +1218,7 @@ i386_match_pattern (CORE_ADDR pc, struct i386_insn pattern)
       gdb_assert (pattern.len > 1);
       gdb_assert (pattern.len <= I386_MAX_MATCHED_INSN_LEN);
 
-      if (target_read_memory (pc + 1, buf, pattern.len - 1))
+      if (target_read_code (pc + 1, buf, pattern.len - 1))
 	return 0;
 
       for (i = 1; i < pattern.len; i++)
@@ -1347,7 +1348,7 @@ i386_skip_noop (CORE_ADDR pc)
   gdb_byte op;
   int check = 1;
 
-  if (target_read_memory (pc, &op, 1))
+  if (target_read_code (pc, &op, 1))
     return pc;
 
   while (check) 
@@ -1357,7 +1358,7 @@ i386_skip_noop (CORE_ADDR pc)
       if (op == 0x90) 
 	{
 	  pc += 1;
-	  if (target_read_memory (pc, &op, 1))
+	  if (target_read_code (pc, &op, 1))
 	    return pc;
 	  check = 1;
 	}
@@ -1374,13 +1375,13 @@ i386_skip_noop (CORE_ADDR pc)
 
       else if (op == 0x8b)
 	{
-	  if (target_read_memory (pc + 1, &op, 1))
+	  if (target_read_code (pc + 1, &op, 1))
 	    return pc;
 
 	  if (op == 0xff)
 	    {
 	      pc += 2;
-	      if (target_read_memory (pc, &op, 1))
+	      if (target_read_code (pc, &op, 1))
 		return pc;
 
 	      check = 1;
@@ -1408,7 +1409,7 @@ i386_analyze_frame_setup (struct gdbarch *gdbarch,
   if (limit <= pc)
     return limit;
 
-  if (target_read_memory (pc, &op, 1))
+  if (target_read_code (pc, &op, 1))
     return pc;
 
   if (op == 0x55)		/* pushl %ebp */
@@ -1444,7 +1445,7 @@ i386_analyze_frame_setup (struct gdbarch *gdbarch,
       if (limit <= pc + skip)
 	return limit;
 
-      if (target_read_memory (pc + skip, &op, 1))
+      if (target_read_code (pc + skip, &op, 1))
 	return pc + skip;
 
       /* The i386 prologue looks like
@@ -1465,19 +1466,19 @@ i386_analyze_frame_setup (struct gdbarch *gdbarch,
 	{
 	  /* Check for `movl %esp, %ebp' -- can be written in two ways.  */
 	case 0x8b:
-	  if (read_memory_unsigned_integer (pc + skip + 1, 1, byte_order)
+	  if (read_code_unsigned_integer (pc + skip + 1, 1, byte_order)
 	      != 0xec)
 	    return pc;
 	  pc += (skip + 2);
 	  break;
 	case 0x89:
-	  if (read_memory_unsigned_integer (pc + skip + 1, 1, byte_order)
+	  if (read_code_unsigned_integer (pc + skip + 1, 1, byte_order)
 	      != 0xe5)
 	    return pc;
 	  pc += (skip + 2);
 	  break;
 	case 0x8d: /* Check for 'lea (%ebp), %ebp'.  */
-	  if (read_memory_unsigned_integer (pc + skip + 1, 2, byte_order)
+	  if (read_code_unsigned_integer (pc + skip + 1, 2, byte_order)
 	      != 0x242c)
 	    return pc;
 	  pc += (skip + 3);
@@ -1504,38 +1505,38 @@ i386_analyze_frame_setup (struct gdbarch *gdbarch,
 
 	 NOTE: You can't subtract a 16-bit immediate from a 32-bit
 	 reg, so we don't have to worry about a data16 prefix.  */
-      if (target_read_memory (pc, &op, 1))
+      if (target_read_code (pc, &op, 1))
 	return pc;
       if (op == 0x83)
 	{
 	  /* `subl' with 8-bit immediate.  */
-	  if (read_memory_unsigned_integer (pc + 1, 1, byte_order) != 0xec)
+	  if (read_code_unsigned_integer (pc + 1, 1, byte_order) != 0xec)
 	    /* Some instruction starting with 0x83 other than `subl'.  */
 	    return pc;
 
 	  /* `subl' with signed 8-bit immediate (though it wouldn't
 	     make sense to be negative).  */
-	  cache->locals = read_memory_integer (pc + 2, 1, byte_order);
+	  cache->locals = read_code_integer (pc + 2, 1, byte_order);
 	  return pc + 3;
 	}
       else if (op == 0x81)
 	{
 	  /* Maybe it is `subl' with a 32-bit immediate.  */
-	  if (read_memory_unsigned_integer (pc + 1, 1, byte_order) != 0xec)
+	  if (read_code_unsigned_integer (pc + 1, 1, byte_order) != 0xec)
 	    /* Some instruction starting with 0x81 other than `subl'.  */
 	    return pc;
 
 	  /* It is `subl' with a 32-bit immediate.  */
-	  cache->locals = read_memory_integer (pc + 2, 4, byte_order);
+	  cache->locals = read_code_integer (pc + 2, 4, byte_order);
 	  return pc + 6;
 	}
       else if (op == 0x8d)
 	{
 	  /* The ModR/M byte is 0x64.  */
-	  if (read_memory_unsigned_integer (pc + 1, 1, byte_order) != 0x64)
+	  if (read_code_unsigned_integer (pc + 1, 1, byte_order) != 0x64)
 	    return pc;
 	  /* 'lea' with 8-bit displacement.  */
-	  cache->locals = -1 * read_memory_integer (pc + 3, 1, byte_order);
+	  cache->locals = -1 * read_code_integer (pc + 3, 1, byte_order);
 	  return pc + 4;
 	}
       else
@@ -1546,7 +1547,7 @@ i386_analyze_frame_setup (struct gdbarch *gdbarch,
     }
   else if (op == 0xc8)		/* enter */
     {
-      cache->locals = read_memory_unsigned_integer (pc + 1, 2, byte_order);
+      cache->locals = read_code_unsigned_integer (pc + 1, 2, byte_order);
       return pc + 4;
     }
 
@@ -1570,7 +1571,7 @@ i386_analyze_register_saves (CORE_ADDR pc, CORE_ADDR current_pc,
     offset -= cache->locals;
   for (i = 0; i < 8 && pc < current_pc; i++)
     {
-      if (target_read_memory (pc, &op, 1))
+      if (target_read_code (pc, &op, 1))
 	return pc;
       if (op < 0x50 || op > 0x57)
 	break;
@@ -1680,7 +1681,7 @@ i386_skip_prologue (struct gdbarch *gdbarch, CORE_ADDR start_pc)
 
   for (i = 0; i < 6; i++)
     {
-      if (target_read_memory (pc + i, &op, 1))
+      if (target_read_code (pc + i, &op, 1))
 	return pc;
 
       if (pic_pat[i] != op)
@@ -1690,12 +1691,12 @@ i386_skip_prologue (struct gdbarch *gdbarch, CORE_ADDR start_pc)
     {
       int delta = 6;
 
-      if (target_read_memory (pc + delta, &op, 1))
+      if (target_read_code (pc + delta, &op, 1))
 	return pc;
 
       if (op == 0x89)		/* movl %ebx, x(%ebp) */
 	{
-	  op = read_memory_unsigned_integer (pc + delta + 1, 1, byte_order);
+	  op = read_code_unsigned_integer (pc + delta + 1, 1, byte_order);
 
 	  if (op == 0x5d)	/* One byte offset from %ebp.  */
 	    delta += 3;
@@ -1704,13 +1705,13 @@ i386_skip_prologue (struct gdbarch *gdbarch, CORE_ADDR start_pc)
 	  else			/* Unexpected instruction.  */
 	    delta = 0;
 
-          if (target_read_memory (pc + delta, &op, 1))
+          if (target_read_code (pc + delta, &op, 1))
 	    return pc;
 	}
 
       /* addl y,%ebx */
       if (delta > 0 && op == 0x81
-	  && read_memory_unsigned_integer (pc + delta + 1, 1, byte_order)
+	  && read_code_unsigned_integer (pc + delta + 1, 1, byte_order)
 	     == 0xc3)
 	{
 	  pc += delta + 6;
@@ -1735,13 +1736,13 @@ i386_skip_main_prologue (struct gdbarch *gdbarch, CORE_ADDR pc)
   enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
   gdb_byte op;
 
-  if (target_read_memory (pc, &op, 1))
+  if (target_read_code (pc, &op, 1))
     return pc;
   if (op == 0xe8)
     {
       gdb_byte buf[4];
 
-      if (target_read_memory (pc + 1, buf, sizeof buf) == 0)
+      if (target_read_code (pc + 1, buf, sizeof buf) == 0)
  	{
 	  /* Make sure address is computed correctly as a 32bit
 	     integer even if CORE_ADDR is 64 bit wide.  */
@@ -1834,7 +1835,7 @@ i386_frame_cache_1 (struct frame_info *this_frame,
 	  cache->saved_regs[I386_EIP_REGNUM] -= cache->base;
 	}
       else if (cache->pc != 0
-	       || target_read_memory (get_frame_pc (this_frame), buf, 1))
+	       || target_read_code (get_frame_pc (this_frame), buf, 1))
 	{
 	  /* We're in a known function, but did not find a frame
 	     setup.  Assume that the function does not use %ebp.
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 3/3] Perf test case: skip-prologue
  2013-11-29 14:27 [PATCH 0/3] Use target_read_code in skip_prologue Yao Qi
  2013-11-29 14:27 ` [PATCH 2/3] skip_prologue (amd64) Yao Qi
  2013-11-29 14:36 ` [PATCH 1/3] Use target_read_code in skip_prologue (i386) Yao Qi
@ 2013-11-29 14:38 ` Yao Qi
  2013-12-03  7:34   ` Yao Qi
  2 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-11-29 14:38 UTC (permalink / raw)
  To: gdb-patches

This patch adds a perf test case on skip-prologue by inserting
breakpoints on two functions many times, in order to exercise
skip-prologue.

gdb/testsuite:

2013-11-29  Yao Qi  <yao@codesourcery.com>

	* gdb.perf/skip-prologue.c: New.
	* gdb.perf/skip-prologue.exp: New.
	* gdb.perf/skip-prologue.py: New.
---
 gdb/testsuite/gdb.perf/skip-prologue.c   |   48 ++++++++++++++++++++++++
 gdb/testsuite/gdb.perf/skip-prologue.exp |   59 ++++++++++++++++++++++++++++++
 gdb/testsuite/gdb.perf/skip-prologue.py  |   39 ++++++++++++++++++++
 3 files changed, 146 insertions(+), 0 deletions(-)
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.c
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.exp
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.py

diff --git a/gdb/testsuite/gdb.perf/skip-prologue.c b/gdb/testsuite/gdb.perf/skip-prologue.c
new file mode 100644
index 0000000..a195f55
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.c
@@ -0,0 +1,48 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright (C) 2013 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* GDB will analyze the prologue of functions f1 and f2.  */
+
+struct foo
+{
+  int a[32];
+};
+
+static int
+f1 (void)
+{
+  return 0;
+}
+
+static double
+f2 (int a, long long b, double c, struct foo f)
+{
+  f.a[0] = a + (int) b + c;
+
+  return c + 0.2;
+}
+
+int
+main (void)
+{
+  struct foo f;
+
+  f1 ();
+  f2 (0, 0, 0.1, f);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.perf/skip-prologue.exp b/gdb/testsuite/gdb.perf/skip-prologue.exp
new file mode 100644
index 0000000..6a4c82c
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.exp
@@ -0,0 +1,59 @@
+# Copyright (C) 2013 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test case is to test the performance of GDB when it analyzes the
+# function prologue.
+
+load_lib perftest.exp
+
+if [skip_perf_tests] {
+    return 0
+}
+
+standard_testfile .c
+set executable $testfile
+set expfile $testfile.exp
+
+PerfTest::assemble {
+    global srcdir subdir srcfile binfile
+
+    if { [gdb_compile "$srcdir/$subdir/$srcfile" ${binfile} executable {debug}] != "" } {
+	return -1
+    }
+
+    return 0
+} {
+    global binfile
+    global gdb_prompt
+
+    clean_restart $binfile
+
+    if ![runto_main] {
+	fail "Can't run to main"
+	return -1
+    }
+} {
+    set test "run"
+    gdb_test_multiple "python SkipPrologue\(\).run()" $test {
+	-re "Breakpoint $decimal at \[^\n\]*\n" {
+	    # GDB prints some messages on breakpoint creation.
+	    # Consume the output to avoid internal buffer full.
+	    exp_continue
+	}
+	-re ".*$gdb_prompt $" {
+	    pass $test
+	}
+    }
+}
diff --git a/gdb/testsuite/gdb.perf/skip-prologue.py b/gdb/testsuite/gdb.perf/skip-prologue.py
new file mode 100644
index 0000000..3af1260
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.py
@@ -0,0 +1,39 @@
+# Copyright (C) 2013 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test case is to test the speed of GDB when it is analyzing the
+# function prologue.
+
+from perftest import perftest
+
+class SkipPrologue(perftest.TestCaseWithBasicMeasurements):
+    def __init__(self):
+        super(SkipPrologue, self).__init__("skip-prologue")
+
+    def _test(self):
+        for i in range(1, 500):
+            # Insert breakpoints on function f1 and f2.
+            bp1 = gdb.Breakpoint("f1")
+            bp2 = gdb.Breakpoint("f2")
+            # Remove them.
+            bp1.delete()
+            bp2.delete()
+
+    def warm_up(self):
+        self._test()
+
+    def execute_test(self):
+        for i in range(1, 4):
+            self.measure.measure(self._test, i)
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-11-29 14:27 ` [PATCH 2/3] skip_prologue (amd64) Yao Qi
@ 2013-11-29 14:38   ` Mark Kettenis
  2013-11-29 18:55     ` Mark Kettenis
  0 siblings, 1 reply; 39+ messages in thread
From: Mark Kettenis @ 2013-11-29 14:38 UTC (permalink / raw)
  To: yao; +Cc: gdb-patches

> From: Yao Qi <yao@codesourcery.com>
> Date: Fri, 29 Nov 2013 22:24:10 +0800
> 
> gdb:
> 
> 2013-11-29  Yao Qi  <yao@codesourcery.com>
> 
> 	* amd64-tdep.c (amd64_analyze_stack_align): Call
> 	target_read_code instead of target_read_memory.
> 	(amd64_analyze_prologue): Call read_code_unsigned_integer
> 	instead of read_memory_unsigned_integer.  Call read_code
> 	instead of read_memory.
> 	(amd64_skip_xmm_prologue): Likewise.

Makes sense to me.

> ---
>  gdb/amd64-tdep.c |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-11-29 14:38   ` Mark Kettenis
@ 2013-11-29 18:55     ` Mark Kettenis
  2013-11-30  3:40       ` Yao Qi
  0 siblings, 1 reply; 39+ messages in thread
From: Mark Kettenis @ 2013-11-29 18:55 UTC (permalink / raw)
  To: mark.kettenis; +Cc: yao, gdb-patches

> Date: Fri, 29 Nov 2013 15:36:36 +0100 (CET)
> From: Mark Kettenis <mark.kettenis@xs4all.nl>
> 
> > From: Yao Qi <yao@codesourcery.com>
> > Date: Fri, 29 Nov 2013 22:24:10 +0800
> > 
> > gdb:
> > 
> > 2013-11-29  Yao Qi  <yao@codesourcery.com>
> > 
> > 	* amd64-tdep.c (amd64_analyze_stack_align): Call
> > 	target_read_code instead of target_read_memory.
> > 	(amd64_analyze_prologue): Call read_code_unsigned_integer
> > 	instead of read_memory_unsigned_integer.  Call read_code
> > 	instead of read_memory.
> > 	(amd64_skip_xmm_prologue): Likewise.
> 
> Makes sense to me.

But only if the cache is properly invalidated when resuming
(running/stepping).  Is it?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-11-29 18:55     ` Mark Kettenis
@ 2013-11-30  3:40       ` Yao Qi
  2013-11-30 12:01         ` Pedro Alves
  0 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-11-30  3:40 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

On 11/30/2013 12:05 AM, Mark Kettenis wrote:
> But only if the cache is properly invalidated when resuming
> (running/stepping).  Is it?

Yes, cache is invalidated when the inferior is to be resumed.  See
target.c:target_resume.

void
target_resume (ptid_t ptid, int step, enum gdb_signal signal)
{
   struct target_ops *t;

   target_dcache_invalidate ();
   ....

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/3] Use target_read_code in skip_prologue (i386)
  2013-11-29 14:36 ` [PATCH 1/3] Use target_read_code in skip_prologue (i386) Yao Qi
@ 2013-11-30 11:43   ` Pedro Alves
  0 siblings, 0 replies; 39+ messages in thread
From: Pedro Alves @ 2013-11-30 11:43 UTC (permalink / raw)
  To: Yao Qi; +Cc: gdb-patches

On 11/29/2013 02:24 PM, Yao Qi wrote:
> This patch uses target_read_code and its variants to read target
> memory in the functions related to i386_skip_prologue.

(IMO, it'd be nice if relevant part of the '0/3' cover letter
ended up in the commit log.)

> 2013-11-29  Yao Qi  <yao@codesourcery.com>

Looks good to me.  Minor comments below.

> 
> 	* corefile.c (read_code): New function.
> 	(read_code_integer): New function.
> 	(read_code_unsigned_integer): New function.
> 	* gdbcore.h (read_code): Declare

Missing period.

> 	(read_code_integer): Declare.

...

> diff --git a/gdb/gdbcore.h b/gdb/gdbcore.h
> index 41bcf1f..8e0847b 100644
> --- a/gdb/gdbcore.h
> +++ b/gdb/gdbcore.h
> @@ -57,6 +57,10 @@ extern void read_memory (CORE_ADDR memaddr, gdb_byte *myaddr, ssize_t len);
>  
>  extern void read_stack (CORE_ADDR memaddr, gdb_byte *myaddr, ssize_t len);
>  
> +/* Like target_read_code, but report an error if can't read.  */
> +
> +extern void read_code (CORE_ADDR memaddr, gdb_byte *myaddr, ssize_t len);
> +
>  /* Read an integer from debugged memory, given address and number of
>     bytes.  */
>  
> @@ -73,6 +77,19 @@ extern ULONGEST read_memory_unsigned_integer (CORE_ADDR memaddr,
>  					      int len,
>  					      enum bfd_endian byte_order);
>  
> +/* Read an integer from debugged code memory, given address and
> +   number of bytes.  */

/* Read an integer from debugged code memory, given address,
   number of bytes, and byte order for code.  */

> +
> +extern LONGEST read_code_integer (CORE_ADDR memaddr, int len,
> +				  enum bfd_endian byte_order);
> +
> +/* Read an unsigned integer from debugged code memory, given address
> +   and number of bytes.  */

Likewise.

> +
> +extern ULONGEST read_code_unsigned_integer (CORE_ADDR memaddr,
> +					    int len,
> +					    enum bfd_endian byte_order);
> +

Thanks,
-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-11-30  3:40       ` Yao Qi
@ 2013-11-30 12:01         ` Pedro Alves
  2013-12-02  7:34           ` Yao Qi
  2013-12-04 17:54           ` Doug Evans
  0 siblings, 2 replies; 39+ messages in thread
From: Pedro Alves @ 2013-11-30 12:01 UTC (permalink / raw)
  To: Yao Qi; +Cc: Mark Kettenis, gdb-patches

On 11/30/2013 02:33 AM, Yao Qi wrote:
> On 11/30/2013 12:05 AM, Mark Kettenis wrote:
>> But only if the cache is properly invalidated when resuming
>> (running/stepping).  Is it?
> 
> Yes, cache is invalidated when the inferior is to be resumed.  See
> target.c:target_resume.
> 
> void
> target_resume (ptid_t ptid, int step, enum gdb_signal signal)
> {
>    struct target_ops *t;
> 
>    target_dcache_invalidate ();
>    ....
> 

We also invalidate the cache just before running a command, while
threads may be running, in prepare_execute_command:

struct cleanup *
prepare_execute_command (void)
{
...
  /* With multiple threads running while the one we're examining is
     stopped, the dcache can get stale without us being able to detect
     it.  For the duration of the command, though, use the dcache to
     help things like backtrace.  */
  if (non_stop)
    target_dcache_invalidate ();



It seems like we can "dangerously" hit stale cache (probably most
visible with non-stop mode) between target events:

 #1 - all threads set running, dcache is invalidated

 #2 - thread 1 stops.  As we handle the event, we read code, and cache it.

 #3 - other threads continue running.  some thread jits something, or
   changes code that was cached.

 #4 - thread 2 stops.  As we handle the event, we read code, hitting
   stale cache.


I'm thinking we might need to flush the dcache before handling each
event, like we already invalidate the overlay cache (see
"overlay_cache_invalid = 1" in infrun.c) ?

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-11-30 12:01         ` Pedro Alves
@ 2013-12-02  7:34           ` Yao Qi
  2013-12-03 18:28             ` Pedro Alves
                               ` (3 more replies)
  2013-12-04 17:54           ` Doug Evans
  1 sibling, 4 replies; 39+ messages in thread
From: Yao Qi @ 2013-12-02  7:34 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 11/30/2013 06:11 PM, Pedro Alves wrote:
> It seems like we can "dangerously" hit stale cache (probably most
> visible with non-stop mode) between target events:
>
>   #1 - all threads set running, dcache is invalidated
>
>   #2 - thread 1 stops.  As we handle the event, we read code, and cache it.
>
>   #3 - other threads continue running.  some thread jits something, or
>     changes code that was cached.
>
>   #4 - thread 2 stops.  As we handle the event, we read code, hitting
>     stale cache.
>

Yeah, that is a valid case to me.

>
> I'm thinking we might need to flush the dcache before handling each
> event, like we already invalidate the overlay cache (see
> "overlay_cache_invalid = 1" in infrun.c) ?

I don't know why the overlay cache is flushed on each event.  Doing some
archaeology doesn't give me any clue.  I doubt that the overlay may
change between any two events.  Flushing the overlay cache does not look
necessary to me, at least when the overlay events breakpoint is enabled.

GDB target cache contains both code cache and stack cache.  GDB can
know the JIT event by means of jit event breakpoint, but GDB can't know 
whether a thread's stack is modified (by other threads).  So we have
to flush target cache before handling *every* event :-/  I'll send a
follow-up patch.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] Perf test case: skip-prologue
  2013-11-29 14:38 ` [PATCH 3/3] Perf test case: skip-prologue Yao Qi
@ 2013-12-03  7:34   ` Yao Qi
  2013-12-10 12:45     ` Yao Qi
  0 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-12-03  7:34 UTC (permalink / raw)
  To: gdb-patches

On 11/29/2013 10:24 PM, Yao Qi wrote:
> This patch adds a perf test case on skip-prologue by inserting
> breakpoints on two functions many times, in order to exercise
> skip-prologue.
> 
> gdb/testsuite:
> 
> 2013-11-29  Yao Qi<yao@codesourcery.com>
> 
> 	* gdb.perf/skip-prologue.c: New.
> 	* gdb.perf/skip-prologue.exp: New.
> 	* gdb.perf/skip-prologue.py: New.

I updated this test case to allow setting the number of iterations with:

make check-perf RUNTESTFLAGS='skip-prologue.exp SKIP_PROLOGUE_COUNT=500'

-- 
Yao (齐尧)

gdb/testsuite:

2013-12-03  Yao Qi  <yao@codesourcery.com>

	* gdb.perf/skip-prologue.c: New.
	* gdb.perf/skip-prologue.exp: New.
	* gdb.perf/skip-prologue.py: New.
---
 gdb/testsuite/gdb.perf/skip-prologue.c   |   48 +++++++++++++++++++++
 gdb/testsuite/gdb.perf/skip-prologue.exp |   69 ++++++++++++++++++++++++++++++
 gdb/testsuite/gdb.perf/skip-prologue.py  |   40 +++++++++++++++++
 3 files changed, 157 insertions(+), 0 deletions(-)
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.c
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.exp
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.py

diff --git a/gdb/testsuite/gdb.perf/skip-prologue.c b/gdb/testsuite/gdb.perf/skip-prologue.c
new file mode 100644
index 0000000..a195f55
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.c
@@ -0,0 +1,48 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright (C) 2013 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* GDB will analyze the prologue of functions f1 and f2.  */
+
+struct foo
+{
+  int a[32];
+};
+
+static int
+f1 (void)
+{
+  return 0;
+}
+
+static double
+f2 (int a, long long b, double c, struct foo f)
+{
+  f.a[0] = a + (int) b + c;
+
+  return c + 0.2;
+}
+
+int
+main (void)
+{
+  struct foo f;
+
+  f1 ();
+  f2 (0, 0, 0.1, f);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.perf/skip-prologue.exp b/gdb/testsuite/gdb.perf/skip-prologue.exp
new file mode 100644
index 0000000..53ada49
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.exp
@@ -0,0 +1,69 @@
+# Copyright (C) 2013 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test case is to test the performance of GDB when it analyzes the
+# function prologue.
+# There is one parameter in this test:
+# - SKIP_PROLOGUE_COUNT is the number of times that GDB inserts
+#   breakpoints on functions f1 and f2 and removes them.
+
+load_lib perftest.exp
+
+if [skip_perf_tests] {
+    return 0
+}
+
+standard_testfile .c
+set executable $testfile
+set expfile $testfile.exp
+
+# make check-perf RUNTESTFLAGS='skip-prologue.exp SKIP_PROLOGUE_COUNT=500'
+if ![info exists SKIP_PROLOGUE_COUNT] {
+    set SKIP_PROLOGUE_COUNT 500
+}
+
+PerfTest::assemble {
+    global srcdir subdir srcfile binfile
+
+    if { [gdb_compile "$srcdir/$subdir/$srcfile" ${binfile} executable {debug}] != "" } {
+	return -1
+    }
+
+    return 0
+} {
+    global binfile
+    global gdb_prompt
+
+    clean_restart $binfile
+
+    if ![runto_main] {
+	fail "Can't run to main"
+	return -1
+    }
+} {
+    global SKIP_PROLOGUE_COUNT
+
+    set test "run"
+    gdb_test_multiple "python SkipPrologue\($SKIP_PROLOGUE_COUNT\).run()" $test {
+	-re "Breakpoint $decimal at \[^\n\]*\n" {
+	    # GDB prints some messages on breakpoint creation.
+	    # Consume the output to avoid internal buffer full.
+	    exp_continue
+	}
+	-re ".*$gdb_prompt $" {
+	    pass $test
+	}
+    }
+}
diff --git a/gdb/testsuite/gdb.perf/skip-prologue.py b/gdb/testsuite/gdb.perf/skip-prologue.py
new file mode 100644
index 0000000..a1c5e11
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.py
@@ -0,0 +1,40 @@
+# Copyright (C) 2013 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test case is to test the speed of GDB when it is analyzing the
+# function prologue.
+
+from perftest import perftest
+
+class SkipPrologue(perftest.TestCaseWithBasicMeasurements):
+    def __init__(self, count):
+        super(SkipPrologue, self).__init__("skip-prologue")
+        self.count = count
+
+    def _test(self):
+        for i in range(1, self.count):
+            # Insert breakpoints on function f1 and f2.
+            bp1 = gdb.Breakpoint("f1")
+            bp2 = gdb.Breakpoint("f2")
+            # Remove them.
+            bp1.delete()
+            bp2.delete()
+
+    def warm_up(self):
+        self._test()
+
+    def execute_test(self):
+        for i in range(1, 4):
+            self.measure.measure(self._test, i)
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-12-02  7:34           ` Yao Qi
@ 2013-12-03 18:28             ` Pedro Alves
  2013-12-04  2:34             ` Yao Qi
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Pedro Alves @ 2013-12-03 18:28 UTC (permalink / raw)
  To: Yao Qi; +Cc: Mark Kettenis, gdb-patches

On 12/02/2013 07:32 AM, Yao Qi wrote:
> On 11/30/2013 06:11 PM, Pedro Alves wrote:

>> I'm thinking we might need to flush the dcache before handling each
>> event, like we already invalidate the overlay cache (see
>> "overlay_cache_invalid = 1" in infrun.c) ?
> 
> I don't know why the overlay cache is flushed on each event.  Doing some
> archaeology doesn't give me any clue.  I doubt that the overlay may
> change between any two events.  Flushing the overlay cache does not look
> necessary to me, at least when the overlay events breakpoint is enabled.

Even with overlay events breakpoint enabled, I'd think we'd want to
be careful to not hit the previous cache even _while_ handling
the overlay breakpoint event.  Also, with multi-threading, the
overlay breakpoint event may trigger at the same time as some
other event in another thread, and we may end up handling
the other event first.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-12-02  7:34           ` Yao Qi
  2013-12-03 18:28             ` Pedro Alves
@ 2013-12-04  2:34             ` Yao Qi
  2013-12-04 12:08               ` Pedro Alves
  2013-12-04 17:42             ` Doug Evans
  2013-12-04 18:00             ` Doug Evans
  3 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-12-04  2:34 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 12/02/2013 03:32 PM, Yao Qi wrote:
> GDB target cache contains both code cache and stack cache.  GDB can
> know the JIT event by means of jit event breakpoint, but GDB can't know
> whether a thread's stack is modified (by other threads).  So we have
> to flush target cache before handling *every* event :-/  I'll send a
> follow-up patch.

When I finished the patch below, I wondered whether this is the right
way to go.

Nowadays, in non-stop mode, GDB flushes target cache between
each command.  With the patch applied, GDB flushes target cache between
each event.  Then, I realize that GDB flushes cache at any changes (GDB 
commands from user and events from inferior) in non-stop mode, so I 
can't figure out a case that cache can be used.  Do we still need target
cache in non-stop mode?

-- 
Yao (齐尧)

Subject: [PATCH] Flush target dcache in non-stop before handling events

gdb:

2013-12-04  Yao Qi  <yao@codesourcery.com>

	* infrun.c: Include "target-dcache.h".
	(fetch_inferior_event): Call target_dcache_invalidate in
	non-stop.
---
  gdb/infrun.c |   11 +++++++++++
  1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 3b55583..2ed0537 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -60,6 +60,7 @@
  #include "objfiles.h"
  #include "completer.h"
  #include "target-descriptions.h"
+#include "target-dcache.h"

  /* Prototypes for local functions */

@@ -2806,6 +2807,16 @@ fetch_inferior_event (void *client_data)

    overlay_cache_invalid = 1;

+  if (non_stop)
+    {
+      /* In non-stop mode, one thread stops and caches the contents of
+	 stack or code, while other running threads may change the
+	 code (through JIT) or stack.  The target cache can get stale
+	 without us being able to detect it.  Flush target cache
+	 before handling each event.  */
+      target_dcache_invalidate ();
+    }
+
    make_cleanup_restore_integer (&execution_direction);
    execution_direction = target_execution_direction ();

-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-12-04  2:34             ` Yao Qi
@ 2013-12-04 12:08               ` Pedro Alves
  2013-12-04 15:38                 ` Tom Tromey
                                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Pedro Alves @ 2013-12-04 12:08 UTC (permalink / raw)
  To: Yao Qi; +Cc: Mark Kettenis, gdb-patches

On 12/04/2013 02:33 AM, Yao Qi wrote:
> On 12/02/2013 03:32 PM, Yao Qi wrote:
>> > GDB target cache contains both code cache and stack cache.  GDB can
>> > know the JIT event by means of jit event breakpoint, but GDB can't know
>> > whether a thread's stack is modified (by other threads).  So we have
>> > to flush target cache before handling *every* event :-/  I'll send a
>> > follow-up patch.
> When I finished the patch below, I wondered whether this is the right
> way to go.
>
> Nowadays, in non-stop mode, GDB flushes target cache between
> each command.  With the patch applied, GDB flushes target cache between
> each event.  Then, I realize that GDB flushes cache at any changes (GDB
> commands from user and events from inferior) in non-stop mode, so I
> can't figure out a case that cache can be used.  Do we still need target
> cache in non-stop mode?

It can still help for the duration of the command, or for the
duration of the event handling.  GDB might end up reading the
same locations more than once while doing either.  Also, the
overfetching can still help anyway.  E.g., in the prologue
analyzers while handling each event.

Actually "non-stop", vs "all-stop" here isn't the ideal
predicate.  The real predicate is "is any thread running".
"non-stop" is just being currently used in
prepare_execute_command as proxy for that, just because
that was the easiest.

On 12/04/2013 02:33 AM, Yao Qi wrote:

> @@ -2806,6 +2807,16 @@ fetch_inferior_event (void *client_data)
> 
>     overlay_cache_invalid = 1;
> 
> +  if (non_stop)
> +    {
> +      /* In non-stop mode, one thread stops and caches the contents of
> +	 stack or code, while other running threads may change the
> +	 code (through JIT) or stack.  The target cache can get stale
> +	 without us being able to detect it.  Flush target cache
> +	 before handling each event.  */
> +      target_dcache_invalidate ();
> +    }

I don't actually think this should be gated on non-stop.  It
should be unconditional.  I mentioned before that it'd be most
visible with non-stop, but that doesn't imply it's not
visible with all-stop.  If we're seeing or going to wait for
a target event, it's because the target was running,
irrespective of all-stop/non-stop.  I really think we
should invalidate the cache at all places we invalidate the
overlay cache (wait_for_inferior, etc.), not just fetch_inferior_event.

For all-stop, it shouldn't really make a difference to
performance, as we invalidate the cache on resumes anyway,
and in all-stop, there must always be a resume prior to
any stop...

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-12-04 12:08               ` Pedro Alves
@ 2013-12-04 15:38                 ` Tom Tromey
  2013-12-04 18:31                   ` Doug Evans
  2013-12-05 11:31                   ` Pedro Alves
  2013-12-05  1:21                 ` Yao Qi
  2013-12-08  8:01                 ` Yao Qi
  2 siblings, 2 replies; 39+ messages in thread
From: Tom Tromey @ 2013-12-04 15:38 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Yao Qi, Mark Kettenis, gdb-patches

>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:

Pedro> Actually "non-stop", vs "all-stop" here isn't the ideal
Pedro> predicate.  The real predicate is "is any thread running".
Pedro> "non-stop" is just being currently used in
Pedro> prepare_execute_command as proxy for that, just because
Pedro> that was the easiest.

It seemed to me that the predicate must be "is any thread associated
with this particular address space running?" -- but I wanted to ask if
that makes sense, or if that was what you meant.  This idea seems to
open the door to finer-grained cache flushing.
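
A minimal sketch of that finer-grained predicate, using placeholder
thread and address-space types rather than GDB's actual structures:

#include <stddef.h>

struct aspace;

struct thread
{
  struct thread *next;
  struct aspace *aspace;   /* Address space the thread runs in.  */
  int running;             /* Nonzero if the thread is executing.  */
};

/* Return nonzero if any running thread shares address space AS,
   i.e. the case in which cached memory for AS could go stale.  */

static int
any_thread_running_in_aspace (struct thread *threads, struct aspace *as)
{
  struct thread *t;

  for (t = threads; t != NULL; t = t->next)
    if (t->running && t->aspace == as)
      return 1;
  return 0;
}

Flushing (or bypassing) the cache only when this returns true would
avoid invalidating data for address spaces whose threads are all
stopped.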

Tom

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-12-02  7:34           ` Yao Qi
  2013-12-03 18:28             ` Pedro Alves
  2013-12-04  2:34             ` Yao Qi
@ 2013-12-04 17:42             ` Doug Evans
  2013-12-04 18:00             ` Doug Evans
  3 siblings, 0 replies; 39+ messages in thread
From: Doug Evans @ 2013-12-04 17:42 UTC (permalink / raw)
  To: Yao Qi; +Cc: Pedro Alves, Mark Kettenis, gdb-patches

On Sun, Dec 1, 2013 at 11:32 PM, Yao Qi <yao@codesourcery.com> wrote:
> GDB target cache contains both code cache and stack cache.

I hate to keep harping on this, but there's been confusion before and
I want to avoid it in the future.
There is no such thing as the "code cache" or "stack cache".
There is just the target cache that caches target memory.
I can turn off the code cache optimization and stack cache
optimization, and still populate the cache with data from .text or
.data or .stack (so to speak) or whatever.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-11-30 12:01         ` Pedro Alves
  2013-12-02  7:34           ` Yao Qi
@ 2013-12-04 17:54           ` Doug Evans
  2013-12-05  1:39             ` Yao Qi
  2013-12-05 11:47             ` Pedro Alves
  1 sibling, 2 replies; 39+ messages in thread
From: Doug Evans @ 2013-12-04 17:54 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Yao Qi, Mark Kettenis, gdb-patches

On Sat, Nov 30, 2013 at 2:11 AM, Pedro Alves <palves@redhat.com> wrote:
> It seems like we can "dangerously" hit stale cache (probably most
> visible with non-stop mode) between target events:
>
>  #1 - all threads set running, dcache is invalidated
>
>  #2 - thread 1 stops.  As we handle the event, we read code, and cache it.
>
>  #3 - other threads continue running.  some thread jits something, or
>    changes code that was cached.
>
>  #4 - thread 2 stops.  As we handle the event, we read code, hitting
>    stale cache.
>
>
> I'm thinking we might need to flush the dcache before handling each
> event, like we already invalidate the overlay cache (see
> "overlay_cache_invalid = 1" in infrun.c) ?

Hi.  Still catching up on this thread.

The dcache and non-stop are basically incompatible, but we need both.
With any other thread still running the ground can be pulled out from
underneath my view of memory from a stopped thread (even when I'm in
the middle of viewing it - I can imagine a pretty-printer of a complex
object getting really confused for example).
It feels like there's diminishing returns and increasing costs the
farther we go to try to protect the user from this.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-12-02  7:34           ` Yao Qi
                               ` (2 preceding siblings ...)
  2013-12-04 17:42             ` Doug Evans
@ 2013-12-04 18:00             ` Doug Evans
  3 siblings, 0 replies; 39+ messages in thread
From: Doug Evans @ 2013-12-04 18:00 UTC (permalink / raw)
  To: Yao Qi; +Cc: Pedro Alves, Mark Kettenis, gdb-patches

On Sun, Dec 1, 2013 at 11:32 PM, Yao Qi <yao@codesourcery.com> wrote:
> GDB can
> know the JIT event by means of jit event breakpoint,

Assuming there is one.

gdbserver can do some minimal form of dynamic compilation but doesn't
use gdb's jit interface (right?).
Hmmm, even displaced stepping is a minimal form of dynamic compilation.  :-)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prologue (amd64)
  2013-12-04 15:38                 ` Tom Tromey
@ 2013-12-04 18:31                   ` Doug Evans
  2013-12-05 11:31                   ` Pedro Alves
  1 sibling, 0 replies; 39+ messages in thread
From: Doug Evans @ 2013-12-04 18:31 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Pedro Alves, Yao Qi, Mark Kettenis, gdb-patches

On Wed, Dec 4, 2013 at 7:38 AM, Tom Tromey <tromey@redhat.com> wrote:
>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
>
> Pedro> Actually "non-stop", vs "all-stop" here isn't the ideal
> Pedro> predicate.  The real predicate is "is any thread running".
> Pedro> "non-stop" is just being currently used in
> Pedro> prepare_execute_command as proxy for that, just because
> Pedro> that was the easiest.
>
> It seemed to me that the predicate must be "is any thread associated
> with this particular address space running?" -- but I wanted to ask if
> that makes sense, or if that was what you meant.  This idea seems to
> open the door to finer-grained cache flushing.

It would be unfortunate to take an unnecessary perf hit when doing a
backtrace and then browsing the stack, or setting a breakpoint in
non-stop mode when, for example, I'm stopped in some thread for
whatever reason (and then other threads happen to have their own
events).

I think some kind of finer-grained flushing *could* be reasonable.
Implementing it may then make the cache "look like" it is a stack
cache and code cache.  1/2 :-)
[One could do something like record in each line an attribute
describing how/where it came from.]
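
To make the idea concrete, here is a rough sketch of what tagging each
cache line with its origin could look like.  This is illustration
only, not GDB's actual dcache.c; every name in it is made up.

  #include <stdlib.h>

  /* Where the data in a cache line came from.  */
  enum line_origin
  {
    LINE_FROM_CODE,    /* e.g. filled via target_read_code */
    LINE_FROM_STACK,   /* e.g. filled via a stack read */
    LINE_FROM_DATA     /* plain data reads */
  };

  struct tagged_line
  {
    unsigned long addr;        /* target address of the line */
    unsigned char data[64];    /* cached bytes */
    enum line_origin origin;   /* how the line was filled */
    struct tagged_line *next;
  };

  /* Drop only the lines whose origin matches ORIGIN, leaving the rest
     of the cache intact.  */

  static void
  invalidate_lines_by_origin (struct tagged_line **head,
                              enum line_origin origin)
  {
    struct tagged_line **p = head;

    while (*p != NULL)
      {
        if ((*p)->origin == origin)
          {
            struct tagged_line *dead = *p;

            *p = dead->next;
            free (dead);
          }
        else
          p = &(*p)->next;
      }
  }

With something like this, a stop event could flush only the
LINE_FROM_STACK lines while keeping the LINE_FROM_CODE lines warm,
which is roughly the stack-cache/code-cache split described above.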

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-04 12:08               ` Pedro Alves
  2013-12-04 15:38                 ` Tom Tromey
@ 2013-12-05  1:21                 ` Yao Qi
  2013-12-05 12:08                   ` Pedro Alves
  2013-12-08  8:01                 ` Yao Qi
  2 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-12-05  1:21 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 12/04/2013 08:07 PM, Pedro Alves wrote:
> It can still help for the duration of the command, or for the
> duration of the event handling.  GDB might end up reading the
> same locations more than once while doing either.  Also, the
> overfetching can still help anyway.  E.g., in the prologue
> analyzers while handling each event.

While handling an event, if another thread is still running, we still
can't use cache.

>
> Actually "non-stop", vs "all-stop" here isn't the ideal
> predicate.  The real predicate is "is any thread running".
> "non-stop" is just being currently used in
> prepare_execute_command as proxy for that, just because
> that was the easiest.
>

Beside the predicate "is any thread running", another is "no thread is
resumed since last flushing".  Cache should be flushed when either is
true.
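
As a rough sketch of that combined predicate (threads_are_running,
resumed_since_last_flush and maybe_invalidate_dcache are invented
names for illustration, not existing GDB symbols):

  /* Set whenever anything is resumed; cleared when we flush.  */
  static int resumed_since_last_flush;

  extern int threads_are_running (void);        /* assumed helper */
  extern void target_dcache_invalidate (void);  /* from target-dcache.h */

  static void
  maybe_invalidate_dcache (void)
  {
    /* Flush if something is running right now, or if anything was
       resumed since the last flush -- it may have run and already
       stopped again, dirtying memory behind the cache's back.  */
    if (threads_are_running () || resumed_since_last_flush)
      {
        target_dcache_invalidate ();
        resumed_since_last_flush = 0;
      }
  }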

> On 12/04/2013 02:33 AM, Yao Qi wrote:
>
>> >@@ -2806,6 +2807,16 @@ fetch_inferior_event (void *client_data)
>> >
>> >     overlay_cache_invalid = 1;
>> >
>> >+  if (non_stop)
>> >+    {
>> >+      /* In non-stop mode, one thread stops and caches the contents of
>> >+	 stack or code, while other running threads may change the
>> >+	 code (through JIT) or stack.  The target cache can get stale
>> >+	 without us being able to detect it.  Flush target cache
>> >+	 before handling each event.  */
>> >+      target_dcache_invalidate ();
>> >+    }
> I don't actually think this should be gated on non-stop.  It

Right, "non-stop" is not a proper condition here.

> should be unconditional.  I mentioned before that it'd be most
> visible with non-stop, but that doesn't imply it's not
> visible with all-stop.  If we're seeing or going to wait for
> a target event, it's because the target was running,
> irrespective of all-stop/non-stop.  I really think we

That is true, but once we get a target event, the target (or a thread
in the target) may not have stopped.  It is still unsafe to use the cache.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-04 17:54           ` Doug Evans
@ 2013-12-05  1:39             ` Yao Qi
  2013-12-05 11:47             ` Pedro Alves
  1 sibling, 0 replies; 39+ messages in thread
From: Yao Qi @ 2013-12-05  1:39 UTC (permalink / raw)
  To: Doug Evans; +Cc: Pedro Alves, Mark Kettenis, gdb-patches

On 12/05/2013 01:54 AM, Doug Evans wrote:
> The dcache and non-stop are basically incompatible, but we need both.

Agreed.

> With any other thread still running the ground can be pulled out from
> underneath my view of memory from a stopped thread (even when I'm in
> the middle of viewing it - I can imagine a pretty-printer of a complex
> object getting really confused for example).
> It feels like there's diminishing returns and increasing costs the
> farther we go to try to protect the user from this.

That is reasonable.  In the long term, we'll switch to "running
all-stop on top of non-stop"; users shouldn't get stale results just
because they think they are in "all-stop" mode, even though all-stop
is "simulated" by non-stop, which is invisible to users.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-04 15:38                 ` Tom Tromey
  2013-12-04 18:31                   ` Doug Evans
@ 2013-12-05 11:31                   ` Pedro Alves
  1 sibling, 0 replies; 39+ messages in thread
From: Pedro Alves @ 2013-12-05 11:31 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Yao Qi, Mark Kettenis, gdb-patches

On 12/04/2013 03:38 PM, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
> 
> Pedro> Actually "non-stop", vs "all-stop" here isn't the ideal
> Pedro> predicate.  The real predicate is "is any thread running".
> Pedro> "non-stop" is just being currently used in
> Pedro> prepare_execute_command as proxy for that, just because
> Pedro> that was the easiest.
> 
> It seemed to me that the predicate must be "is any thread associated
> with this particular address space running?" -- but I wanted to ask if
> that makes sense, or if that was what you meant.  This idea seems to
> open the door to finer-grained cache flushing.

Yes.  What I'm getting at is that checking whether we're in non-stop mode
doesn't even say whether anything is running or not, only that it
could, and that the real predicate revolves around "threads are
running" -- we can go finer-grained from that, though obviously at
the expense of predicate complexity.

Actually, even with target-async/all-stop, the target can also
be running when we get to prepare_execute_command, so the
check for non_stop isn't just being overzealous, it's
actually wrong.  We can't actually trigger badness currently,
I think, as with remote/async/all-stop, GDB can't issue RSP commands
to read memory off the target while the target is running, due to an
RSP limitation.  And with Linux native/async, you can't read
memory off a running process, due to a backend limitation.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-04 17:54           ` Doug Evans
  2013-12-05  1:39             ` Yao Qi
@ 2013-12-05 11:47             ` Pedro Alves
  1 sibling, 0 replies; 39+ messages in thread
From: Pedro Alves @ 2013-12-05 11:47 UTC (permalink / raw)
  To: Doug Evans; +Cc: Yao Qi, Mark Kettenis, gdb-patches

On 12/04/2013 05:54 PM, Doug Evans wrote:
> 
> The dcache and non-stop are basically incompatible, but we need both.
> With any other thread still running the ground can be pulled out from
> underneath my view of memory from a stopped thread (even when I'm in
> the middle of viewing it - I can imagine a pretty-printer of a complex
> object getting really confused for example).

Right.  My view is that for a single quick operation, it's fine
to rely on the cache.  E.g., a backtrace, or printing a variable
while other threads are running.  GDB can give you skewed results
here with or without the cache.  Between single higher level operations
though, a lot of time could pass between two backtraces, and
the likeliness of the cache getting stale increases.  This
suggests investigating the idea of having the cache have a
time based lifetime, in addition.  Though I suspect that's a
lot of more complexity, for diminishing returns.  It's nice
that the second backtrace is faster than the first, but it's
really time of the first backtrace that we should be
optimizing for, IMO.

> It feels like there's diminishing returns and increasing costs the
> farther we go to try to protect the user from this.

Exactly.

Related, I've considered before that some memory accesses may
need to be atomic with respect to the inferior (e.g., things like reading
dynamic linker structures), and that GDB core will need to know
to stop the target, batch a few reads, and re-resume the target.
And that that could also be extended to a user
knob -- "set want-to-be-sure-values-I'm-reading-aren't-skewed-by-running-threads-so-please-pause-them-for-me-as-needed on/off".
Possibly with a shorter spelling.  :-)
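
A minimal sketch of such an "atomic with respect to the inferior"
read, under the control of that knob.  The helpers
(pause_all_affected_threads, resume_paused_threads,
read_inferior_memory) and the knob variable are hypothetical; nothing
like this exists in GDB today.

  #include <stddef.h>

  /* The proposed user-settable knob.  */
  static int pause_threads_for_coherent_reads;

  extern void pause_all_affected_threads (void);            /* assumed */
  extern void resume_paused_threads (void);                 /* assumed */
  extern int read_inferior_memory (unsigned long addr,
                                   void *buf, size_t len);  /* assumed */

  static int
  coherent_read (unsigned long addr, void *buf, size_t len)
  {
    int ret;

    if (pause_threads_for_coherent_reads)
      pause_all_affected_threads ();

    /* With every potential writer paused, this batch of reads sees
       one consistent snapshot instead of a value torn by concurrent
       writes.  */
    ret = read_inferior_memory (addr, buf, len);

    if (pause_threads_for_coherent_reads)
      resume_paused_threads ();

    return ret;
  }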

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-05  1:21                 ` Yao Qi
@ 2013-12-05 12:08                   ` Pedro Alves
  2013-12-05 14:08                     ` Yao Qi
  0 siblings, 1 reply; 39+ messages in thread
From: Pedro Alves @ 2013-12-05 12:08 UTC (permalink / raw)
  To: Yao Qi; +Cc: Mark Kettenis, gdb-patches

On 12/05/2013 01:19 AM, Yao Qi wrote:
> On 12/04/2013 08:07 PM, Pedro Alves wrote:
>> It can still help for the duration of the command, or for the
>> duration of the event handling.  GDB might end up reading the
>> same locations more than once while doing either.  Also, the
>> overfetching can still help anyway.  E.g., in the prologue
>> analyzers while handling each event.
> 
> While handling an event, if another thread is still running, we still
> can't use cache.

I think we can.  My view here is that handling an event
is a quick and short lived operation.  GDB bursts a few reads
in sequence, and then moves on to the next event.  In that
scenario, you get as much stale results with or without a cache.
IOW, even without the cache, running threads can change memory as
GDB reads it, and so the chances of hitting stale data with or
without a cache are practically the same.  OTOH, distinct target
events (and commands, etc.) can trigger quite apart (time-wise),
and that break the odd balance -- not flushing the cache
between events increases the changes of hitting stale data,
compared to not having a cache.

> Beside the predicate "is any thread running", another is "no thread is
> resumed since last flushing".  Cache should be flushed when either is
> true.

Not sure I understood that.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-05 12:08                   ` Pedro Alves
@ 2013-12-05 14:08                     ` Yao Qi
  2013-12-05 14:37                       ` Pedro Alves
  0 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-12-05 14:08 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 12/05/2013 08:00 PM, Pedro Alves wrote:
> I think we can.  My view here is that handling an event
> is a quick and short lived operation.  GDB bursts a few reads
> in sequence, and then moves on to the next event.  In that
> scenario, you get as much stale results with or without a cache.

I disagree.  Results may be stale with the cache, but without the
cache the results may be different, not stale.  They are different
because they are read at different times, but all of them are valid.
Each is a snapshot of a piece of memory at a certain moment.

> IOW, even without the cache, running threads can change memory as
> GDB reads it, and so the chances of hitting stale data with or
> without a cache are practically the same.  OTOH, distinct target
> events (and commands, etc.) can trigger quite apart (time-wise),
> and that break the odd balance -- not flushing the cache
> between events increases the changes of hitting stale data,

I suspect you meant "chances" instead of "changes".

> compared to not having a cache.

Flushing the cache decreases the likelihood of getting stale data, but
can't completely remove it.  I am fine with using the cache in non-stop
mode, as it helps performance, so we have to compromise.

>
>> >Beside the predicate "is any thread running", another is "no thread is
>> >resumed since last flushing".  Cache should be flushed when either is
>> >true.
> Not sure I understood that.

I meant, even if "none of the threads is running now", we may have to
flush the cache if "they were resumed" (and all stopped now).

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-05 14:08                     ` Yao Qi
@ 2013-12-05 14:37                       ` Pedro Alves
  0 siblings, 0 replies; 39+ messages in thread
From: Pedro Alves @ 2013-12-05 14:37 UTC (permalink / raw)
  To: Yao Qi; +Cc: Pedro Alves, Mark Kettenis, gdb-patches

On 12/05/2013 02:07 PM, Yao Qi wrote:
> On 12/05/2013 08:00 PM, Pedro Alves wrote:
>> I think we can.  My view here is that handling an event
>> is a quick and short lived operation.  GDB bursts a few reads
>> in sequence, and then moves on to the next event.  In that
>> scenario, you get as much stale results with or without a cache.
> 
> I disagree.  Results may be stale with the cache, but without the
> cache the results may be different, not stale.  They are different
> because they are read at different times, but all of them are valid.
> Each is a snapshot of a piece of memory at a certain moment.

Sigh, I don't know why I wrote "stale" there.  I meant "wrong,
inconsistent, useless, whatnot".  As in, if a thread changes
memory while GDB is reading it, you can get incoherent/self-inconsistent
results.  E.g., even if, between the inferior's threads, writes to
'struct { int a; int b; } ab;' are coordinated, say, with a mutex,
when printing 'ab', GDB can end up reading a chunk of the
structure's contents before the write, and another chunk after
the write, and present that frankenstein value to the user.
You can get such undefined results with or without a cache,
because the "certain moment" will be different for each of
the partial reads.  Even each partial read is not guaranteed
to be atomic.
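
The following small standalone program (illustration only, not GDB
code) mimics that situation: a writer thread keeps a and b equal
under a mutex, while a "debugger" loop reads the two members at
different times without the lock -- just as GDB reads target memory
in chunks -- and can observe a pair that never existed together.

  #include <pthread.h>
  #include <stdio.h>

  /* The inferior's data; volatile only so the deliberately racy
     demonstration reads are not optimized away.  */
  static volatile struct { int a; int b; } ab = { 0, 0 };
  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

  static void *
  writer (void *arg)
  {
    (void) arg;

    /* The inferior's own writers are properly coordinated: a and b
       are always updated together under the lock.  */
    for (int i = 1; i < 10000000; i++)
      {
        pthread_mutex_lock (&lock);
        ab.a = i;
        ab.b = i;
        pthread_mutex_unlock (&lock);
      }
    return NULL;
  }

  int
  main (void)
  {
    pthread_t t;

    pthread_create (&t, NULL, writer, NULL);

    /* The "debugger" reads member by member, at different moments,
       without taking the lock.  */
    for (int i = 0; i < 10000000; i++)
      {
        int a = ab.a;   /* first chunk */
        int b = ab.b;   /* second chunk, read later */

        if (a != b)
          {
            printf ("frankenstein value: a=%d b=%d\n", a, b);
            break;
          }
      }

    pthread_join (t, NULL);
    return 0;
  }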

>> IOW, even without the cache, running threads can change memory as
>> GDB reads it, and so the chances of hitting stale data with or
>> without a cache are practically the same.  OTOH, distinct target
>> events (and commands, etc.) can trigger quite apart (time-wise),
>> and that break the odd balance -- not flushing the cache
>> between events increases the changes of hitting stale data,
> 
> I suspect you meant "chances" instead of "changes".

Yes.

> 
>> compared to not having a cache.
> 
> Flushing the cache decreases the likelihood of getting stale data, but
> can't completely remove it.

Right.  The trick, IMO, is selecting flush points that make it so
that the chances of getting an incoherent value/memory chunk
are practically the same with or without a cache.  Places where
GDB needs to be sure to get a coherent, instantaneous
snapshot view of memory need to handle that specially (we do
that nowhere presently), e.g., by pausing all (affected) threads,
or perhaps even something else smarter (say, with kernel help,
the debugger declares an intention of reading a memory range,
and the kernel makes sure the associated pages don't change
from the debugger's view, COW'ing pages if some inferior thread
wants to change them while the debugger is accessing them).

> I am fine with using the cache in non-stop mode, as
> it helps performance, so we have to compromise.

Right.

>>>> Beside the predicate "is any thread running", another is "no thread is
>>>> resumed since last flushing".  Cache should be flushed when either is
>>>> true.
>> Not sure I understood that.
> 
> I meant, even if "none of the threads is running now", we may have to
> flush the cache if "they were resumed" (and all stopped now).

OK, I see what you mean.  I was assuming "all stopped now"
means "we've seen all threads report stops", but there are
indeed other ways to implement that predicate, and indeed
that case needs to be considered.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-04 12:08               ` Pedro Alves
  2013-12-04 15:38                 ` Tom Tromey
  2013-12-05  1:21                 ` Yao Qi
@ 2013-12-08  8:01                 ` Yao Qi
  2013-12-08  8:26                   ` Doug Evans
  2013-12-09 11:53                   ` Pedro Alves
  2 siblings, 2 replies; 39+ messages in thread
From: Yao Qi @ 2013-12-08  8:01 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 12/04/2013 08:07 PM, Pedro Alves wrote:
>> >+  if (non_stop)
>> >+    {
>> >+      /* In non-stop mode, one thread stops and caches the contents of
>> >+	 stack or code, while other running threads may change the
>> >+	 code (through JIT) or stack.  The target cache can get stale
>> >+	 without us being able to detect it.  Flush target cache
>> >+	 before handling each event.  */
>> >+      target_dcache_invalidate ();
>> >+    }
> I don't actually think this should be gated on non-stop.  It
> should be unconditional.  I mentioned before that it'd be most
> visible with non-stop, but that doesn't imply it's not
> visible with all-stop.  If we're seeing or going to wait for
> a target event, it's because the target was running,
> irrespective of all-stop/non-stop.  I really think we
> should invalidate the cache at all places we invalidate the
> overlay cache (wait_for_inferior, etc.), not just fetch_inferior_event.

After some discussion, it has become clear to me that we should flush
the target cache before handling events, in the callers of
handle_inferior_event.  I am wondering, though, why we don't flush the
cache inside handle_inferior_event itself?  Although flushing the cache
is not really the business of handle_inferior_event, doing it there
would avoid having to flush the cache in every caller of
handle_inferior_event.

> 
> For all-stop, it shouldn't really make a difference to
> performance, as we invalidate the cache on resumes anyway,
> and in all-stop, there must always be a resume prior to
> any stop...

If that is the right way to go, I'll move 'overlay_cache_invalid = 1'
to handle_inferior_event too.  WDYT?

gdb:

2013-12-08  Yao Qi  <yao@codesourcery.com>

	* infrun.c: Include "target-dcache.h".
	(fetch_inferior_event): Flush target cache.
---
 gdb/infrun.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 3b55583..0a12107 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -60,6 +60,7 @@
 #include "objfiles.h"
 #include "completer.h"
 #include "target-descriptions.h"
+#include "target-dcache.h"
 
 /* Prototypes for local functions */
 
@@ -3168,6 +3169,11 @@ handle_inferior_event (struct execution_control_state *ecs)
 {
   enum stop_kind stop_soon;
 
+  /* If we've got an event from target, it means the target was
+     running, so cache would be staled.  Flush target cache before
+     handling each event.  */
+  target_dcache_invalidate ();
+
   if (ecs->ws.kind == TARGET_WAITKIND_IGNORE)
     {
       /* We had an event in the inferior, but we are not interested in
-- 
1.7.7.6

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-08  8:01                 ` Yao Qi
@ 2013-12-08  8:26                   ` Doug Evans
  2013-12-09  1:45                     ` Yao Qi
  2013-12-09 11:53                   ` Pedro Alves
  1 sibling, 1 reply; 39+ messages in thread
From: Doug Evans @ 2013-12-08  8:26 UTC (permalink / raw)
  To: Yao Qi; +Cc: Pedro Alves, Mark Kettenis, gdb-patches

On Sat, Dec 7, 2013 at 11:59 PM, Yao Qi <yao@codesourcery.com> wrote:
> On 12/04/2013 08:07 PM, Pedro Alves wrote:
>>> >+  if (non_stop)
>>> >+    {
>>> >+      /* In non-stop mode, one thread stops and caches the contents of
>>> >+    stack or code, while other running threads may change the
>>> >+    code (through JIT) or stack.  The target cache can get stale
>>> >+    without us being able to detect it.  Flush target cache
>>> >+    before handling each event.  */
>>> >+      target_dcache_invalidate ();
>>> >+    }
>> I don't actually think this should be gated on non-stop.  It
>> should be unconditional.  I mentioned before that it'd be most
>> visible with non-stop, but that doesn't imply it's not
>> visible with all-stop.  If we're seeing or going to wait for
>> a target event, it's because the target was running,
>> irrespective of all-stop/non-stop.  I really think we
>> should invalidate the cache at all places we invalidate the
>> overlay cache (wait_for_inferior, etc.), not just fetch_inferior_event.
>
> After some discussion, it has become clear to me that we should flush
> the target cache before handling events, in the callers of
> handle_inferior_event.  I am wondering, though, why we don't flush the
> cache inside handle_inferior_event itself?  Although flushing the cache
> is not really the business of handle_inferior_event, doing it there
> would avoid having to flush the cache in every caller of
> handle_inferior_event.
>
>> For all-stop, it shouldn't really make a difference to
>> performance, as we invalidate the cache on resumes anyway,
>> and in all-stop, there must always be a resume prior to
>> any stop...
>
> If that is the right way to go, I'll move 'overlay_cache_invalid = 1'
> to handle_inferior_event too.  WDYT?
>
> gdb:
>
> 2013-12-08  Yao Qi  <yao@codesourcery.com>
>
>         * infrun.c: Include "target-dcache.h".
>         (fetch_inferior_event): Flush target cache.
> ---
>  gdb/infrun.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 3b55583..0a12107 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -60,6 +60,7 @@
>  #include "objfiles.h"
>  #include "completer.h"
>  #include "target-descriptions.h"
> +#include "target-dcache.h"
>
>  /* Prototypes for local functions */
>
> @@ -3168,6 +3169,11 @@ handle_inferior_event (struct execution_control_state *ecs)
>  {
>    enum stop_kind stop_soon;
>
> +  /* If we've got an event from target, it means the target was
> +     running, so cache would be staled.  Flush target cache before
> +     handling each event.  */
> +  target_dcache_invalidate ();
> +
>    if (ecs->ws.kind == TARGET_WAITKIND_IGNORE)
>      {
>        /* We had an event in the inferior, but we are not interested in
> --

If you're going for correctness here, note that this is just another heuristic.
It gets one closer, but as long as at least one thread is still running ....

Question: Does this obviate the need to call target_dcache_invalidate elsewhere?
If so, removing other uses that this subsumes should be part of this patch.

Nit: "cache would be staled" is bad English.
"the cache could be stale" would be better, though I'm sure there is
something even better than that.
[I suspect there's at least one too many commas in there too, but one
can nitpick too much. :-)]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-08  8:26                   ` Doug Evans
@ 2013-12-09  1:45                     ` Yao Qi
  2013-12-09 11:32                       ` Pedro Alves
  0 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-12-09  1:45 UTC (permalink / raw)
  To: Doug Evans; +Cc: Pedro Alves, Mark Kettenis, gdb-patches

On 12/08/2013 04:25 PM, Doug Evans wrote:
> If you're going for correctness here, note that this is just another heuristic.
> It gets one closer, but as long as at least one thread is still running ....
> 

I had reminded myself to add a comment like that, but forgot to do so
in the end.  It is added to the comment in the updated patch.

> Question: Does this obviate the need to call target_dcache_invalidate elsewhere?
> If so, removing other uses that this subsumes should be part of this patch.
> 

Presently, target_dcache_invalidate is called from a few places:

 - In commands on dcache and memattr,
 - In target_load and target_resume,
 - In tfind_1,
 - In prepare_execute_command.

I am wondering whether we may remove the call in prepare_execute_command,
because it is more "precise" to flush the cache for every event than for
every command.  However, I am not sure, so I attach the patch.

> Nit: "cache would be staled" is bad English.
> "the cache could be stale" would be better, though I'm sure there is
> something even better than that.
> [I suspect there's at least one too many commas in there too, but one
> can nitpick too much. :-)]

Fixed.

-- 
Yao (齐尧)

gdb:

2013-12-09  Yao Qi  <yao@codesourcery.com>

	* infrun.c: Include "target-dcache.h".
	(handle_inferior_event): Flush target cache.
---
 gdb/infrun.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 3b55583..2901d09 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -60,6 +60,7 @@
 #include "objfiles.h"
 #include "completer.h"
 #include "target-descriptions.h"
+#include "target-dcache.h"
 
 /* Prototypes for local functions */
 
@@ -3168,6 +3169,12 @@ handle_inferior_event (struct execution_control_state *ecs)
 {
   enum stop_kind stop_soon;
 
+  /* If we've got an event from target, it means the target was
+     running and cache could be stale.  Flush target cache before
+     handling each event.  This is just a heuristic.  Running threads
+     may modify target memory, but we don't get any event.  */
+  target_dcache_invalidate ();
+
   if (ecs->ws.kind == TARGET_WAITKIND_IGNORE)
     {
       /* We had an event in the inferior, but we are not interested in
-- 
1.7.7.6

gdb:

2013-12-09  Yao Qi  <yao@codesourcery.com>

	* top.c: Don't include "target-dcache.h".
	(prepare_execute_command): Remove the call to
	target_dcache_invalidate.
---
 gdb/top.c |    8 --------
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/gdb/top.c b/gdb/top.c
index 8ce1a9f..b7df81b 100644
--- a/gdb/top.c
+++ b/gdb/top.c
@@ -28,7 +28,6 @@
 #include "exceptions.h"
 #include <signal.h>
 #include "target.h"
-#include "target-dcache.h"
 #include "breakpoint.h"
 #include "gdbtypes.h"
 #include "expression.h"
@@ -332,13 +331,6 @@ prepare_execute_command (void)
   mark = value_mark ();
   cleanup = make_cleanup_value_free_to_mark (mark);
 
-  /* With multiple threads running while the one we're examining is
-     stopped, the dcache can get stale without us being able to detect
-     it.  For the duration of the command, though, use the dcache to
-     help things like backtrace.  */
-  if (non_stop)
-    target_dcache_invalidate ();
-
   return cleanup;
 }
 
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-09  1:45                     ` Yao Qi
@ 2013-12-09 11:32                       ` Pedro Alves
  0 siblings, 0 replies; 39+ messages in thread
From: Pedro Alves @ 2013-12-09 11:32 UTC (permalink / raw)
  To: Yao Qi; +Cc: Doug Evans, Mark Kettenis, gdb-patches

On 12/09/2013 01:44 AM, Yao Qi wrote:

> I am wondering whether we may remove the call in prepare_execute_command,
> because it is more "precise" to flush the cache for every event than for
> every command.  However, I am not sure, so I attach the patch.

No, because one can issue several commands even if the target
never reports any stop.  E.g., disassemble, wait a while,
memory changes, disassemble, etc.
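
As a purely hypothetical illustration (jit_region is a made-up
symbol, and no GDB output is shown):

  (gdb) disassemble jit_region
  ... minutes pass; a still-running thread rewrites jit_region ...
  (gdb) disassemble jit_region

Without the flush in prepare_execute_command, the second disassembly
could be served from the cache and show the old bytes, even though no
stop event ever arrived in between.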

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-08  8:01                 ` Yao Qi
  2013-12-08  8:26                   ` Doug Evans
@ 2013-12-09 11:53                   ` Pedro Alves
  2013-12-09 13:03                     ` Yao Qi
  1 sibling, 1 reply; 39+ messages in thread
From: Pedro Alves @ 2013-12-09 11:53 UTC (permalink / raw)
  To: Yao Qi; +Cc: Mark Kettenis, gdb-patches

On 12/08/2013 07:59 AM, Yao Qi wrote:

> After some discussion, it has become clear to me that we should flush
> the target cache before handling events, in the callers of
> handle_inferior_event.  I am wondering, though, why we don't flush the
> cache inside handle_inferior_event itself?  Although flushing the cache
> is not really the business of handle_inferior_event, doing it there
> would avoid having to flush the cache in every caller of
> handle_inferior_event.

The concern is that target_wait implementations may read memory.

wait_for_inferior & friends used to have a registers_changed call
and this comment:

      /* We have to invalidate the registers BEFORE calling target_wait
	 because they can be loaded from the target while in target_wait.
	 This makes remote debugging a bit more efficient for those
	 targets that provide critical registers as part of their normal
	 status mechanism. */

      overlay_cache_invalid = 1;
      registers_changed ();

      if (deprecated_target_wait_hook)
	ecs->ptid = deprecated_target_wait_hook (waiton_ptid, &ecs->ws, 0);
      else
	ecs->ptid = target_wait (waiton_ptid, &ecs->ws, 0);


Removed here:

 https://sourceware.org/ml/gdb-patches/2011-09/msg00122.html

That was easy to do given that we can't read registers of running
threads, so we're sure a new regcache will be created for the
just-stopped thread.  But for memory, it looks simpler and safer to me
to flush before target_wait instead of going through all possible
target_wait paths that could read memory and making sure to flush
there.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-09 11:53                   ` Pedro Alves
@ 2013-12-09 13:03                     ` Yao Qi
  2013-12-09 13:13                       ` Pedro Alves
  0 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-12-09 13:03 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 12/09/2013 07:53 PM, Pedro Alves wrote:
> The concern is that target_wait implementations may read memory.
>

We've already flushed the cache in target_resume.  We don't have to
worry about this, do we?

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-09 13:03                     ` Yao Qi
@ 2013-12-09 13:13                       ` Pedro Alves
  2013-12-09 13:58                         ` Yao Qi
  0 siblings, 1 reply; 39+ messages in thread
From: Pedro Alves @ 2013-12-09 13:13 UTC (permalink / raw)
  To: Yao Qi; +Cc: Mark Kettenis, gdb-patches

On 12/09/2013 01:01 PM, Yao Qi wrote:
> On 12/09/2013 07:53 PM, Pedro Alves wrote:
>> The concern is that target_wait implementations may read memory.
>>
> 
> We've already flushed the cache in target_resume.  We don't have to
> worry about this, do we?

We can have more stops than resumes.

 #1 - resume everything (1000 threads)
 #2 - event in one thread triggers, we call target_wait
 #3 - gdb decides to leave thread stopped.
 #4 - one hour passes, while threads poke at memory.
 #5 - another event triggers, and we call target_wait again

No resume happened between #2 and #5.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-09 13:13                       ` Pedro Alves
@ 2013-12-09 13:58                         ` Yao Qi
  2013-12-09 15:34                           ` Pedro Alves
  0 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-12-09 13:58 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 12/09/2013 09:13 PM, Pedro Alves wrote:
> We can have more stops than resumes.
>
>   #1 - resume everything (1000 threads)
>   #2 - event in one thread triggers, we call target_wait
>   #3 - gdb decides to leave thread stopped.
>   #4 - one hour passes, while threads poke at memory.
>   #5 - another event triggers, and we call target_wait again
>
> No resume happened between #2 and #5.

Thanks for the explanation.  IIUC, #2, #3, and #5 are the result of 
handle_inferior_event, where cache is flushed (with my patch applied).

"wait -> handle event -> wait" is like a loop or circle to me, and we
can flush at any point(s) of this circle, depending on what heuristic
we are using.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-09 13:58                         ` Yao Qi
@ 2013-12-09 15:34                           ` Pedro Alves
  2013-12-10  0:57                             ` Yao Qi
  0 siblings, 1 reply; 39+ messages in thread
From: Pedro Alves @ 2013-12-09 15:34 UTC (permalink / raw)
  To: Yao Qi; +Cc: Mark Kettenis, gdb-patches

On 12/09/2013 01:56 PM, Yao Qi wrote:
> On 12/09/2013 09:13 PM, Pedro Alves wrote:
>> We can have more stops than resumes.
>>
>>   #1 - resume everything (1000 threads)
>>   #2 - event in one thread triggers, we call target_wait
>>   #3 - gdb decides to leave thread stopped.
>>   #4 - one hour passes, while threads poke at memory.
>>   #5 - another event triggers, and we call target_wait again
>>
>> No resume happened between #2 and #5.
> 
> Thanks for the explanation.  IIUC, #2, #3, and #5 are the result of 
> handle_inferior_event, where cache is flushed (with my patch applied).

No, #2 happens before handle_inferior_event is called.

> 
> "wait -> handle event -> wait" is like a loop or circle to me, and we
> can flush at any point(s) of this circle, depending on what heuristic
> we are using.

Again, the point is making it so that the cache does not enlarge
the race window with the inferior itself.  IOW, make the cache
transparent WRT the chances of seeing a torn value, prologue, or
whatever.  Between starting to handle an event and finishing
it, a very short time passes.  Between finishing handling an
event and the next event, an unbounded amount of time passes.
If we don't flush the cache just before handling the event,
having the cache active gives a much, much wider race window
than not having the cache active.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-09 15:34                           ` Pedro Alves
@ 2013-12-10  0:57                             ` Yao Qi
  2013-12-10 10:23                               ` Pedro Alves
  0 siblings, 1 reply; 39+ messages in thread
From: Yao Qi @ 2013-12-10  0:57 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 12/09/2013 11:34 PM, Pedro Alves wrote:
>> "wait -> handle event -> wait" is like a loop or circle to me, and we
>> >can flush at any point(s) of this circle, depending on what heuristic
>> >we are using.
> Again, the point is making it so that the cache does not enlarge
> the race window with the inferior itself.  IOW, make the cache
> transparent WRT the chances of seeing a torn value, prologue, or
> whatever.  Between starting to handle an event and finishing
> it, a very short time passes.  Between finishing handling an
> event and the next event, an unbounded amount of time passes.
> If we don't flush the cache just before handling the event,
> having the cache active gives a much, much wider race window
> than not having the cache active.

OK, I am convinced.  Here is the patch to flush the cache in every
"overlay_cache_invalid = 1" place.  Besides this, I also invalidate
both the target cache and the overlay cache in
infrun_thread_stop_requested_callback, because handle_inferior_event
is called from it.

-- 
Yao (齐尧)

gdb:

2013-12-10  Yao Qi  <yao@codesourcery.com>

	* infrun.c: Include "target-dcache.h".
	(prepare_for_detach): Call target_dcache_invalidate.
	(wait_for_inferior): Likewise.
	(fetch_inferior_event): Likewise.
	(infrun_thread_stop_requested_callback): Likewise.  Set
	overlay_cache_invalid to 1.
---
 gdb/infrun.c |   24 ++++++++++++++++++++++++
 1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 3b55583..d8f9787 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -60,6 +60,7 @@
 #include "objfiles.h"
 #include "completer.h"
 #include "target-descriptions.h"
+#include "target-dcache.h"
 
 /* Prototypes for local functions */
 
@@ -2475,6 +2476,13 @@ infrun_thread_stop_requested_callback (struct thread_info *info, void *arg)
 
       old_chain = make_cleanup_restore_current_thread ();
 
+      overlay_cache_invalid = 1;
+      /* Flush target cache before starting to handle each event.
+	 Target was running and cache could be stale.  This is just a
+	 heuristic.  Running threads may modify target memory, but we
+	 don't get any event.  */
+      target_dcache_invalidate ();
+
       /* Go through handle_inferior_event/normal_stop, so we always
 	 have consistent output as if the stop event had been
 	 reported.  */
@@ -2677,6 +2685,11 @@ prepare_for_detach (void)
       memset (ecs, 0, sizeof (*ecs));
 
       overlay_cache_invalid = 1;
+      /* Flush target cache before starting to handle each event.
+	 Target was running and cache could be stale.  This is just a
+	 heuristic.  Running threads may modify target memory, but we
+	 don't get any event.  */
+      target_dcache_invalidate ();
 
       if (deprecated_target_wait_hook)
 	ecs->ptid = deprecated_target_wait_hook (pid_ptid, &ecs->ws, 0);
@@ -2740,6 +2753,12 @@ wait_for_inferior (void)
 
       overlay_cache_invalid = 1;
 
+      /* Flush target cache before starting to handle each event.
+	 Target was running and cache could be stale.  This is just a
+	 heuristic.  Running threads may modify target memory, but we
+	 don't get any event.  */
+      target_dcache_invalidate ();
+
       if (deprecated_target_wait_hook)
 	ecs->ptid = deprecated_target_wait_hook (waiton_ptid, &ecs->ws, 0);
       else
@@ -2805,6 +2824,11 @@ fetch_inferior_event (void *client_data)
     make_cleanup_restore_current_thread ();
 
   overlay_cache_invalid = 1;
+  /* Flush target cache before starting to handle each event.  Target
+     was running and cache could be stale.  This is just a heuristic.
+     Running threads may modify target memory, but we don't get any
+     event.  */
+  target_dcache_invalidate ();
 
   make_cleanup_restore_integer (&execution_direction);
   execution_direction = target_execution_direction ();
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-10  0:57                             ` Yao Qi
@ 2013-12-10 10:23                               ` Pedro Alves
  2013-12-10 12:02                                 ` Yao Qi
  0 siblings, 1 reply; 39+ messages in thread
From: Pedro Alves @ 2013-12-10 10:23 UTC (permalink / raw)
  To: Yao Qi; +Cc: Mark Kettenis, gdb-patches

On 12/10/2013 12:55 AM, Yao Qi wrote:

> 2013-12-10  Yao Qi  <yao@codesourcery.com>
> 
> 	* infrun.c: Include "target-dcache.h".
> 	(prepare_for_detach): Call target_dcache_invalidate.
> 	(wait_for_inferior): Likewise.
> 	(fetch_inferior_event): Likewise.
> 	(infrun_thread_stop_requested_callback): Likewise.  Set
> 	overlay_cache_invalid to 1.

OK.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] skip_prolgoue (amd64)
  2013-12-10 10:23                               ` Pedro Alves
@ 2013-12-10 12:02                                 ` Yao Qi
  0 siblings, 0 replies; 39+ messages in thread
From: Yao Qi @ 2013-12-10 12:02 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

On 12/10/2013 06:23 PM, Pedro Alves wrote:
> On 12/10/2013 12:55 AM, Yao Qi wrote:
>
>> 2013-12-10  Yao Qi  <yao@codesourcery.com>
>>
>> 	* infrun.c: Include "target-dcache.h".
>> 	(prepare_for_detach): Call target_dcache_invalidate.
>> 	(wait_for_inferior): Likewise.
>> 	(fetch_inferior_event): Likewise.
>> 	(infrun_thread_stop_requested_callback): Likewise.  Set
>> 	overlay_cache_invalid to 1.
>
> OK.
>

Thanks for the review.  Patch is pushed.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] Perf test case: skip-prologue
  2013-12-03  7:34   ` Yao Qi
@ 2013-12-10 12:45     ` Yao Qi
  0 siblings, 0 replies; 39+ messages in thread
From: Yao Qi @ 2013-12-10 12:45 UTC (permalink / raw)
  To: gdb-patches

Patch 1/3 and 2/3 are pushed.

On 12/03/2013 03:32 PM, Yao Qi wrote:
> +    def _test(self):
> +        for i in range(1, self.count):

We can replace "_" with "i".

> +            # Insert breakpoints on function f1 and f2.
> +            bp1 = gdb.Breakpoint("f1")
> +            bp2 = gdb.Breakpoint("f2")
> +            # Remove them.
> +            bp1.delete()
> +            bp2.delete()
> +
> +    def warm_up(self):
> +        self._test()
> +
> +    def execute_test(self):
> +        for i in range(1, 4):
> +            self.measure.measure(self._test, i)

"code-cache" should be flushed before starting measuring.

If there are no objections or comments, I'll commit it in
three days.

-- 
Yao (齐尧)

gdb/testsuite:

2013-12-10  Yao Qi  <yao@codesourcery.com>

	* gdb.perf/skip-prologue.c: New.
	* gdb.perf/skip-prologue.exp: New.
	* gdb.perf/skip-prologue.py: New.
---
 gdb/testsuite/gdb.perf/skip-prologue.c   |   48 +++++++++++++++++++++
 gdb/testsuite/gdb.perf/skip-prologue.exp |   69 ++++++++++++++++++++++++++++++
 gdb/testsuite/gdb.perf/skip-prologue.py  |   42 ++++++++++++++++++
 3 files changed, 159 insertions(+), 0 deletions(-)
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.c
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.exp
 create mode 100644 gdb/testsuite/gdb.perf/skip-prologue.py

diff --git a/gdb/testsuite/gdb.perf/skip-prologue.c b/gdb/testsuite/gdb.perf/skip-prologue.c
new file mode 100644
index 0000000..a195f55
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.c
@@ -0,0 +1,48 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright (C) 2013 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* GDB will analyze the prologue of functions f1 and f2.  */
+
+struct foo
+{
+  int a[32];
+};
+
+static int
+f1 (void)
+{
+  return 0;
+}
+
+static double
+f2 (int a, long long b, double c, struct foo f)
+{
+  f.a[0] = a + (int) b + c;
+
+  return c + 0.2;
+}
+
+int
+main (void)
+{
+  struct foo f;
+
+  f1 ();
+  f2 (0, 0, 0.1, f);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.perf/skip-prologue.exp b/gdb/testsuite/gdb.perf/skip-prologue.exp
new file mode 100644
index 0000000..53ada49
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.exp
@@ -0,0 +1,69 @@
+# Copyright (C) 2013 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test case is to test the performance of GDB when it analyzes the
+# function prologue.
+# There is one parameter in this test:
+# - SKIP_PROLOGUE_COUNT is the number of times GDB inserts breakpoints on
+#   functions f1 and f2 and removes them.
+
+load_lib perftest.exp
+
+if [skip_perf_tests] {
+    return 0
+}
+
+standard_testfile .c
+set executable $testfile
+set expfile $testfile.exp
+
+# make check-perf RUNTESTFLAGS='skip-prologue.exp SKIP_PROLOGUE_COUNT=500'
+if ![info exists SKIP_PROLOGUE_COUNT] {
+    set SKIP_PROLOGUE_COUNT 500
+}
+
+PerfTest::assemble {
+    global srcdir subdir srcfile binfile
+
+    if { [gdb_compile "$srcdir/$subdir/$srcfile" ${binfile} executable {debug}] != "" } {
+	return -1
+    }
+
+    return 0
+} {
+    global binfile
+    global gdb_prompt
+
+    clean_restart $binfile
+
+    if ![runto_main] {
+	fail "Can't run to main"
+	return -1
+    }
+} {
+    global SKIP_PROLOGUE_COUNT
+
+    set test "run"
+    gdb_test_multiple "python SkipPrologue\($SKIP_PROLOGUE_COUNT\).run()" $test {
+	-re "Breakpoint $decimal at \[^\n\]*\n" {
+	    # GDB prints some messages on breakpoint creation.
+	    # Consume the output to avoid internal buffer full.
+	    exp_continue
+	}
+	-re ".*$gdb_prompt $" {
+	    pass $test
+	}
+    }
+}
diff --git a/gdb/testsuite/gdb.perf/skip-prologue.py b/gdb/testsuite/gdb.perf/skip-prologue.py
new file mode 100644
index 0000000..0c71e43
--- /dev/null
+++ b/gdb/testsuite/gdb.perf/skip-prologue.py
@@ -0,0 +1,42 @@
+# Copyright (C) 2013 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test case is to test the speed of GDB when it is analyzing the
+# function prologue.
+
+from perftest import perftest
+
+class SkipPrologue(perftest.TestCaseWithBasicMeasurements):
+    def __init__(self, count):
+        super(SkipPrologue, self).__init__("skip-prologue")
+        self.count = count
+
+    def _test(self):
+        for _ in range(1, self.count):
+            # Insert breakpoints on function f1 and f2.
+            bp1 = gdb.Breakpoint("f1")
+            bp2 = gdb.Breakpoint("f2")
+            # Remove them.
+            bp1.delete()
+            bp2.delete()
+
+    def warm_up(self):
+        self._test()
+
+    def execute_test(self):
+        for i in range(1, 4):
+            gdb.execute("set code-cache off")
+            gdb.execute("set code-cache on")
+            self.measure.measure(self._test, i)
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2013-12-10 12:45 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-29 14:27 [PATCH 0/3] Use target_read_code in skip_prologue Yao Qi
2013-11-29 14:27 ` [PATCH 2/3] skip_prolgoue (amd64) Yao Qi
2013-11-29 14:38   ` Mark Kettenis
2013-11-29 18:55     ` Mark Kettenis
2013-11-30  3:40       ` Yao Qi
2013-11-30 12:01         ` Pedro Alves
2013-12-02  7:34           ` Yao Qi
2013-12-03 18:28             ` Pedro Alves
2013-12-04  2:34             ` Yao Qi
2013-12-04 12:08               ` Pedro Alves
2013-12-04 15:38                 ` Tom Tromey
2013-12-04 18:31                   ` Doug Evans
2013-12-05 11:31                   ` Pedro Alves
2013-12-05  1:21                 ` Yao Qi
2013-12-05 12:08                   ` Pedro Alves
2013-12-05 14:08                     ` Yao Qi
2013-12-05 14:37                       ` Pedro Alves
2013-12-08  8:01                 ` Yao Qi
2013-12-08  8:26                   ` Doug Evans
2013-12-09  1:45                     ` Yao Qi
2013-12-09 11:32                       ` Pedro Alves
2013-12-09 11:53                   ` Pedro Alves
2013-12-09 13:03                     ` Yao Qi
2013-12-09 13:13                       ` Pedro Alves
2013-12-09 13:58                         ` Yao Qi
2013-12-09 15:34                           ` Pedro Alves
2013-12-10  0:57                             ` Yao Qi
2013-12-10 10:23                               ` Pedro Alves
2013-12-10 12:02                                 ` Yao Qi
2013-12-04 17:42             ` Doug Evans
2013-12-04 18:00             ` Doug Evans
2013-12-04 17:54           ` Doug Evans
2013-12-05  1:39             ` Yao Qi
2013-12-05 11:47             ` Pedro Alves
2013-11-29 14:36 ` [PATCH 1/3] Use target_read_code in skip_prologue (i386) Yao Qi
2013-11-30 11:43   ` Pedro Alves
2013-11-29 14:38 ` [PATCH 3/3] Perf test case: skip-prologue Yao Qi
2013-12-03  7:34   ` Yao Qi
2013-12-10 12:45     ` Yao Qi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).