public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
From: Masami Hiramatsu <mhiramat@redhat.com>
To: Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@elte.hu>,
	        Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	        lkml<linux-kernel@vger.kernel.org>
Cc: systemtap<systemtap@sources.redhat.com>,
	        DLE<dle-develop@lists.sourceforge.net>,
	        Masami Hiramatsu <mhiramat@redhat.com>,
	        Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	        Ingo Molnar <mingo@elte.hu>,
	Jim Keniston <jkenisto@us.ibm.com>,
	        Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	        Christoph Hellwig <hch@infradead.org>,
	        Steven Rostedt <rostedt@goodmis.org>,
	        Frederic Weisbecker <fweisbec@gmail.com>,
	        "H. Peter Anvin" <hpa@zytor.com>,
	Anders Kaseorg <andersk@ksplice.com>,
	        Tim Abbott <tabbott@ksplice.com>,
	Andi Kleen <andi@firstfloor.org>,
	        Jason Baron <jbaron@redhat.com>,
	        Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Subject: [PATCH -tip v5 09/10] [RFC] x86: Introduce generic jump patching 	without stop_machine
Date: Mon, 23 Nov 2009 23:19:00 -0000	[thread overview]
Message-ID: <20091123232226.22071.57201.stgit@dhcp-100-2-132.bos.redhat.com> (raw)
In-Reply-To: <20091123232115.22071.71558.stgit@dhcp-100-2-132.bos.redhat.com>

Add text_poke_fixup() which takes a fixup address to where a processor
jumps if it hits the modifying address while code modifying.
text_poke_fixup() does following steps for this purpose.

 1. Setup int3 handler for fixup.
 2. Put a breakpoint (int3) on the first byte of modifying region,
    and synchronize code on all CPUs.
 3. Modify other bytes of modifying region, and synchronize code on all CPUs.
 4. Modify the first byte of modifying region, and synchronize code
    on all CPUs.
 5. Clear int3 handler.

Thus, if some other processor execute modifying address when step2 to step4,
it will be jumped to fixup code.

This still has many limitations for modifying multi-instructions at once.
However, it is enough for 'a 5 bytes nop replacing with a jump' patching,
because;
 - Replaced instruction is just one instruction, which is executed atomically.
 - Replacing instruction is a jump, so we can set fixup address where the jump
   goes to.

Changes in v5
 - Add some comments.
 - Use smp_wmb()/smp_rmb()
 - Remove unneeded sync_core_all()

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Anders Kaseorg <andersk@ksplice.com>
Cc: Tim Abbott <tabbott@ksplice.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---

 arch/x86/include/asm/alternative.h |   11 ++++
 arch/x86/kernel/alternative.c      |  102 ++++++++++++++++++++++++++++++++++++
 kernel/kprobes.c                   |    2 -
 3 files changed, 114 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index c240efc..b48ca4d 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -160,4 +160,15 @@ static inline void apply_paravirt(struct paravirt_patch_site *start,
  */
 extern void *text_poke(void *addr, const void *opcode, size_t len);
 
+/*
+ * Setup int3 trap and fixup execution for cross-modifying on SMP case.
+ * If the other cpus execute modifying instruction, it will hit int3
+ * and go to fixup code. This just provides a minimal safety check.
+ * Additional checks/restrictions are required for completely safe
+ * cross-modifying.
+ */
+extern void *text_poke_fixup(void *addr, const void *opcode, size_t len,
+			     void *fixup);
+extern void sync_core_all(void);
+
 #endif /* _ASM_X86_ALTERNATIVE_H */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index de7353c..04576e4 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -4,6 +4,7 @@
 #include <linux/list.h>
 #include <linux/stringify.h>
 #include <linux/kprobes.h>
+#include <linux/kdebug.h>
 #include <linux/mm.h>
 #include <linux/vmalloc.h>
 #include <linux/memory.h>
@@ -552,3 +553,104 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
 	local_irq_restore(flags);
 	return addr;
 }
+
+/*
+ * On pentium series, Unsynchronized cross-modifying code
+ * operations can cause unexpected instruction execution results.
+ * So after code modified, we should synchronize it on each processor.
+ */
+static void __kprobes __local_sync_core(void *info)
+{
+	sync_core();
+}
+
+void __kprobes sync_core_all(void)
+{
+	on_each_cpu(__local_sync_core, NULL, 1);
+}
+
+/* Safely cross-code modifying with fixup address */
+static void *patch_fixup_from;
+static void *patch_fixup_addr;
+
+static int __kprobes patch_exceptions_notify(struct notifier_block *self,
+					      unsigned long val, void *data)
+{
+	struct die_args *args = data;
+	struct pt_regs *regs = args->regs;
+
+	smp_rmb();
+
+	if (likely(!patch_fixup_from))
+		return NOTIFY_DONE;
+
+	if (val != DIE_INT3 || !regs || user_mode_vm(regs) ||
+	    (unsigned long)patch_fixup_from != regs->ip)
+		return NOTIFY_DONE;
+
+	args->regs->ip = (unsigned long)patch_fixup_addr;
+	return NOTIFY_STOP;
+}
+
+/**
+ * text_poke_fixup() -- cross-modifying kernel text with fixup address.
+ * @addr:	Modifying address.
+ * @opcode:	New instruction.
+ * @len:	length of modifying bytes.
+ * @fixup:	Fixup address.
+ *
+ * Note: You must backup replaced instructions before calling this,
+ * if you need to recover it.
+ * Note: Must be called under text_mutex.
+ */
+void *__kprobes text_poke_fixup(void *addr, const void *opcode, size_t len,
+				void *fixup)
+{
+	static const unsigned char int3_insn = BREAKPOINT_INSTRUCTION;
+	static const int int3_size = sizeof(int3_insn);
+
+	/* Replacing 1 byte can be done atomically. */
+	if (unlikely(len <= 1))
+		return text_poke(addr, opcode, len);
+
+	/* Preparing fixup address */
+	patch_fixup_addr = fixup;
+	patch_fixup_from = (u8 *)addr + int3_size; /* IP address after int3 */
+	smp_wmb();
+
+	/* Cap by an int3 - expecting synchronously */
+	text_poke(addr, &int3_insn, int3_size);
+
+	/* Replace tail bytes */
+	text_poke((char *)addr + int3_size, (const char *)opcode + int3_size,
+		  len - int3_size);
+	/* Synchronous code cache */
+	sync_core_all();
+
+	/* Replace int3 with head byte - expecting synchronously */
+	text_poke(addr, opcode, int3_size);
+
+	/*
+	 * Sync core again - this is for waiting for disabled IRQ code
+	 * quiescent state, IOW, waiting for all running int3 fixup
+	 * handlers.
+	 */
+	sync_core_all();
+
+	/* Cleanup fixup address */
+	patch_fixup_from = NULL;
+	smp_wmb();
+	return addr;
+}
+
+static struct notifier_block patch_exceptions_nb = {
+	.notifier_call = patch_exceptions_notify,
+	.priority = 0x7fffffff /* we need to be notified first */
+};
+
+static int __init patch_init(void)
+{
+	return register_die_notifier(&patch_exceptions_nb);
+}
+
+arch_initcall(patch_init);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 1e862ed..22c2ae5 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1298,7 +1298,7 @@ EXPORT_SYMBOL_GPL(unregister_kprobes);
 
 static struct notifier_block kprobe_exceptions_nb = {
 	.notifier_call = kprobe_exceptions_notify,
-	.priority = 0x7fffffff /* we need to be notified first */
+	.priority = 0x7ffffff0 /* High priority, but not first.  */
 };
 
 unsigned long __weak arch_deref_entry_point(void *entry)


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

  parent reply	other threads:[~2009-11-23 23:19 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-23 23:18 [PATCH -tip v5 00/10] kprobes: Kprobes jump optimization support Masami Hiramatsu
2009-11-23 23:18 ` [PATCH -tip v5 02/10] kprobes: Introduce generic insn_slot framework Masami Hiramatsu
2009-11-23 23:18 ` [PATCH -tip v5 01/10] kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE Masami Hiramatsu
2009-11-23 23:19 ` [PATCH -tip v5 04/10] kprobes: Jump optimization sysctl interface Masami Hiramatsu
2009-11-23 23:19 ` [PATCH -tip v5 05/10] kprobes/x86: Boost probes when reentering Masami Hiramatsu
2009-11-23 23:19 ` [PATCH -tip v5 07/10] kprobes/x86: Support kprobes jump optimization on x86 Masami Hiramatsu
2009-11-24  3:15   ` Frederic Weisbecker
2009-11-24 16:27   ` Jason Baron
2009-11-24 17:46     ` Masami Hiramatsu
2009-11-25 16:12       ` Masami Hiramatsu
2009-11-24 16:42   ` H. Peter Anvin
2009-11-24 17:01     ` Masami Hiramatsu
2009-11-23 23:19 ` [PATCH -tip v5 03/10] kprobes: Introduce kprobes jump optimization Masami Hiramatsu
2009-11-24  2:44   ` Frederic Weisbecker
2009-11-24  3:31     ` Frederic Weisbecker
2009-11-24 15:32       ` Masami Hiramatsu
2009-11-24 20:14         ` Frederic Weisbecker
2009-11-24 20:59           ` Masami Hiramatsu
2009-11-25 21:09             ` Steven Rostedt
2009-11-25 21:30               ` Masami Hiramatsu
2009-11-24 21:12           ` H. Peter Anvin
2009-11-24 15:31     ` Masami Hiramatsu
2009-11-24 19:45       ` Frederic Weisbecker
2009-11-24 21:15         ` Masami Hiramatsu
2009-11-23 23:19 ` [PATCH -tip v5 06/10] kprobes/x86: Cleanup save/restore registers Masami Hiramatsu
2009-11-24  2:51   ` Frederic Weisbecker
2009-11-24 15:36     ` Masami Hiramatsu
2009-11-24 20:19       ` Frederic Weisbecker
2009-11-24 15:40     ` Frank Ch. Eigler
2009-11-24 20:20       ` Frederic Weisbecker
2009-11-23 23:19 ` [PATCH -tip v5 10/10] [RFC] kprobes/x86: Use text_poke_fixup() for jump optimization Masami Hiramatsu
2009-11-23 23:19 ` [PATCH -tip v5 08/10] kprobes: Add documents of " Masami Hiramatsu
2009-11-23 23:19 ` Masami Hiramatsu [this message]
2009-11-24  2:03 ` [PATCH -tip v5 00/10] kprobes: Kprobes jump optimization support Frederic Weisbecker
2009-11-24  3:20   ` Frederic Weisbecker
2009-11-24  7:53     ` Ingo Molnar
2009-11-24 16:03       ` Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091123232226.22071.57201.stgit@dhcp-100-2-132.bos.redhat.com \
    --to=mhiramat@redhat.com \
    --cc=ananth@in.ibm.com \
    --cc=andersk@ksplice.com \
    --cc=andi@firstfloor.org \
    --cc=dle-develop@lists.sourceforge.net \
    --cc=fweisbec@gmail.com \
    --cc=hch@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jbaron@redhat.com \
    --cc=jkenisto@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=mingo@elte.hu \
    --cc=rostedt@goodmis.org \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=systemtap@sources.redhat.com \
    --cc=tabbott@ksplice.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).