public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	Frederic Weisbecker <fweisbec@gmail.com>,
		Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
		lkml <linux-kernel@vger.kernel.org>,
		systemtap <systemtap@sources.redhat.com>,
		DLE <dle-develop@lists.sourceforge.net>,
		Jim Keniston <jkenisto@us.ibm.com>,
		Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
		Christoph Hellwig <hch@infradead.org>,
		Steven Rostedt <rostedt@goodmis.org>,
		"H. Peter Anvin" <hpa@zytor.com>,
		Anders Kaseorg <andersk@ksplice.com>,
		Tim Abbott <tabbott@ksplice.com>,
	Andi Kleen <andi@firstfloor.org>,
		Jason Baron <jbaron@redhat.com>
Subject: Re: [PATCH -tip v3&10 07/18] x86: Add text_poke_smp for SMP cross 	modifying code
Date: Wed, 03 Mar 2010 00:48:00 -0000	[thread overview]
Message-ID: <20100303004814.GA15029@Krystal> (raw)
In-Reply-To: <4B8745AC.2070702@redhat.com>

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Mathieu Desnoyers wrote:
> > * Masami Hiramatsu (mhiramat@redhat.com) wrote:
> [...]
> >> +
> >> +/*
> >> + * Cross-modifying kernel text with stop_machine().
> >> + * This code originally comes from immediate value.
> >> + */
> >> +static atomic_t stop_machine_first;
> >> +static int wrote_text;
> >> +
> >> +struct text_poke_params {
> >> +	void *addr;
> >> +	const void *opcode;
> >> +	size_t len;
> >> +};
> >> +
> >> +static int __kprobes stop_machine_text_poke(void *data)
> >> +{
> >> +	struct text_poke_params *tpp = data;
> >> +
> >> +	if (atomic_dec_and_test(&stop_machine_first)) {
> >> +		text_poke(tpp->addr, tpp->opcode, tpp->len);
> >> +		smp_wmb();	/* Make sure other cpus see that this has run */
> >> +		wrote_text = 1;
> >> +	} else {
> >> +		while (!wrote_text)
> >> +			smp_rmb();
> >> +		sync_core();
> >
> > Hrm, there is a problem in there. The last loop, when wrote_text becomes
> > true, does not perform any smp_mb(), so you end up in a situation where
> > cpus in the "else" branch may never issue any memory barrier. I'd rather
> > do:
> 
> Hmm, so how about this? :)
> ---
> } else {
> 	do {
> 		smp_rmb();
> 	while (!wrote_text);
> 	sync_core();
> }
> ---
> 

The ordering we are looking for here are:

Write-side: smp_wmb() orders text_poke stores before store to wrote_text.

Read-side: order wrote_text load before subsequent execution of modified
           instructions.

Here again, strictly speaking, wrote_text load is not ordered with respect to
following instructions. So maybe it's fine on x86-TSO specifically, but I would
not count on this kind of synchronization to work in the general case.

Given the very small expected performance impact of this code path, I would
recomment using the more solid/generic alternative below. If there is really a
gain to get by creating this weird wait loop with strange memory barrier
semantics, fine, otherwise I'd be reluctant to accept your proposals as
obviously correct.

If you really, really want to go down the route of proving the correctness of
your memory barrier usage, I can recommend looking at the memory barrier formal
verification framework I did as part of my thesis. But, really, in this case,
the performance gain is just not there, so there is no point in spending time
trying to prove this.

Thanks,

Mathieu

> >
> > +static volatile int wrote_text;
> >
> > ...
> >
> > +static int __kprobes stop_machine_text_poke(void *data)
> > +{
> > +	struct text_poke_params *tpp = data;
> > +
> > +	if (atomic_dec_and_test(&stop_machine_first)) {
> > +		text_poke(tpp->addr, tpp->opcode, tpp->len);
> > +		smp_wmb();	/* order text_poke stores before store to wrote_text */
> > +		wrote_text = 1;
> > +	} else {
> > +		while (!wrote_text)
> > +			cpu_relax();
> > +		smp_mb();	/* order wrote_text load before following execution */
> > +	}
> >
> > If you don't like the "volatile int" definition of wrote_text, then we
> > should probably use the ACCESS_ONCE() macro instead.
> 
> hm, yeah, volatile will be required.
> 
> Thank you,
> 
> 
> -- 
> Masami Hiramatsu
> e-mail: mhiramat@redhat.com
> 
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency Consultant
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2010-03-03  0:48 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-02-25 13:29 [PATCH -tip v3&10 00/18] perf-probe updates - optprobe, elfutils and lazy matching Masami Hiramatsu
2010-02-25 13:29 ` [PATCH -tip v3&10 02/18] kprobes: Introduce generic insn_slot framework Masami Hiramatsu
2010-02-25 15:21   ` Mathieu Desnoyers
2010-03-02  2:55     ` Masami Hiramatsu
2010-03-03  0:19       ` Masami Hiramatsu
2010-03-03  0:37         ` Mathieu Desnoyers
2010-03-03  0:36           ` Masami Hiramatsu
2010-02-25 19:29   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:29 ` [PATCH -tip v3&10 04/18] kprobes: Jump optimization sysctl interface Masami Hiramatsu
2010-02-25 19:29   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:29 ` [PATCH -tip v3&10 05/18] kprobes/x86: Boost probes when reentering Masami Hiramatsu
2010-02-25 19:30   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:29 ` [PATCH -tip v3&10 03/18] kprobes: Introduce kprobes jump optimization Masami Hiramatsu
2010-02-25 19:30   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:29 ` [PATCH -tip v3&10 01/18] kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE Masami Hiramatsu
2010-02-25 15:16   ` Mathieu Desnoyers
2010-02-25 19:29   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:29 ` [PATCH -tip v3&10 06/18] kprobes/x86: Cleanup save/restore registers Masami Hiramatsu
2010-02-25 19:30   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:30 ` [PATCH -tip v3&10 12/18] perf probe: Fix bugs in line range finder Masami Hiramatsu
2010-02-25 19:32   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:30 ` [PATCH -tip v3&10 07/18] x86: Add text_poke_smp for SMP cross modifying code Masami Hiramatsu
2010-02-25 15:38   ` Mathieu Desnoyers
2010-02-26  3:50     ` Masami Hiramatsu
2010-03-03  0:48       ` Mathieu Desnoyers [this message]
2010-03-03  0:57         ` Masami Hiramatsu
2010-02-25 19:31   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:30 ` [PATCH -tip v3&10 09/18] kprobes: Add documents of jump optimization Masami Hiramatsu
2010-02-25 19:31   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:30 ` [PATCH -tip v3&10 11/18] perf probe: Update perf probe document Masami Hiramatsu
2010-02-25 19:31   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:30 ` [PATCH -tip v3&10 08/18] kprobes/x86: Support kprobes jump optimization on x86 Masami Hiramatsu
2010-02-25 19:31   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:30 ` [PATCH -tip v3&10 10/18] perf probe: Do not show --line option without dwarf support Masami Hiramatsu
2010-02-25 19:31   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:31 ` [PATCH -tip v3&10 13/18] perf probe: Rename probe finder functions Masami Hiramatsu
2010-02-25 19:32   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:31 ` [PATCH -tip v3&10 17/18] perf probe: show more lines after last line Masami Hiramatsu
2010-02-25 19:33   ` [tip:perf/probes] perf probe: Show " tip-bot for Masami Hiramatsu
2010-02-25 13:31 ` [PATCH -tip v3&10 14/18] perf probe: Use elfutils-libdw for analyzing debuginfo Masami Hiramatsu
2010-02-25 19:32   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:31 ` [PATCH -tip v3&10 15/18] perf probe: Use libdw callback routines Masami Hiramatsu
2010-02-25 19:33   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 13:31 ` [PATCH -tip v3&10 18/18] perf probe: Add lazy line matching support Masami Hiramatsu
2010-02-25 19:33   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu
2010-02-25 14:12 ` [PATCH -tip v3&10 16/18] perf probe: Check function address range strictly in line finder Masami Hiramatsu
2010-02-25 19:33   ` [tip:perf/probes] " tip-bot for Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100303004814.GA15029@Krystal \
    --to=mathieu.desnoyers@efficios.com \
    --cc=ananth@in.ibm.com \
    --cc=andersk@ksplice.com \
    --cc=andi@firstfloor.org \
    --cc=dle-develop@lists.sourceforge.net \
    --cc=fweisbec@gmail.com \
    --cc=hch@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jbaron@redhat.com \
    --cc=jkenisto@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@redhat.com \
    --cc=mingo@elte.hu \
    --cc=rostedt@goodmis.org \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=systemtap@sources.redhat.com \
    --cc=tabbott@ksplice.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).