public inbox for systemtap@sourceware.org
From: Richard J Moore <richardj_moore@uk.ibm.com>
To: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>,
	Mathieu Desnoyers <compudj@krystal.dyndns.org>,
	Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>,
	Karim Yaghmour <karim@opersys.com>,
	Masami Hiramatsu <masami.hiramatsu@gmail.com>,
	michel.dagenais@polymtl.ca, Roland McGrath <roland@redhat.com>,
	Satoshi Oshima <soshima@redhat.com>,
	sugita@sdl.hitachi.co.jp, systemtap@sources.redhat.com
Subject: Re: Hitachi djprobe mechanism
Date: Mon, 01 Aug 2005 08:44:00 -0000
Message-ID: <OF331D042E.CCADD212-ON41257050.002DC63A-41257050.002F8168@uk.ibm.com>
In-Reply-To: <20050731220304.GJ3726@bragg.suse.de>





There is another issue to consider when looking into using probes other
than int3:

Intel erratum 54 - Unsynchronized Cross-modifying code - refers to the
practice of modifying code on one processor where another has prefetched
the unmodified version of the code. Intel states that unpredictable general
protection faults may result if a synchronizing instruction (iret, int,
int3, cpuid, etc.) is not executed on the second processor before it
executes the prefetched, out-of-date copy of the instruction.

When we became aware of this I had a long discussion with Intel's
microarchitecture guys. It turns out that the reason for this erratum
(which incidentally Intel does not intend to fix) is that the trace
cache - the stream of micro-ops resulting from instruction interpretation -
cannot be guaranteed to be valid. Reading between the lines, I assume this
issue arises because of optimization done in the trace cache, where it is
no longer possible to identify the original instruction boundaries. If the
CPU discovers that the trace cache has been invalidated because of
unsynchronized cross-modification then instruction execution will be
aborted with a GPF. Further discussion with Intel revealed that replacing
the first opcode byte with an int3 would not be subject to this erratum.

So, is cmpxchg reliable? One has to guarantee more than mere atomicity.
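
A minimal sketch of how the int3 observation could be used when writing a
longer instruction. The only thing taken from the discussion with Intel is
that replacing the first opcode byte with int3 is not subject to the
erratum; the three-step ordering, the names patch_via_int3() and
sync_all_cpus(), and the use of a plain byte buffer in place of live kernel
text are all assumptions made for illustration. A real implementation would
also need a breakpoint handler for CPUs that hit the int3 while patching is
in progress, which is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

/* Hypothetical placeholder: real code would make every other CPU execute
 * a synchronizing instruction here (for example from an IPI handler). */
static void sync_all_cpus(void)
{
}

/* Replace len bytes at site with new_insn. The first byte holds int3 for
 * as long as the tail bytes are being rewritten, so no CPU can start
 * executing a half-written multi-byte instruction from its first byte. */
static void patch_via_int3(volatile uint8_t *site, const uint8_t *new_insn,
                           size_t len)
{
    assert(len >= 1);

    site[0] = INT3_OPCODE;          /* step 1: single-byte int3 write */
    sync_all_cpus();

    for (size_t i = 1; i < len; i++)
        site[i] = new_insn[i];      /* step 2: rewrite the tail bytes */
    sync_all_cpus();

    site[0] = new_insn[0];          /* step 3: publish the new opcode */
    sync_all_cpus();
}
```

Whether each of these single-byte and tail writes is itself safe on a given
CPU is exactly the question raised here; the sketch only shows the ordering.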



--
Richard J Moore
IBM Advanced Linux Response Team - Linux Technology Centre
MOBEX: 264807; Mobile (+44) (0)7739-875237
Office: (+44) (0)1962-817072


                                                                           
Andi Kleen <ak@suse.de> wrote on 31/07/2005 23:03:

To:      Mathieu Desnoyers <compudj@krystal.dyndns.org>
cc:      Andi Kleen <ak@suse.de>, Karim Yaghmour <karim@opersys.com>,
         Masami Hiramatsu <masami.hiramatsu@gmail.com>,
         Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>,
         Roland McGrath <roland@redhat.com>,
         Richard J Moore/UK/IBM@IBMGB, systemtap@sources.redhat.com,
         sugita@sdl.hitachi.co.jp, Satoshi Oshima <soshima@redhat.com>,
         michel.dagenais@polymtl.ca
Subject: Re: Hitachi djprobe mechanism




On Sat, Jul 30, 2005 at 12:47:47PM -0400, Mathieu Desnoyers wrote:
> * Andi Kleen (ak@suse.de) wrote:
> > > As I see it, the write in memory is atomic, but not the instruction
> > > fetching. In that case, the reader would see an inconsistent last jmp
> > > address byte.
> >
> > Yes, you're right. cmpxchg only helps when the replaced instruction
> > is >= the new instruction. For smaller instructions only an IPI to
> > stop all CPUs works.
> >
>
> It was not exactly the point of my comment. If we try to overwrite an
> existing instruction, without any marker, two cases may show up :
>
> * the instruction to replace is >= the jmp instruction (5 bytes)
>
> It has been suggested that using cmpxchg8 would solve this problem.
> cmpxchg8 does indeed commit 8 bytes of data to memory atomically, even on
> 32-bit architectures.
>
> My question is related to the instruction we want to replace : how is it
> read by the CPU ? If it's 5 bytes in size, it has to be read in two chunks
> by the CPU on a 32-bit arch. Does the CPU lock the memory bus between those
> two reads ?

The 32-bit ISA has nothing to do with how the CPU fetches instructions
("32-bit" x86s usually have a much wider memory interface).

In general these things are done on cache lines of between 32 and 128 bytes,
depending on the CPU. Of course cache lines can be crossed by instructions,
but the CPU should handle that atomically.

However, there is no guarantee for that in the architecture afaik, so you
cannot really rely on it. If, let's say, the 386 had this behaviour then it
is probably safe to assume later x86s implement it too for compatibility
(modulo bugs).

In practice it's more complicated. The CPU fetches the instruction into
its pipeline some time before actually executing it, then sniffs the bus
for any modifications of it, and cancels and re-executes the instruction
if needed.

However, when you look at CPU errata sheets you will find quite a lot
of bugs in this area, so I would not really rely on frequent patching for
production.

I think just using the IPI is much simpler and easier.
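
For reference, the cmpxchg8 suggestion quoted above could look roughly like
the sketch below. It is an illustration only: the function name
patch_jmp_cmpxchg8() is made up, the 8-byte window is assumed to be
naturally aligned, and the GCC/Clang __atomic builtins are used in place of
hand-written cmpxchg8b (on 32-bit x86 the compiler may need a suitable
-march setting or libatomic for this). It shows only that the store is
atomic; it says nothing about how another CPU fetches the bytes, which is
the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Overwrite the first 5 bytes of the 8-byte window at *site with
 * "jmp rel32" (opcode 0xE9 followed by a 32-bit displacement), keeping
 * the trailing 3 bytes as they were. Returns false if the window changed
 * between the load and the compare-and-exchange. Assumes a little-endian
 * host, as on x86, so that memcpy of rel32 yields the encoded bytes. */
static bool patch_jmp_cmpxchg8(uint64_t *site, int32_t rel32)
{
    uint64_t old_word = __atomic_load_n(site, __ATOMIC_RELAXED);
    uint64_t new_word;
    uint8_t bytes[8];

    assert(((uintptr_t)site % 8) == 0);   /* window must be 8-byte aligned */

    memcpy(bytes, &old_word, sizeof bytes);
    bytes[0] = 0xE9;                          /* jmp rel32 opcode */
    memcpy(&bytes[1], &rel32, sizeof rel32);  /* displacement bytes */
    memcpy(&new_word, bytes, sizeof new_word);

    /* Single 8-byte compare-and-exchange of the whole window. */
    return __atomic_compare_exchange_n(site, &old_word, new_word, false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```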


> * the instruction to replace is < the jmp instruction (4 bytes or less)
>
> If our goal is to overwrite code which has not been surrounded by a
> marker, an IPI wouldn't save us here. The marker is necessary in order to
> disable interruptions and make the IPI meaningful.

You lost me here.


>
>
> > Actually there may be tricks possible to first int3 (or equivalent
> > single byte replacement on other archs) the second instruction,
> > then the first, then wait for an RCU period for all CPUs to quiesce and
> > then write the longer jump. But an IPI is probably easier because it
> > doesn't need a full disassembler for this and setting probes should not
> > be performance critical.
> >
>
> Well, in fact, there is still a problem. (oh no, not again!) ;)  RCU does
> require the reader to disable preemption, otherwise there is no guarantee
> that they won't be scheduled out in the middle of the critical section,
> and RCU only guarantees that a non-schedulable reader will have finished
> by the time the RCU period is over.
>
> How do you plan to disable involuntary preemption around the instructions
> you want to overwrite ?


One way would be to just search the task list for any tasks blocked with an
IP inside the patched region. If there are any, wait for another quiescent
period and check again.
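
A minimal sketch of that check, under stated assumptions: struct
task_snapshot and any_task_in_region() are hypothetical names, and an array
of saved instruction pointers stands in for walking the real task list. The
range test is deliberately conservative and treats an IP equal to the start
of the region as inside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-task state a kernel would inspect. */
struct task_snapshot {
    uintptr_t saved_ip;   /* instruction pointer the task will resume at */
};

/* Return true if any task would resume inside [start, start + len). */
static bool any_task_in_region(const struct task_snapshot *tasks,
                               size_t ntasks, uintptr_t start, size_t len)
{
    assert(len > 0);

    for (size_t i = 0; i < ntasks; i++) {
        uintptr_t ip = tasks[i].saved_ip;

        /* ip - start cannot wrap here because ip >= start is checked first. */
        if (ip >= start && ip - start < len)
            return true;
    }
    return false;
}
```

The caller would repeat "wait for a quiescent period, then call
any_task_in_region()" until it returns false, and only then write the
longer jump.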

-Andi

