From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-116112-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 3561 invoked by alias); 23 Sep 2014 18:11:40 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 3551 invoked by uid 89); 23 Sep 2014 18:11:39 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.2
X-HELO: relay1.mentorg.com
Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 23 Sep 2014 18:11:37 +0000
Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-01.mgc.mentorg.com)	by relay1.mentorg.com with esmtp 	id 1XWUYn-0001pW-5b from Maciej_Rozycki@mentor.com ; Tue, 23 Sep 2014 11:11:33 -0700
Received: from localhost (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server (TLS) id 14.3.181.6; Tue, 23 Sep 2014 19:11:31 +0100
Date: Tue, 23 Sep 2014 18:11:00 -0000
From: "Maciej W. Rozycki" <macro@codesourcery.com>
To: Pedro Alves <palves@redhat.com>
CC: <gdb-patches@sourceware.org>, Luis Machado <lgustavo@codesourcery.com>
Subject: Re: [PATCH] Avoid software breakpoint's instruction shadow inconsistency
In-Reply-To: <5421B1B3.7010106@redhat.com>
Message-ID: <alpine.DEB.1.10.1409231850520.27075@tp.orcam.me.uk>
References: <alpine.DEB.1.10.1409140447170.27075@tp.orcam.me.uk> <5421B1B3.7010106@redhat.com>
User-Agent: Alpine 1.10 (DEB 962 2008-03-14)
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
X-SW-Source: 2014-09/txt/msg00703.txt.bz2

On Tue, 23 Sep 2014, Pedro Alves wrote:

> >  This change:
> > 
> > commit b775012e845380ed4c7421a1b87caf7bfae39f5f
> > Author: Luis Machado <luisgpm@br.ibm.com>
> > Date:   Fri Feb 24 15:10:59 2012 +0000
> > 
> >     2012-02-24  Luis Machado  <lgustavo@codesourcery.com>
> > 
> >     	* remote.c (remote_supports_cond_breakpoints): New forward
> >     	declaration.
> > [...]
> > 
> > changed the way breakpoints are inserted and removed such that 
> > `insert_bp_location' can now be called with the breakpoint being handled 
> > already in place, while previously the call was only ever made for 
> > breakpoints that have not been put in place.  This in turn caused an issue 
> > for software breakpoints and targets for which a breakpoint's 
> > `placed_address' may not be the same as the original requested address.
> > 
> >  The issue is `insert_bp_location' overwrites the previously adjusted 
> > value in `placed_address' with the original address, that is only replaced 
> > back with the correct adjusted address later on when 
> > `gdbarch_breakpoint_from_pc' is called.  Meanwhile there's a window where 
> > the value in `placed_address' does not correspond to data stored in 
> > `shadow_contents', leading to incorrect instruction bytes being supplied 
> > when `one_breakpoint_xfer_memory' is called to supply the instruction 
> > overlaid by the breakpoint.
> > 
> >  And this is exactly what happens on the MIPS target with software 
> > breakpoints placed in microMIPS code.  There not only `placed_address' is 
> > not the original address because of the ISA bit, but also 
> > `mips_breakpoint_from_pc' has to read the original instruction to 
> > determine which one of the two software breakpoint instruction encodings 
> > to choose.  The 16-bit encoding is used to replace 16-bit instructions and 
> > similarly the 32-bit one is used with 32-bit instructions, to satisfy 
> > branch delay slot size requirements.
> > 
> >  The mismatch between `placed_address' and the address data in 
> > `shadow_contents' has been obtained from leads to the wrong encoding being 
> > used in some cases, which in the case of a 32-bit software breakpoint 
> > instruction replacing a 16-bit instruction causes corruption to the 
> > adjacent following instruction and leads the debug session astray if 
> > execution reaches there e.g. with a jump.
> > 
> >  To address this problem I propose the change below, that adds a 
> > `reqstd_address' field to `struct bp_target_info' and leaves 
> > `placed_address' unchanged once it has been set.  This ensures data in 
> > `shadow_contents' is always consistent with `placed_address'.
> > 
> >  This approach also has this good side effect that all the places that 
> > examine the breakpoint's address see a consistent value, either 
> > `reqstd_address' or `placed_address', as required.  Currently some places 
> > see either the original or the adjusted address in `placed_address', 
> > depending on whether they have been called before 
> > `gdbarch_remote_breakpoint_from_pc' or afterwards.  This is in particular 
> > true for subsequent calls to `gdbarch_remote_breakpoint_from_pc' itself, 
> > e.g. from `one_breakpoint_xfer_memory'.
> 
> ITYM gdbarch_breakpoint_from_pc, as there's no call to
> gdbarch_remote_breakpoint_from_pc in one_breakpoint_xfer_memory.

 Indeed, a cut & paste typo there, sorry.

> It's totally fine to call gdbarch_breakpoint_from_pc on an already
> adjusted address.  That method naturally has to be idempotent.  It'll
> just return without adjusting anything, as the input address must already
> be a fine address to place a breakpoint, otherwise the first call that
> actually adjusted the address when the breakpoint was first
> inserted wouldn't have returned it in the first place.

 The thing is it can't, because on MIPS targets the ISA bit can be the 
only place where the breakpoint type requirement is encoded -- if you set 
a breakpoint by address in code that has no symbol information, e.g.:

(gdb) break *0x87654321

is not the same as:

(gdb) break *0x87654320

and consequently if there's no symbol information available for either of 
0x87654321 or 0x87654320, then `mips_breakpoint_from_pc' will return a 
microMIPS (or MIPS16, as determined elsewhere) breakpoint for the value of 
0x87654321 in `placed_address' and a standard MIPS breakpoint for the 
value of 0x87654320 there.  So the value of 0x87654321 has to be stored 
somewhere and subsequent calls to `mips_breakpoint_from_pc' have to see it 
again (and convert to 0x87654320 in `placed_address').

 Please note that this is not a theoretical or corner case, because we can 
easily use breakpoints in code with no symbol information (e.g. 
system-installed shared libraries; dynamic symbols associated with 
exported entry points will likely not cover all the code) when 
single-stepping with a software watchpoint enabled.

> Might be worth it to add an assertion to the effect, just for
> code clarity.

 So not an option, sorry.

> > This is also important for places
> > like `find_single_step_breakpoint' where a breakpoint's address is 
> > compared to the raw value of $pc.
> > 
> 
> AFAICS, insert_single_step_breakpoint also doesn't do
> gdbarch_breakpoint_from_pc, so there doesn't seem to be a mismatch.

 The problem is `find_single_step_breakpoint' is called in the ordinary 
breakpoint removal path too, so that if a single-step breakpoint is placed 
at the same address, it is retained.  I saw this place do bad things in 
testing my change before I adjusted it to use `reqstd_address'.

> In any case, find_single_step_breakpoint and insert_single_step_breakpoint
> and related deprecated bits are scheduled for deletion
> very soon: https://sourceware.org/ml/gdb-patches/2014-09/msg00242.html

 The relevant parts of my change can easily be removed if they do not make 
it beforehand.

> Before adding more fields, I'd like to first consider something like this:
> 
> static int
> insert_bp_location (struct bp_location *bl,
> 		    struct ui_file *tmp_error_stream,
> 		    int *disabled_breaks,
> 		    int *hw_breakpoint_error,
> 		    int *hw_bp_error_explained_already)
> {
>   enum errors bp_err = GDB_NO_ERROR;
>   const char *bp_err_message = NULL;
>   volatile struct gdb_exception e;
> 
>   if (!should_be_inserted (bl) || (bl->inserted && !bl->needs_update))
>     return 0;
> 
>   /* Note we don't initialize bl->target_info, as that wipes out
>      the breakpoint location's shadow_contents if the breakpoint
>      is still inserted at that location.  This in turn breaks
>      target_read_memory which depends on these buffers when
>      a memory read is requested at the breakpoint location:
>      Once the target_info has been wiped, we fail to see that
>      we have a breakpoint inserted at that address and thus
>      read the breakpoint instead of returning the data saved in
>      the breakpoint location's shadow contents.  */
> -  bl->target_info.placed_address = bl->address;
> -  bl->target_info.placed_address_space = bl->pspace->aspace;
> -  bl->target_info.length = bl->length;
> + if (!bl->inserted)
> + {
> +  bl->target_info.placed_address = bl->address;
> +  bl->target_info.placed_address_space = bl->pspace->aspace;
> +  bl->target_info.length = bl->length;
> + }
> 
> Doesn't what work?  Note the comment above already talking about
> related concerns.  If we're reinserting a breakpoint to update
> its conditions, we're certainly inserting it at the exact
> same address, so no need to touch placed_address at all.

 That's actually what I tried first (being lazy and trying to save myself 
from extra work) before I realised that wouldn't work due to the issue 
above.

  Maciej