From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 82387 invoked by alias); 11 Nov 2018 09:06:39 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 82373 invoked by uid 89); 11 Nov 2018 09:06:38 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-22.7 required=5.0 tests=AWL,BAYES_50,FREEMAIL_FROM,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=traced, descriptions, ptwrite, sk:handle_ X-HELO: mail-lj1-f182.google.com Received: from mail-lj1-f182.google.com (HELO mail-lj1-f182.google.com) (209.85.208.182) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sun, 11 Nov 2018 09:06:36 +0000 Received: by mail-lj1-f182.google.com with SMTP id v15-v6so5001016ljh.13 for ; Sun, 11 Nov 2018 01:06:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Lil2SdVU3SATCy7Ifaia6b1eTC3UQQaeKQuCFXTIuA8=; b=uHNkvsFljK482NaMVeho66/rGjCQ4hyXNspEHPtuKJe6u6jo5IKttk3tw1pRE5w2Ju b8sONiTSv76tsWd7CVJNcxpMN0agV3BegIvdbzWPrwa4AzDkjalKr8zxMN1TtSaWiKJ+ kji0JqjmZt5meaoVoDj2e7VLO2wA0S4zq6Nk6zDTbIR8Ozq9gIcqw+QEwneFSDesmHS7 9DMfVa24wGi13O18rf8xsmVZQv6AfJ8aQs4hoL89QJFUaOrW7GFE4DqMJjtf3Yx9Hnpn f1l0LNb0OVPrgKuG7J3xyJraCP9TRq5dWOy5q8AhjINNWjvdQJoaet49PEq0wR6OqisO kDUg== MIME-Version: 1.0 References: <20181104063235.6914-1-andi@firstfloor.org> <20181104063235.6914-2-andi@firstfloor.org> <20181109181837.GB6218@tassilo.jf.intel.com> In-Reply-To: <20181109181837.GB6218@tassilo.jf.intel.com> From: Richard Biener Date: Sun, 11 Nov 2018 09:06:00 -0000 Message-ID: Subject: Re: [PATCH 2/3] Add a pass to automatically add ptwrite instrumentation To: Andi Kleen Cc: Andi Kleen , GCC Patches Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2018-11/txt/msg00844.txt.bz2 On Fri, Nov 9, 2018 at 7:18 PM Andi Kleen wrote: > > Hi Richard, > > On Fri, Nov 09, 2018 at 04:27:22PM +0100, Richard Biener wrote: > > > Passes bootstrap and test suite on x86_64-linux, also > > > bootstrapped and tested gcc itself with full -fvartrace > > > and -fvartrace-locals instrumentation. > > > > So how is this supposed to be used? I guess in a > > edit-debug cycle and not for production code? > > It can actually be used for production code. > > When processor trace is disabled the PTWRITE > instructions acts as nops. So it's only increasing > the code foot print. Since the instrumentation > should only log values which are already computed > it normally doesn't cause any other code. > > Even when it is enabled the primary overhead is the > additional memory bandwidth, since the CPU can > do the logging in parallel to other code. As long > as the instrumentation is not too excessive to generate > too much memory bandwidth, it might be actually > quite reasonable to even keep the logging on for > production code, and use it as a "flight recorder", > which is dumped on failures. I see. > This would also be the model in gdb, once we have support > in it. You would run the program in the debugger > and it just logs the data to a memory buffer, > but when stopping the value history can be examined. Hmm, so the debugger still needs to relate the ptwrite instruction with the actual variable the data is for. I suppose practically this means that var-tracking needs to be able to compute a location list for a variable that happens to overlap with the stored value? That is, usually debuggers look for a location list of a variable and find, say, %rax. But for ptwrite the debugger needs to examine all active location lists for, say, %rax and figure out that it contains the value for variable 'a'? When there isn't any such relation between the ptwrite stored value and any variable the ptwrite is useless, right? > There's also some ongoing work to add (optional) support > for PT to Linux crash dumps, so eventually that will > work without having to always run the debugger. > > Today it can be done by running perf in the background > to record the PT, however there the setup is a bit > more complicated. > > The primary use case I was envisioning was to set > the attribute on some critical functions/structures/types > of interest and then have a very overhead logging > option for them (generally cheaper than > equivalent software instrumentation). And then > they automatically gets logged without the programmer > needing to add lots of instrumentation code to > catch every instance. So think of it as a > "hardware accelerated printf" > > > > > What do you actually write with PTWRITE? I suppose > > you need to keep a ID to something mapping somewhere > > so you can make sense of the perf records? > > PTWRITE writes 32bit/64bit values. The CPU reports the > IP of PTWRITE in the log, either explicitely, or implicitely if branch > trace is enabled too. The IP can then be used to look up > the DWARF scope for that IP. Then the decoder > decodes the operand of PTWRITE and maps it back using > the dwarf information. So it all works using > existing debugger infrastructure, and a quite simple > instruction decoder. > > I'll clarify that in the description. > > > > > Few comments inline below, but I'm not sure if this > > whole thing is interesting for GCC (as opposed to being > > a instrumentation plugin) > > I'm biased, but I think automatic data tracing is a very exciting > use case, so hopefully it can be considered for mainstream gcc. > > > > handle_no_profile_instrument_function_attribute, > > > NULL }, > > > @@ -767,6 +775,21 @@ handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int, > > > return NULL_TREE; > > > } > > > > > > +/* Handle "vartrace"/"no_vartrace" attributes; arguments as in > > > + struct attribute_spec.handler. */ > > > + > > > +static tree > > > +handle_vartrace_attribute (tree *node, tree, tree, int flags, > > > + bool *) > > > +{ > > > + if (TYPE_P (*node) && !(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) > > > + *node = build_variant_type_copy (*node); > > > > I don't think you want the attribute on types. As far as I understood your > > descriptions it should only be on variables and functions. > > The idea was that it's possible to trace all instances of a type, > especially structure members. Otherwise it will be harder for > the programmer to hunt down every instance. > > For example if I have a structure that is used over a program, > and one member gets the wrong value. > > I can do then in the header: > > struct foo { > int member __attribute__(("vartrace")); > }; > > and then recompile the program. Every instance of writing to > member will then be automatically instrumented (assuming > the program stays type safe) > > Makes sense? OK. The user documentation should be more elaborate here. > [BTW I considered adding an address trace > too for pointer writes to hunt down the non type safe > instances and possibly some other use cases. > That might be possible follow on work] > > > > + > > > #undef TARGET_GIMPLIFY_VA_ARG_EXPR > > > #define TARGET_GIMPLIFY_VA_ARG_EXPR ix86_gimplify_va_arg > > > > > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > > > index 1eca009e255..08286aa4591 100644 > > > --- a/gcc/doc/extend.texi > > > +++ b/gcc/doc/extend.texi > > > @@ -3193,6 +3193,13 @@ the standard C library can be guaranteed not to throw an exception > > > with the notable exceptions of @code{qsort} and @code{bsearch} that > > > take function pointer arguments. > > > > > > +@item no_vartrace > > > +@cindex @code{no_vartrace} function or variable attribute > > > +Disable data tracing for the function or variable or structured field > > > +marked with this attribute. Applies to types. Currently implemented > > > +for x86 when the @option{ptwrite} target option is enabled for systems > > > +that support the @code{PTWRITE} instruction. > > > > How does it apply to types? > > Same as it would apply to variables or functions. > > So when the whole file or the whole function is traced instances > of the marked type will not be traced. > > > > @@ -11933,6 +11933,10 @@ Address Sanitizer shadow memory address. NULL if Address Sanitizer is not > > > supported by the target. > > > @end deftypefn > > > > > > +@deftypefn {Target Hook} tree TARGET_VARTRACE_FUNC (tree @var{type}) > > > +Return a builtin to call to trace variables or NULL if not supported by the target. > > > > Please elaborate on the required signature of the builtin, its > > arguments and semantics. > > Is this really expected to be similar enough across architectures to make this a > > middle-end feature rather than a target specific isntrumentation thing > > in md-reorg or so? > > I'm not aware of any other architecture having a PTWRITE equivalent today, > but I would assume if one adds one it would look similar. There > are already other architectures that have a Processor Trace equivalent, > like ARM and MIPS. > > Yes could move it into config/i386, but I assumed the concept > was generic enough for the generic middle end?` > > > > @hook TARGET_ATOMIC_TEST_AND_SET_TRUEVAL > > > diff --git a/gcc/passes.def b/gcc/passes.def > > > index 24f212c8e31..518cb4ef6f7 100644 > > > --- a/gcc/passes.def > > > +++ b/gcc/passes.def > > > @@ -179,6 +179,7 @@ along with GCC; see the file COPYING3. If not see > > > NEXT_PASS (pass_oacc_device_lower); > > > NEXT_PASS (pass_omp_device_lower); > > > NEXT_PASS (pass_omp_target_link); > > > + NEXT_PASS (pass_vartrace); > > > > Wow, that's early. Reasoning for the placement before post-IPA optimizations? > > I was hoping my instrumentation would be optimized too, > and also do the instrumentation nearer the original > code. But yes perhaps it should be later and that might > help with the occasionally redundant PTWRITEs which > are generated today. > > Any suggestions where it should be? I would have picked a location right before RTL expansion. OTOH given the likely restriction with regarding to useful debug info suggested above a place between var-tracking and late debug generation seems best? That also avoids register allocation side-effects (the ptwrite is probably an UNSPEC?). There's conveniently the md-reorg pass in that area but if other targets can do sth similar then a common pass working might be good as well. I hope you don't mind if this eventually slips to GCC 10 given as you say there is no HW available right now. (still waiting for a CPU with CET ...) Thanks, Richard. > Thanks for the useful comments. I'll work on that and repost. > > -Andi >