From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-502458-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 26999 invoked by alias); 6 Jun 2019 09:45:25 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 26723 invoked by uid 89); 6 Jun 2019 09:45:25 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.1 spammy=
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 06 Jun 2019 09:45:23 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 48EDB341;	Thu,  6 Jun 2019 02:45:19 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.39])	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 9D8893F690;	Thu,  6 Jun 2019 02:45:18 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Mail-Followup-To: Richard Biener <richard.guenther@gmail.com>,GCC Patches <gcc-patches@gcc.gnu.org>, richard.sandiford@arm.com
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [0/3] Improve debug info for addressable vars
References: <mpto93htief.fsf@arm.com>	<CAFiYyc2gQQxTuWK2MecXbTpkMNtRe_UeNX30+-QgSn44Tdparw@mail.gmail.com>	<mptpnnsnlyj.fsf@arm.com>	<CAFiYyc0wB-OM5FMyJ72J+6u7_WrDpp02+=0J4S6GH9h=BZUwYA@mail.gmail.com>
Date: Thu, 06 Jun 2019 09:45:00 -0000
In-Reply-To: <CAFiYyc0wB-OM5FMyJ72J+6u7_WrDpp02+=0J4S6GH9h=BZUwYA@mail.gmail.com>	(Richard Biener's message of "Thu, 6 Jun 2019 09:41:01 +0200")
Message-ID: <mptblzbhwsy.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-SW-Source: 2019-06/txt/msg00325.txt.bz2

Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Jun 5, 2019 at 4:30 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Biener <richard.guenther@gmail.com> writes:
>> > On Sat, Jun 1, 2019 at 5:49 PM Richard Sandiford
>> > <richard.sandiford@arm.com> wrote:
>> >>
>> >> Taking the address of a variable stops us doing var-tracking on it,
>> >> so that we just use the DECL_RTL instead.  This can easily cause wrong
>> >> debug info for regions of code that would have had correct debug info
>> >> if the variable weren't addressable.  E.g.:
>> >>
>> >> {
>> >>   int base;
>> >>   get_start (&base);
>> >>   x[i1] = base;
>> >>   base += 1; // No need to store this
>> >>   x[i2] = base; // ...so the debug info for "base" is wrong here
>> >> }
>> >>
>> >> or (the motivating example):
>> >>
>> >> {
>> >>   int base;
>> >>   get_start (&base);
>> >>   for (int i = 0; i < n; ++i)
>> >>     {
>> >>       x[i] = base;
>> >>       base += y[i]; // Can apply LSM here, so the debug info for "base"
>> >>                     // in the loop is wrong
>> >>     }
>> >>   consume (&base);
>> >> }
>> >>
>> >> This patch series lets us use the DECL_RTL location for some parts of a
>> >> variable's lifetime and debug-bind locations for other parts:
>> >>
>> >> 1) Gimple uses "VAR s=> VAR" to bind VAR to its DECL_RTL.  The binding
>> >>    holds until overridden.
>> >>
>> >> 2) RTL does the same thing using:
>> >>
>> >>      (var_location VAR (decl_rtl_ref VAR))
>> >>
>> >>    where DECL_RTL_REF is a new rtx code that captures the DECL_RTL
>> >>    by reference rather than by value.
>> >>
>> >>    We can't just use "(var_location VAR (mem X))" for this, because
>> >>    that would bind VAR to the value that (mem X) has at that exact point.
>> >>    VAR would therefore get reset by any possible change to (mem X),
>> >>    whereas here we want it to track (possibly indirect) updates instead.
>> >>
>> >> 3) The gimplifier decides which variables should get the new treatment
>> >>    and emits "VAR s=> VAR" to mark the start of VAR's lifetime.
>> >>    Clobbers continue to mark the end of VAR's lifetime.
>> >>
>> >> 4) Stores to VAR implicitly reestablish the link between VAR and its
>> >>    DECL_RTL.  This is simpler (and IMO more robust) than inserting an
>> >>    explicit "VAR s=> VAR" at every write.
>> >>
>> >> 5) gsi_remove tries to insert "VAR => X" in place of a deleted "VAR = X",
>> >>    falling back to a "VAR => NULL" reset if that fails.
>> >>
>> >> Patch 1 handles the new rtl code, patch 2 adds the gimple framework,
>> >> and patch 3 uses it for LSM.
>> >
>> > So I wonder how it handles
>> >
>> > void __attribute__((noinline)) foo(int *p) { *p = 42; }
>> > int x;
>> > int main()
>> > {
>> >   int base = 1;
>> >   foo (&base);
>> >   base = 2;
>> >   *(x ? &x : &base) = 1; // (*)
>> >   return 0;
>> > }
>> >
>> > here we DSE the base = 2 store leaving a
>> >
>> > # DEBUG base = 2
>> >
>> > stmt?  But there's an indirect store that also stores
>> > to base - what will the debug info say at/after (*)?  Will
>> > it claim that base is 2?  At least I do not see that
>> > the connection with bases DECL_RTL is re-established?
>>
>> Yeah, true.
>>
>> > There's a clobber of base before return 0 so you eventually
>> > have to add some dummy stmt you can print base after
>> > the indirect store.
>> >
>> > That said, doesn't "aliasing" create another source of wrong-debug
>> > with your approach that might be even worse?
>>
>> Not sure about even worse, but maybe different.  In the example above
>> the patches fix the debug info after "base = 2" but break it after the
>> following statement.
>>
>> But there's no real need for the compiler to store to base in (*) either.
>
> Indeed partial dead code elim code sink the store into both arms and then
> remove the store in one of them.
>
>> We could end up with "if (...) x = 1;" instead.  So AFAICT there's no
>> guarantee that we'll get correct debug info at the return statement even
>> as things stand.
>>
>> For memory variables, I think we're always at the mercy of dead stores
>> being optimised away, and the patch isn't trying to fix that.
>
> Hmm, but you _do_ insert the debug stmts when we remove stores...

Right, but that's for examples like the ones in patch 2, where we have:

   base += 1;
   ... use base ...;

and base doesn't need to be stored back to memory after the addition.
The difference is that here the user didn't write dead code (because
the result of the addition is used), whereas in your example they did.

I.e. the reason for replacing a dead "VAR = X" with "# DEBUG VAR => X"
is that we should only be deleting the store if:

(a) all later uses of "VAR" can be/have been replaced by "X" or
(b) the calculation of "X" was never needed in the first place

(a) is something we should handle as well as we can, and I think
inserting the debug statement is the right thing to do there.
(b) seems like a lost cause.  E.g. for:

{
  int base;
  ...
  if (...)
    x = &base;
  *x = 1; // [A]
  *x = 2; // [B]
}

I don't think we're ever going to be able to make the store of 1 visible
at [B] in optimised code.

IMO it wouldn't be helpful to invalidate the debug info for all local
and global state that could alias "x" at [B] simply because [A] is
being deleted.  At least, I know I'd be frustrated if everything
suddenly became "optimised out" there. :-)

Since the original program never needed to do [A] in the first place,
I think it's OK if the effect of [A] isn't visible in the debug info.

>>  Since
>> both writes to base are dead in the above, I don't think we can guarantee
>> correct debug info without compromising optimisation for the sake of
>> debuggability.  (FWIW, I have a WIP patch to add an option for that,
>> hope to post an RFC soon.)
>>
>> I can't think of a case in which the patches introduce wrong debug
>> info for code that isn't dead.
>
> All the recent slew of wrong-debug bugs were exactly for dead code...

But dead user code?  Or dead gimple statements for necessary user code?

> I don't think that the difference of dead store vs dead SSA assignment
> is different from the user perception.  Now - I think that var-tracking
> could fix things up here once it sees aliasing stores.  The important
> difference to before is that now it suddenly deals with variables that
> are aliased while before that could not happen (besides spills/reloads).
> Can you quickly see what it would take to take this into account?
> For my testcase the debugger should say "optimized out" because
> we can't tell whether the store will actuall touch 'base', so refering
> to its memory location is wrong as well (since we elided the last store).

I think this will hurt cases without dead user code though, since
var-tracking can't be as exact as the gimple optimisers (no access
to PTA for one thing).

E.g. the gimple optimisers might be able to prove that: "*x" cannot
alias "base" in:

{
  base += ...;
  ... use base ...;
  *x += 1;
  ...
}

and so could delete the store to base.  But with the above, var-tracking
could needlessly invalidate the debug statement after "*x = ..." if it
doesn't have access to the same information.

Also, I think with that approach we'd often end up being conservatively
correct and resetting the debug info at the end of the block, since it
would likely be too expensive to track global availability for each
individual memory variable.

Thanks,
Richard