public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Eduard-Mihai Burtescu" <eddyb@lyken.rs>
To: "Nikhil Benesch" <nikhil.benesch@gmail.com>
Cc: "Ian Lance Taylor" <iant@google.com>,
	"Ian Lance Taylor" <ian@airs.com>,
	gcc-patches <gcc-patches@gnu.org>
Subject: Re: [PATCH] Support the new ("v0") mangling scheme in rust-demangle.
Date: Sun, 01 Nov 2020 13:57:37 +0200	[thread overview]
Message-ID: <81a4b570-a43e-4dab-a678-452fa9805699@www.fastmail.com> (raw)
In-Reply-To: <a043b615-d422-78c4-0b59-52cc9b2af2dc@gmail.com>

Reading the diff patch, the v0 changes look great. I wouldn't be too worried
about the "printable character" aspect, there are similar Unicode-related
issues elsewhere, e.g. the "non-control ASCII" comment in decode_legacy_escape
(I suppose we could make it also go through the "print a non-control ASCII
character or some escape sequence" logic you added, if you think that helps).

However, I'm not sure about the legacy changes. Or rather, the .llvm. one, it's
not really Rust-specific, it's only in the rustc-demangle crate for convenience,
but C++ code compiled with Clang could run into the same problem - ideally,
stripping that suffix could be done uniformly in cplus-dem.c, but I decided
against making that change myself, for now.

I'm especially not comfortable removing the fast path, since that was the
condition under which I was able to make Rust demangling be attempted first,
before C++, in order to implement the Rust legacy demangling standalone,
separately from C++ demangling, so that it could be together with the v0 one.

It should be possible to keep the fast path if stripping .llvm.* suffixes is
done before either Rust or C++ demangling is attempted, but even if that would
be nice to have, IMO it should be a separate patch and not block v0 demangling.

As for the dataset, it doesn't include .llvm. because it's collected by rustc
itself, before LLVM had any chance to add any suffixes. This was done in order
to have several different mangling formats dumped at once for every symbol.
(It also contains symbols for e.g. functions that LLVM would've inlined away)

I can test the patch and upload the dataset tomorrow, but if you want to get
something committed sooner (is there a deadline for the next release?), feel
free to land the v0 changes (snprintf + const values) without the legacy ones.

Thanks,
- Eddy B.

On Sun, Nov 1, 2020, at 11:18, Nikhil Benesch wrote:
> 
> 
> On 10/29/20 12:16 AM, Nikhil Benesch wrote:
> > On Wed, Oct 28, 2020 at 7:53 PM Eduard-Mihai Burtescu <eddyb@lyken.rs> wrote:
> >> I agree that landing the extra changes later on top should be fine anyway,
> >> though when I make the sprintf -> snprintf change, I could provide the extra
> >> changes as well (either as a combined patch, or as a second tiny patch).
> >>
> >> Sadly I don't think I can get to either until next week, hope that's okay.
> > 
> > I can make the sprintf -> snprintf change as early as tomorrow.
> > 
> > I suspect I can also make the substantive const generics change, based
> > on the Rust implementation, though that's less of a promise.
> 
> Attached is an updated patch with both of the aforementioned changes. 
> The demangling of constants matches the demangling in rustc-demangle 
> library as best as I could manage. The strategy of demangling constant 
> chars via `format!("{:?}", char)` in Rust makes life hard for us in C, 
> because there is no easy way to replicate Rust's debug formatting for 
> chars. (Unless GCC has a Unicode library handy somewhere.)
> 
> In the course of this work I noticed several inconsistencies in how 
> rustc-demangle and libiberty were demangling some legacy Rust symbols. I 
> fixed those and added test cases, though the fixes required removing 
> some of the fast checks for whether a given symbol is a valid Rust symbol.
> 
> For ease of review, eddyb, I also attached the diff from your last diff 
> to mine. Since I don't have your collection of symbols I generated my 
> own by running nm on the materialized binary from
> https://github.com/MaterializeInc/materialize.
> 
> Let me know how I can help. I'm happy to address review feedback myself, 
> or I'm happy to hand things back over to you, eddyb.
> 
> Nikhil
> 
> Attachments:
> * rust-demangle-diff.patch
> * rust-demangle.patch

  reply	other threads:[~2020-11-01 11:58 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-13 20:28 Eduard-Mihai Burtescu
2020-03-23  1:34 ` Eduard-Mihai Burtescu
2020-10-28 21:22   ` Nikhil Benesch
2020-10-28 21:25     ` Nikhil Benesch
2020-10-28 21:47       ` Eduard-Mihai Burtescu
2020-10-28 22:21         ` Nikhil Benesch
2020-10-28 22:45           ` Ian Lance Taylor
2020-10-28 23:52             ` Eduard-Mihai Burtescu
2020-10-29  4:16               ` Nikhil Benesch
2020-11-01  9:18                 ` Nikhil Benesch
2020-11-01 11:57                   ` Eduard-Mihai Burtescu [this message]
2020-11-01 17:26                     ` Nikhil Benesch
2020-11-06 17:09                       ` Jeff Law
2020-11-13  6:42                         ` Nikhil Benesch
2020-11-13 13:34                           ` Eduard-Mihai Burtescu
2021-04-20 15:55                           ` Andreas Schwab
2021-04-21 11:24                             ` Jakub Jelinek
2020-04-06 21:52 ` Eduard-Mihai Burtescu
2020-04-13  2:52   ` Eduard-Mihai Burtescu
2020-04-23 18:52     ` Eduard-Mihai Burtescu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=81a4b570-a43e-4dab-a678-452fa9805699@www.fastmail.com \
    --to=eddyb@lyken.rs \
    --cc=gcc-patches@gnu.org \
    --cc=ian@airs.com \
    --cc=iant@google.com \
    --cc=nikhil.benesch@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).