From mboxrd@z Thu Jan 1 00:00:00 1970
Message-Id: <5b524f65-6bfd-44cf-bea5-9f19e1ced18f@www.fastmail.com>
References: <20211202171713.15454-1-mark@klomp.org> <003e2dd7-5efa-4e81-a578-1d031ab5eee5@www.fastmail.com>
Date: Mon, 20 Dec 2021 13:50:52 +0200
From: "Eduard-Mihai Burtescu"
To: "Mark Wielaard"
Cc: gcc-patches@gcc.gnu.org, "Nick Nethercote"
Subject: Re: [PATCH] libiberty rust-demangle, ignore .suffix
Content-Type: text/plain
List-Id: Gcc-patches mailing list

Apologies for the delay, the email fell through the cracks somehow.

The updated patch looks like it would work alright, only needs a couple tests, e.g.:
https://github.com/rust-lang/rustc-demangle/blob/2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/src/lib.rs#L413-L422
https://github.com/rust-lang/rustc-demangle/blob/2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/src/v0.rs#L1442-L1444

Thanks,
- Eddy B.

On Tue, Dec 7, 2021, at 21:16, Mark Wielaard wrote:
> Hi Eddy,
>
> On Fri, 2021-12-03 at 01:14 +0200, Eduard-Mihai Burtescu wrote:
>> On Fri, Dec 3, 2021, at 00:07, Mark Wielaard wrote:
>> > On Thu, Dec 02, 2021 at 07:35:17PM +0200, Eduard-Mihai Burtescu wrote:
>> > > That also means that for consistency, suffixes like these should be
>> > > handled uniformly for both v0 and legacy (as rustc-demangle does),
>> > > since LLVM doesn't distinguish.
>> >
>> > The problem with the legacy mangling is that dot '.' is a valid
>> > character. That is why the patch only handles the v0 mangling case
>> > (where dot '.' isn't valid).
>>
>> Thought so, that's an annoying complication - however, see later down
>> why that's still not a blocker to the way rustc-demangle handles it.
>>
>> > > You may even be able to get Clang to generate C++ mangled symbols
>> > > with ".llvm." suffixes, with enough application of LTO. This is not
>> > > unlike GCC ".clone" suffixes, AFAIK. Sadly I don't think there's a
>> > > way to handle both as "outside the symbol", without hardcoding
>> > > ".llvm." in the implementation.
>> >
>> > We could use the scheme used by c++ where the .suffix is added as
>> > " [clone .suffix]", it even handles multiple dots, where something like
>> >   _Z3fooi.part.9.165493.constprop.775.31805
>> > demangles to
>> >   foo(int) [clone .part.9.165493] [clone .constprop.775.31805]
>> >
>> > I just don't think that is very useful and a little confusing.
>>
>> Calling it "clone" is a bit weird, but I just checked what rustc-demangle
>> does for printing suffixes back out and it's not great either:
>> - ".llvm." (and everything after it) is completely removed
>> - any left-over suffixes (after demangling), if they start with ".", are
>>   not considered errors, but printed out verbatim after the demangling
>>
>> > > I don't recall the libiberty demangling API having any provisions
>> > > for the demangler deciding that a mangled symbol "stops early",
>> > > which would maybe allow for a more general solution.
>> >
>> > No, there indeed is no interface. We might introduce a new option flag
>> > for treating '.' as end of symbol. But do we really need that
>> > flexibility?
>>
>> That's not what I meant - a v0 or legacy symbol is self-terminating in
>> its parsing (or at the very least there are not dots allowed outside
>> of a length-prefixed identifier), so that when you see the start of
>> a valid mangled symbol, you can always find its end in the string,
>> even when that end is half-way through (and is followed by suffixes
>> or any other unrelated noise).
>>
>> What I was imagining is a way to return to the caller the number of
>> chars from the start of the original string, that were demangled,
>> letting the caller do something else with the rest of that string.
>> (see below for how rustc-demangle already does something similar)
>>
>> > > Despite all that, if it helps in practice, I would still not mind
>> > > this patch landing in its current form, I just wanted to share my
>> > > perspective on the larger issue.
>> >
>> > Thanks for that. Do you happen to know what other rust demanglers do?
>>
>> rustc-demangle's internal API returns a pair of the demangler and the
>> "leftover" parts of the original string, after the end of the symbol.
>> You can see here how that suffix is further checked, and kept:
>> https://github.com/rust-lang/rustc-demangle/blob/2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/src/lib.rs#L108-L138
>
> Yes, returning a struct that returns the style detected, the demangled
> string and the left over chars makes sense. But we don't have an
> interface like that at the moment, and I am not sure we (currently)
> have users who want this.
>
>> As mentioned above, ".llvm." is handled differently, just above the snippet
>> linked - perhaps it was deemed too common to let it pollute the output.
>
> But that also makes it a slightly odd interface. I would imagine that
> people would be interested in the .llvm. part. Now that just gets
> dropped.
>
> Since we don't have an interface to return the suffix (and I find the
> choice of dropping .llvm. but not other suffixes odd), I think we
> should just simply always drop the .suffix. I understand now how to do
> that for legacy symbols too, thanks for the hints.
>
> See the attached update to the patch. What do you think?
>
> Thanks,
>
> Mark
>
> Attachments:
> * 0001-libiberty-rust-demangle-ignore-.suffix.patch
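
For illustration only, here is a rough sketch of the two suffix-handling
ideas discussed in this thread: splitting a v0 symbol at its first '.',
and dropping ".llvm." together with everything after it. Both function
names are hypothetical; neither is taken from libiberty or rustc-demangle,
and the real implementations may check more than this. The first helper
relies on '.' never appearing inside a v0 symbol, so it is not suitable
for legacy symbols, where a dot can be part of the mangling.

```rust
/// Hypothetical helper: split a v0-mangled symbol into the symbol proper
/// and any trailing ".suffix". Assumes '.' is never valid inside a v0
/// symbol, so the first dot marks the end of the symbol. This assumption
/// does NOT hold for legacy symbols.
fn split_v0_suffix(mangled: &str) -> (&str, &str) {
    match mangled.find('.') {
        // Everything before the first dot is the symbol, the rest is suffix.
        Some(i) => mangled.split_at(i),
        // No dot at all: the whole string is the symbol.
        None => (mangled, ""),
    }
}

/// Hypothetical helper: drop ".llvm." and everything after it, which is
/// the behaviour described above for that particular suffix. Simplified:
/// no validation of what follows ".llvm." is attempted here.
fn strip_llvm_suffix(mangled: &str) -> &str {
    match mangled.find(".llvm.") {
        Some(i) => &mangled[..i],
        None => mangled,
    }
}
```

Returning the leftover part from the first helper is roughly the "pair of
symbol and leftover" shape mentioned above; a caller that wants to always
drop the suffix can simply ignore the second element.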
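
As a point of comparison, the C++-style " [clone .suffix]" rendering quoted
above could be sketched as follows. This is a guess at the grouping rule
based only on the quoted example (a '.' plus lowercase letters or '_',
followed by any number of ".digits" parts); the function name is
hypothetical and the real C++ demangler may differ in details. The
already-demangled name is passed in by the caller, no demangling happens
here.

```rust
/// Hypothetical helper: append trailing ".suffix" groups to a demangled
/// name in the " [clone .suffix]" style. The grouping rule is an
/// assumption modelled on the example quoted in this thread.
/// Anything that does not fit the rule is appended verbatim.
fn append_clone_suffixes(demangled: &str, suffix: &str) -> String {
    let bytes = suffix.as_bytes();
    let mut out = String::from(demangled);
    let mut i = 0;
    while i < bytes.len() {
        // A group must start with '.' followed by at least one [a-z_].
        if bytes[i] != b'.' {
            break;
        }
        let mut j = i + 1;
        while j < bytes.len() && (bytes[j].is_ascii_lowercase() || bytes[j] == b'_') {
            j += 1;
        }
        if j == i + 1 {
            break;
        }
        // Absorb any following ".<digits>" parts into the same group.
        loop {
            if j < bytes.len() && bytes[j] == b'.' {
                let mut k = j + 1;
                while k < bytes.len() && bytes[k].is_ascii_digit() {
                    k += 1;
                }
                if k > j + 1 {
                    j = k;
                    continue;
                }
            }
            break;
        }
        out.push_str(" [clone ");
        out.push_str(&suffix[i..j]);
        out.push(']');
        i = j;
    }
    // Whatever could not be parsed as a group is kept as-is.
    out.push_str(&suffix[i..]);
    out
}
```

With the quoted example, passing "foo(int)" and
".part.9.165493.constprop.775.31805" produces two clone groups, one per
named part.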