From: Nick Alcock
To: Jim Wilson
Cc: Nick Clifton, Binutils
Subject: Re: [PATCH 01/19] include: new header ctf.h: file format description
References: <20190430225706.159422-1-nick.alcock@oracle.com> <20190430225706.159422-2-nick.alcock@oracle.com>
Date: Fri, 03 May 2019 11:15:00 -0000
In-Reply-To: (Jim Wilson's message of "Wed, 1 May 2019 14:29:25 -0700")
Message-ID: <87ef5fwzzz.fsf@esperi.org.uk>

[Sorry about the response delay: two-day family thing.]

On 1 May 2019, Jim Wilson spake thusly:

> On Wed, May 1, 2019 at 9:57 AM Nick Clifton wrote:
>> > +/* CTF format description.
>> > +   Copyright (C) 2004-2019 Free Software Foundation, Inc.
>>
>> Copyright starting from 2004, really ?
>
> Looks like CTF is part of dtrace which Oracle inherited from Sun.
> Wikipedia tells me that the first release of dtrace was in Jan 2005,
> so a 2004 copyright looks right if this is the original sources from
> Sun subsequently modified by Oracle.

Exactly.

>> > +/* CTF - Compact ANSI-C Type Format
>> ANSI-C ? Isn't everyone using ISO-C these days ?
>
> I was going to say the same thing.

Historical naming wart. I'm happy to adjust it. (The original headers
were inconsistent here and sometimes said ANSI-C and sometimes just C
and sometimes just 'Compact' with no language at all! Only the last is
definitely wrong.)
>> Also - does this format explicitly exclude other languages like C++ or Go or Rust ?
>
> Apparently doesn't explicitly exclude them, it just doesn't explicitly
> include them, and with only 64 possible type classes, it looks like
> you could run out without some clever encoding for other languages.

I had some ideas half an hour ago which should allow substantially more
format flexibility without making the libctf codebase horrifically
unreadable (in fact it should increase the readability of the codebase
by dropping most of the casts in it): this would let us have not only an
even more compact version of the ctf_stype_t for common C cases, but
also a longer ctt_info word for non-C cases with, oh, is 2^32 type
classes enough, or should I go to 2^64? ;) There will obviously be a
slight cost in space, but not a large one.

At this point I am mostly worried about the complexity of speccing
things like C++ out. I'm fairly sure the format can expand to handle
them in future (without breaking existing users) but I'm not so sure my
brain can!

A bigger question where multi-language support is concerned is whether
we need to handle more than one language in a given hierarchy of CTF
sections: in effect, allowing for multi-language translation units. This
would mean we could deduplicate types together for different languages,
but I doubt this would be useful for many language pairs (which would
have largely distinct language-specific type kinds). It would increase
compactness a bit more to say "dammit, if you have two languages in your
project you should have two CTF section hierarchies", and come up with
names like .ctf.cpp and .ctf.rust or something for the other languages.

If we might handle additional languages in a one-language-per-container
fashion, we might want to reserve a word in the header to indicate
language even though we don't plan to add any other languages yet, just
to make it possible to add them in future without another
backward-compatibility break.
>> > +#define CTF_VERSION_3 4
>> > +#define CTF_VERSION CTF_VERSION_3 /* Current version.  */
>>
>> Hang on - so the value of CTF_VERSION_3 is 4 ? Does this mean that the
>> full version number is 3.4, or 4.0 or just 4 ? I am a bit confused...
>
> Looks like there was a version 1+ which took number 2.
> https://github.com/oracle/libdtrace-ctf/blob/master/include/sys/ctf.h#L149

The history is... complicated, and all my fault, I'm afraid.

When we took libctf into the DTrace for Linux project, it was already at
v2: v1 then was an ancient Sun-era thing which had literally nothing but
the version number surviving in the codebase, much like you see above. I
reset it to v1, but after a few years its limitations became fairly
extreme: it only allowed 2^16 types in one program, only 998 members in
any one structure or union, we were running out of type kinds, etc.

So I introduced a v3... but v3 boosted the set of types to 2^32, thus
changed the boundary between parent and child type IDs, since
type-parenthood is indicated by the most significant bit in the type ID.
We upgrade old formats to new ones in memory aggressively at open time
to avoid duplicating codepaths for old formats, so this change in
parent/child boundary would have required the backward-compatibility
code to *renumber all the types* at the same time.

This seemed excessive, given that CTF containers are read-only after
creation, so an upgraded container couldn't ever have enough types in it
for that renumbering to be necessary: but we needed to note the fact
that the parent/child boundary was lower in some persistent form, in
case the user opened an old container (upgrading it in the process),
then wrote it back out again: we had to preserve the knowledge that this
had once been a v1 container, with a v1 parent/child boundary,
*somewhere*.
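
To make the renumbering problem concrete, here is a minimal sketch in C.
It assumes only what is stated above: the most significant bit of the
type ID marks a child type, and the ID space grew from 2^16 to 2^32. The
names sketch_type_is_child and sketch_max_parent_id are hypothetical and
are not libctf functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not taken from ctf.h.  */
#define SKETCH_ID_BITS_NARROW 16	/* 2^16 type IDs (old format).  */
#define SKETCH_ID_BITS_WIDE   32	/* 2^32 type IDs (new format).  */

/* A type ID refers to a child container's type if the most significant
   bit of the ID space is set.  */
static bool
sketch_type_is_child (uint32_t id, unsigned int id_bits)
{
  return (id >> (id_bits - 1)) & 1u;
}

/* Highest ID a parent container can use: everything below the top bit.  */
static uint32_t
sketch_max_parent_id (unsigned int id_bits)
{
  return (UINT32_C (1) << (id_bits - 1)) - 1;
}
```

Under these assumptions, ID 0x8001 is a child type in the 16-bit layout
but a parent type in the 32-bit layout, so reinterpreting old IDs with
the new boundary would change their meaning unless every type were
renumbered.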
So as a backward-compatibility hack I decreed that v1 when upgraded to
v2 would gain the CTF_VERSION_1_UPGRADED_3 version number, which was
interpreted as 'just like v2, except the parent/child boundary is like
v1'. If I'd been starting from scratch, a family of feature flags or
something might have been neater... but this works and the maintenance
burden is minimal (one conditional to note the existence of
CTF_VERSION_1_UPGRADED_3 and set the parent/child boundary
appropriately).

> I don't have any expertise with CTF, I was just curious, so did a
> little looking around for more info and found the version number
> encoding. I also found a FreeBSD man page which has some useful intro
> data.
> https://www.freebsd.org/cgi/man.cgi?query=ctf&sektion=5&manpath=freebsd-release-ports

Yep, that's the old v1 format all right (Sun format v2). Too small for
some real projects, even in the presence of aggressive deduplication.
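
Returning to the CTF_VERSION_1_UPGRADED_3 hack: the one conditional it
needs might look roughly like the following. This is a sketch under
stated assumptions, not the libctf code. sketch_parent_child_boundary is
a hypothetical helper; CTF_VERSION_3 is 4 as in the patch;
CTF_VERSION_1_UPGRADED_3 is taken to be 2, as the "version 1+" remark
suggests; the other two values are assumed to fill the gaps. The
boundary values follow from the 2^16 and 2^32 ID spaces with the most
significant bit marking child types.

```c
#include <stdint.h>

/* Only CTF_VERSION_3 == 4 is stated in the patch.  The other values are
   assumptions made for this sketch.  */
#define CTF_VERSION_1 1
#define CTF_VERSION_1_UPGRADED_3 2
#define CTF_VERSION_2 3
#define CTF_VERSION_3 4

/* Hypothetical helper, not part of libctf: return the first type ID
   that belongs to a child container for a given format version.  */
static uint32_t
sketch_parent_child_boundary (int version)
{
  /* A container that is, or once was, v1 keeps the 16-bit boundary even
     after being upgraded in memory, so no types need renumbering.  */
  if (version == CTF_VERSION_1 || version == CTF_VERSION_1_UPGRADED_3)
    return UINT32_C (1) << 15;

  /* Everything newer uses the top bit of a 32-bit type ID.  */
  return UINT32_C (1) << 31;
}
```

Writing an upgraded container back out with CTF_VERSION_1_UPGRADED_3 in
its header is what lets a later reader take the first branch again.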