To: Kawa mailing list
From: Per Bothner
Subject: generated class names now "mangled" differently
Message-ID: <560C60C3.2080701@bothner.com>
Date: Wed, 30 Sep 2015 22:23:00 -0000

Summary: The way Kawa generates class names has changed, so fewer characters are "mangled" (encoded) when dealing with disallowed characters.

The Kawa compiler generates classes that execute on the JVM.
Generated classes include those defined by define-simple-class and define-class; ones generated by a define-library; and the "module class" generated for each source file. Each class has to have a name, and there are certain restrictions on the characters in the name; the names come from Scheme symbols and lists (for define-library), or are generated from a source file name. These sources have almost no restrictions on allowed characters. So we have to encode (or "mangle") disallowed characters in source names to ones allowed in class names.

Until now, disallowed characters were converted to a 3-character string starting with '$'. The reason for using '$' is that it is valid in Java identifiers, so you could usually refer to the generated classes via ugly but valid Java identifiers. (There was an exception for reserved words, such as |package|.)

However, the benefit of generating Java identifiers is minor (you could always use reflection), and there are two main problems:

(1) The generated class names are ugly; unnecessarily so in the case of characters that are allowed by the JVM but not allowed in Java.

(2) If we mangle valid JVM characters unnecessarily, we're more likely to get inconsistencies between source file names and class names. This is especially bad when it comes to package names, since the classes might then end up in an unexpected directory.

So I've decided to "mangle less" - i.e. to mangle only those characters that are disallowed in class names, not all those disallowed in Java names. And since I changed the mangling, I decided to switch to the one proposed by John Rose:

https://blogs.oracle.com/jrose/entry/symbolic_freedom_in_the_vm

I believe (but haven't had it confirmed) that other languages and tools use this mangling, so it made sense to choose it.

This change results in a binary incompatibility, so you need to re-compile everything. However, you shouldn't need to change the source, unless you did something unusual.

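To give a feel for the new style, here is a rough Python sketch of the escaping rules as I read them in the linked post. It is an illustration only, not Kawa's code: the helper name mangle and the two tables are made up for this example, the empty-name case is ignored, and Kawa may well apply only part of the scheme to class and package names.

```python
# Sketch of the mangling described in John Rose's post; not Kawa's code.
# Dangerous character -> replacement character (emitted after a backslash).
DANGEROUS = {
    "/": "|", ".": ",", ";": "?", "$": "%",
    "<": "^", ">": "_", "[": "{", "]": "}", ":": "!",
}
# Characters that form an escape sequence when they follow a backslash.
REPLACEMENTS = set(DANGEROUS.values()) | {"-"}


def mangle(name):
    """Hypothetical helper: mangle a single name component."""
    out = []
    changed = False
    for i, ch in enumerate(name):
        nxt = name[i + 1:i + 2]  # next character, or "" at the end
        if ch == "\\" and (nxt in REPLACEMENTS or (i == 0 and nxt == "=")):
            # An "accidental escape" already present in the source name:
            # protect its backslash so demangling does not misread it.
            out.append("\\-")
            changed = True
        elif ch in DANGEROUS:
            out.append("\\" + DANGEROUS[ch])
            changed = True
        else:
            out.append(ch)
    result = "".join(out)
    # A changed name gets the null prefix unless it already starts
    # with a backslash.
    if changed and not result.startswith("\\"):
        result = "\\=" + result
    return result


print(mangle("foo-bar"))       # foo-bar
print(mangle("list->vector"))  # \=list-\_vector
print(mangle("$tmp"))          # \%tmp
```

A name with no dangerous characters comes through unchanged, which is the point of "mangling less": a name such as foo-bar can be used as-is instead of being rewritten with '$' sequences.
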
Note this change only affects class and package names, as well as .class files. It does not change how variable and procedure names are mapped to field and method names.

If it seems to make sense we might change field name mangling in the future. Method names are unlikely to change, because one of Kawa's convenience features is the equivalence between foo-bar-baz and fooBarBaz - or getFooBarBaz in the case of properties.

-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/