* Symbol name character restrictions
From: Joshua Saxby @ 2023-04-10 14:39 UTC (permalink / raw)
To: jit
Dear All,
I noticed that libgccjit currently restricts symbol names for generated
functions (and, I assume, all other symbols) to C's rules for identifiers,
that is, alphanumeric characters and underscores, with no leading digit.
From the source for gcc_jit_context_new_function() (
https://github.com/gcc-mirror/gcc/blob/725bcdeec60771cb9ee387978716028b64ea1b7f/gcc/jit/libgccjit.cc#L1173-L1177
):
/* The assembler can only handle certain names, so for now, enforce
C's rules for identifiers upon the name, using ISALPHA and ISALNUM
from safe-ctype.h to ignore the current locale.
Eventually we'll need some way to interact with e.g. C++ name
mangling. */
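For readers who want to see what such a check amounts to, here is a minimal sketch in C. It is not the libgccjit source: the function name is_c_identifier and the two helpers are hypothetical, and the helpers are hand-written ASCII stand-ins for the ISALPHA and ISALNUM macros that the comment mentions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, locale-independent stand-ins for ISALPHA / ISALNUM
   from safe-ctype.h (ASCII only). */
static bool is_alpha_ascii(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static bool is_alnum_ascii(char c)
{
    return is_alpha_ascii(c) || (c >= '0' && c <= '9');
}

/* Returns true if NAME follows C's basic identifier rules:
   first character is a letter or underscore, every later
   character is a letter, digit, or underscore. */
bool is_c_identifier(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return false;
    if (!is_alpha_ascii(name[0]) && name[0] != '_')
        return false;
    for (const char *p = name + 1; *p != '\0'; p++) {
        if (!is_alnum_ascii(*p) && *p != '_')
            return false;
    }
    return true;
}
```

Under this sketch a name such as "my_func1" is accepted, while "1abc" and "my-func" are rejected, which is the kind of restriction being asked about.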
I've seen suggestions elsewhere that some assemblers can handle symbol
names containing a wider variety of characters than these, but I have
struggled to find any documentation of the exact restrictions on symbol
naming in the assembler itself. I could assume they are identical to C's
symbol naming rules, but I like to be sure. Are there any pointers to where
I could find such a specification? Also, are there any plans to follow up on
the extension hinted at toward the end of that comment, regarding C++ name
mangling?
Best Regards,
*J.S.*
*My PGP Public Key Identity*
pub 4096R/*B7A947E4* 2016-11-16 [expires: 2025-12-31]
Key fingerprint = *E2C4 514F F0FA 52D1 896A B1D6 3D42 BFD9 B7A9 47E4*
uid Joshua Saxby <joshua.a.saxby+UMvLnvbsOxBHaeiCHvbdunpz@gmail.com>
uid Joshua Saxby (saxbophone) <joshua.a.saxby@gmail.com>
sub 4096R/0A445946 2016-11-16 [expires: 2025-12-31]
* Re: Symbol name character restrictions
From: Marc Nieper-Wißkirchen @ 2023-04-10 14:43 UTC (permalink / raw)
To: Joshua Saxby; +Cc: jit
According to the documentation of the GNU assembler at
https://sourceware.org/binutils/docs/as/Symbol-Intro.html, symbol names
enclosed in double quotes may contain any characters except for the NUL
character.
* Re: Symbol name character restrictions
From: Joshua Saxby @ 2023-04-10 14:51 UTC (permalink / raw)
To: Marc Nieper-Wißkirchen; +Cc: jit
Thanks for that info, Marc. I can't believe I missed it!
I had a feeling that assemblers and object files were pretty permissive
about symbol names, at least in principle.
This seems to contradict the comment from the libgccjit source that I
brought up earlier. I wonder what other technical limitations (if any)
there are on the character set that jit's symbols can support?
Thanks,
*J.S.*
* Re: Symbol name character restrictions
From: Joshua Saxby @ 2023-04-13 15:14 UTC (permalink / raw)
To: Marc Nieper-Wißkirchen; +Cc: jit
I've done some further digging, and it appears this feature was added to
GNU as about 8 years ago:
https://github.com/bminor/binutils-gdb/commit/d02603dc201f80cd9d2a1f4b1a16110b1e04222b
(commit: d02603d "Allow symbol and label names to be enclosed in double
quotes.")
I guess libgccjit predates this and the change wasn't propagated to it. I
think I will do some hacking on libgccjit to remove the check on symbol
names locally and see if it works.
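As a rough illustration of what the quoted form looks like from a code generator's side, here is a minimal sketch in C that wraps a name in double quotes for GNU as. It is an assumption-laden example, not part of libgccjit or binutils: the function name quote_symbol_name is hypothetical, and the only escaping it performs is the one the GNU as manual describes, a backslash before an embedded double quote. Handling of other characters (for example a literal backslash) is not covered here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes NAME wrapped in double quotes into OUT,
   placing a backslash before each embedded double quote.
   Returns the length of the quoted string (excluding the terminating
   NUL), or -1 if the arguments are invalid or OUT is too small. */
int quote_symbol_name(const char *name, char *out, size_t out_size)
{
    size_t n = 0;

    if (name == NULL || out == NULL)
        return -1;

    /* Each check leaves room for the byte being written plus the
       terminating NUL. */
    if (n + 1 >= out_size)
        return -1;
    out[n++] = '"';

    for (const char *p = name; *p != '\0'; p++) {
        if (*p == '"') {
            if (n + 1 >= out_size)
                return -1;
            out[n++] = '\\';
        }
        if (n + 1 >= out_size)
            return -1;
        out[n++] = *p;
    }

    if (n + 1 >= out_size)
        return -1;
    out[n++] = '"';
    out[n] = '\0';
    return (int)n;
}
```

With a sufficiently large buffer, the name my-func would be emitted as "my-func" (including the quotes), which is the form the commit above allows the assembler to accept.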
Cheers,
*J.S.*