public inbox for libc-alpha@sourceware.org
* Using localedef to define custom charmaps
@ 2018-06-07 15:31 Florian Weimer
  2018-06-11 16:49 ` Carlos O'Donell
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2018-06-07 15:31 UTC
  To: GNU C Library

Is it currently possible to use localedef to define custom charmaps?

The command line interface suggests this possibility.

But I don't see anything storing the tables in locale data, and it seems 
that setting a locale in glibc always loads the corresponding 
gconv/iconv module.

If that's really true, then why do we have charmap files?  Only some of 
them are used for generating iconv/gconv modules.  Is it about POSIX 
conformance at the command line interface level?

Thanks,
Florian


* Re: Using localedef to define custom charmaps
  2018-06-07 15:31 Using localedef to define custom charmaps Florian Weimer
@ 2018-06-11 16:49 ` Carlos O'Donell
  2018-06-11 16:51   ` Florian Weimer
  0 siblings, 1 reply; 8+ messages in thread
From: Carlos O'Donell @ 2018-06-11 16:49 UTC
  To: Florian Weimer, GNU C Library

On 06/07/2018 11:31 AM, Florian Weimer wrote:
> Is it currently possible to use localedef to define custom charmaps?

Sure, just write one and put it into the default install path.
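
For readers unfamiliar with the file format, here is a minimal sketch
of what "writing one" could look like for a single-byte code set. It
is written in Python purely for illustration; the function name
make_charmap, the code set name X-DEMO, and the two sample entries are
hypothetical and do not come from glibc. The header keywords and the
"<Uxxxx> /xNN description" line layout follow the layout used by
charmap files, but treat the exact output as an assumption to verify
against an existing charmap.

```python
def make_charmap(code_set_name, mapping):
    """Return charmap source text for a single-byte code set.

    mapping: dict of byte value (0-255) -> (Unicode code point, description).
    Hypothetical helper for illustration only.
    """
    lines = [
        f"<code_set_name> {code_set_name}",
        "<comment_char> %",
        "<escape_char> /",
        "<mb_cur_min> 1",
        "<mb_cur_max> 1",
        "CHARMAP",
    ]
    # One line per byte value, sorted so the output is deterministic.
    for byte, (code_point, description) in sorted(mapping.items()):
        lines.append(f"<U{code_point:04X}> /x{byte:02x} {description}")
    lines.append("END CHARMAP")
    return "\n".join(lines) + "\n"


# Two sample entries; a usable charmap would cover the whole code set.
text = make_charmap(
    "X-DEMO",
    {
        0x41: (0x0041, "LATIN CAPITAL LETTER A"),
        0xA4: (0x20AC, "EURO SIGN"),
    },
)
print(text)
```

The resulting file would then be named with localedef's -f option
(for example "localedef -f X-DEMO -i <locale source> <output name>").
Whether the resulting locale is usable at run time without a matching
gconv module is exactly the open question in this thread.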

> The command line interface suggests this possibility.

Yes.

> But I don't see anything storing the tables in locale data, and it
> seems that setting a locale in glibc always loads the corresponding
> gconv/iconv module.
> 
> If that's really true, then why do we have charmap files?  Only some
> of them are used for generating iconv/gconv modules.  Is it about
> POSIX conformance at the command line interface level?

Well, we need the charmap file to translate from the internal
representation to the character set we are using?

However, once you've done the work of using the charmap to build
the output locale data ... why do you need it anymore?

Conversions will be handled by gconv converters after that.

Cheers,
Carlos.


* Re: Using localedef to define custom charmaps
  2018-06-11 16:49 ` Carlos O'Donell
@ 2018-06-11 16:51   ` Florian Weimer
  2018-06-11 17:53     ` Carlos O'Donell
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2018-06-11 16:51 UTC
  To: Carlos O'Donell, GNU C Library

On 06/11/2018 06:49 PM, Carlos O'Donell wrote:

> However, once you've done the work of using the charmap to build
> the output locale data ... why do you need it anymore?
> 
> Conversions will be handled by gconv converters after that.

What if there's no gconv converter for my charmap?

Thanks,
Florian


* Re: Using localedef to define custom charmaps
  2018-06-11 16:51   ` Florian Weimer
@ 2018-06-11 17:53     ` Carlos O'Donell
  2018-06-11 18:01       ` Florian Weimer
  0 siblings, 1 reply; 8+ messages in thread
From: Carlos O'Donell @ 2018-06-11 17:53 UTC
  To: Florian Weimer, GNU C Library

On 06/11/2018 12:51 PM, Florian Weimer wrote:
> On 06/11/2018 06:49 PM, Carlos O'Donell wrote:
> 
>> However, once you've done the work of using the charmap to build
>> the output locale data ... why do you need it anymore?
>>
>> Conversions will be handled by gconv converters after that.
> 
> What if there's no gconv converter for my charmap?

Are we talking in the abstract here?

We should write converters for the character maps we support.

It would be great to have "symmetric" support in this way for
all the character maps.

I imagine some simple character maps could be automatically
driven by the existing data.

Cheers,
Carlos.


* Re: Using localedef to define custom charmaps
  2018-06-11 17:53     ` Carlos O'Donell
@ 2018-06-11 18:01       ` Florian Weimer
  2018-06-11 19:36         ` Carlos O'Donell
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2018-06-11 18:01 UTC
  To: Carlos O'Donell, GNU C Library

On 06/11/2018 07:53 PM, Carlos O'Donell wrote:
> On 06/11/2018 12:51 PM, Florian Weimer wrote:
>> On 06/11/2018 06:49 PM, Carlos O'Donell wrote:
>>
>>> However, once you've done the work of using the charmap to build
>>> the output locale data ... why do you need it anymore?
>>>
>>> Conversions will be handled by gconv converters after that.
>>
>> What if there's no gconv converter for my charmap?
> 
> Are we talking in the abstract here?

Not quite.  I suggested a custom charmap as a way to customize 
transliteration for a character which is present in the destination 
character set:

   https://sourceware.org/bugzilla/show_bug.cgi?id=23076

If there are no custom charmaps, then obviously this will not work.

> I imagine some simple character maps could be automatically
> driven by the existing data.

We auto-generate gconv modules from charmaps for most (if not all) 8-bit 
character sets.  This is a bit unfortunate because the code is 
identical, and the overhead of loading the data via dlopen is quite 
large (mostly in terms of private RSS consumption).
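
To illustrate why the code is identical: for a single-byte character
set, conversion to Unicode is one table lookup per byte, and only the
table differs between charsets. The following is a rough sketch in
Python, not the glibc implementation. The names parse_8bit_charmap,
decode, and SAMPLE are hypothetical, and the sample charmap text is
made up for the example.

```python
import re

# Matches single-byte entries such as "<U0041> /x41 LATIN CAPITAL LETTER A".
# Multi-byte entries and range entries do not match and are skipped.
_ENTRY = re.compile(r"^<U([0-9A-Fa-f]{4,8})>\s+/x([0-9A-Fa-f]{2})(?:\s|$)")

# Made-up charmap text, for illustration only.
SAMPLE = """\
<code_set_name> X-DEMO
<comment_char> %
<escape_char> /
% a comment line
CHARMAP
<U0041> /x41 LATIN CAPITAL LETTER A
<U20AC> /xa4 EURO SIGN
END CHARMAP
"""


def parse_8bit_charmap(text):
    """Return a dict mapping byte value -> Unicode code point."""
    table = {}
    in_map = False
    for line in text.splitlines():
        line = line.strip()
        if line == "CHARMAP":
            in_map = True
        elif line == "END CHARMAP":
            break
        elif in_map:
            match = _ENTRY.match(line)
            if match:
                table[int(match.group(2), 16)] = int(match.group(1), 16)
    return table


def decode(data, table):
    """Decode bytes with the table; raises KeyError on an unmapped byte."""
    return "".join(chr(table[byte]) for byte in data)


table = parse_8bit_charmap(SAMPLE)
print(decode(b"A\xa4", table))
```

The decode function never changes from one 8-bit charset to the next;
only the table passed to it does, which is the redundancy described
above.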

Thanks,
Florian


* Re: Using localedef to define custom charmaps
  2018-06-11 18:01       ` Florian Weimer
@ 2018-06-11 19:36         ` Carlos O'Donell
  2018-06-12 14:03           ` Florian Weimer
  0 siblings, 1 reply; 8+ messages in thread
From: Carlos O'Donell @ 2018-06-11 19:36 UTC
  To: Florian Weimer, GNU C Library

On 06/11/2018 02:00 PM, Florian Weimer wrote:
> On 06/11/2018 07:53 PM, Carlos O'Donell wrote:
>> On 06/11/2018 12:51 PM, Florian Weimer wrote:
>>> On 06/11/2018 06:49 PM, Carlos O'Donell wrote:
>>> 
>>>> However, once you've done the work of using the charmap to
>>>> build the output locale data ... why do you need it anymore?
>>>> 
>>>> Conversions will be handled by gconv converters after that.
>>> 
>>> What if there's no gconv converter for my charmap?
>> 
>> Are we talking in the abstract here?
> 
> Not quite.  I suggested a custom charmap as a way to customize
> transliteration for a character which is present in the destination
> character set:
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=23076
> 
> If there are no custom charmaps, then obviously this will not work.

Right.

I gave my suggestion in that bug :-)

>> I imagine some simple character maps could be automatically driven
>> by the existing data.
> 
> We auto-generate gconv modules from charmaps for most (if not all)
> 8-bit character sets.  This is a bit unfortunate because the code is
> identical, and the overhead of loading the data via dlopen is quite
> large (mostly in terms of private RSS consumption).

That's an implementation detail that can be changed?

Cheers,
Carlos.


* Re: Using localedef to define custom charmaps
  2018-06-11 19:36         ` Carlos O'Donell
@ 2018-06-12 14:03           ` Florian Weimer
  2018-06-12 14:10             ` Carlos O'Donell
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2018-06-12 14:03 UTC
  To: Carlos O'Donell, GNU C Library

On 06/11/2018 09:36 PM, Carlos O'Donell wrote:
>>> I imagine some simple character maps could be automatically driven
>>> by the existing data.
>> We auto-generate gconv modules from charmaps for most (if not all)
>> 8-bit character sets.  This is a bit unfortunate because the code is
>> identical, and the overhead of loading the data via dlopen is quite
>> large (mostly in terms of private RSS consumption).

> That's an implementation detail that can be changed?

Yes, it's one of the few things that actually can be changed (i.e., 
which charsets are implemented as external modules loaded via dlopen, 
and which ones are internal and implemented by other means).

Thanks,
Florian


* Re: Using localedef to define custom charmaps
  2018-06-12 14:03           ` Florian Weimer
@ 2018-06-12 14:10             ` Carlos O'Donell
  0 siblings, 0 replies; 8+ messages in thread
From: Carlos O'Donell @ 2018-06-12 14:10 UTC
  To: Florian Weimer, GNU C Library

On 06/12/2018 10:03 AM, Florian Weimer wrote:
> On 06/11/2018 09:36 PM, Carlos O'Donell wrote:
>>>> I imagine some simple character maps could be automatically
>>>> driven by the existing data.
>>> We auto-generate gconv modules from charmaps for most (if not
>>> all) 8-bit character sets.  This is a bit unfortunate because the
>>> code is identical, and the overhead of loading the data via
>>> dlopen is quite large (mostly in terms of private RSS
>>> consumption).
> 
>> That's an implementation detail that can be changed?
> 
> Yes, it's one of the few things that actually can be changed (i.e.,
> which charsets are implemented as external modules loaded via dlopen,
> and which ones are internal and implemented by other means).

The work never ends ;-)

c.

