public inbox for libc-locales@sourceware.org
* lang_term and lang_lib in LC_ADDRESS
@ 2014-04-29  9:34 Marko Myllynen
  2014-04-29 15:02 ` Keld Simonsen
From: Marko Myllynen @ 2014-04-29  9:34 UTC
  To: Keld Simonsen; +Cc: libc-locales, mtk.manpages

Hi Keld,

do you happen to know/remember the story behind lang_term and lang_lib
in LC_ADDRESS? Some sources say they both should be three-letter ISO
639-2 codes but some of the more dependable glibc locales (e.g. de_DE)
use "deu" and "ger" for them.

Thanks,

-- 
Marko Myllynen


* Re: lang_term and lang_lib in LC_ADDRESS
  2014-04-29  9:34 lang_term and lang_lib in LC_ADDRESS Marko Myllynen
@ 2014-04-29 15:02 ` Keld Simonsen
  2014-05-06 14:06   ` Marko Myllynen
From: Keld Simonsen @ 2014-04-29 15:02 UTC
  To: Marko Myllynen; +Cc: libc-locales, mtk.manpages

On Tue, Apr 29, 2014 at 12:33:36PM +0300, Marko Myllynen wrote:
> Hi Keld,
> 
> do you happen to know/remember the story behind lang_term and lang_lib
> in LC_ADDRESS? Some sources say they both should be three-letter ISO
> 639-2 codes but some of the more dependable glibc locales (e.g. de_DE)
> use "deu" and "ger" for them.

lang_term reflects ISO 639-2/T (terminology) codes, while
lang_lib reflects ISO 639-2/B (bibliographic) codes.
lang_term is preferred over lang_lib codes for locale names.
Only 20 languages have a separate ISO 639-2/B code; for all other
languages the two codes are identical.

Reference: https://en.wikipedia.org/wiki/ISO_639-2
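
A minimal Python sketch of that relationship, assuming a hand-written
table. The three pairs below are the ones that come up in this thread
(de_DE, eu_ES, my_MM); the table is deliberately incomplete, and the
names T_TO_B and expected_lang_lib are hypothetical, not part of glibc.

```python
# Illustrative subset only: /T -> /B pairs mentioned in this thread.
# ISO 639-2 defines more such pairs; a real check needs the full list.
T_TO_B = {
    "deu": "ger",  # German
    "eus": "baq",  # Basque
    "mya": "bur",  # Burmese
}


def expected_lang_lib(lang_term: str) -> str:
    """Return the /B code for a /T code.

    The two are identical unless the table lists a separate /B code.
    """
    return T_TO_B.get(lang_term, lang_term)


if __name__ == "__main__":
    for term in ("deu", "tur"):
        print(term, "->", expected_lang_lib(term))
```

With this subset, a language that has a separate /B code but is missing
from the table would wrongly fall through to the identical case, which
is why the authoritative list should be used for anything beyond
illustration.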

Best regards
keld


* Re: lang_term and lang_lib in LC_ADDRESS
  2014-04-29 15:02 ` Keld Simonsen
@ 2014-05-06 14:06   ` Marko Myllynen
  2014-05-06 15:51     ` Chris Leonard
From: Marko Myllynen @ 2014-05-06 14:06 UTC
  To: Keld Simonsen; +Cc: libc-locales, mtk.manpages


Hi,

On 2014-04-29 18:02, Keld Simonsen wrote:
> On Tue, Apr 29, 2014 at 12:33:36PM +0300, Marko Myllynen wrote:
>>
>> do you happen to know/remember the story behind lang_term and lang_lib
>> in LC_ADDRESS? Some sources say they both should be three-letter ISO
>> 639-2 codes but some of the more dependable glibc locales (e.g. de_DE)
>> use "deu" and "ger" for them.
> 
> lang_term reflects ISO 639-2/T (terminology) codes, while
> lang_lib reflects ISO 639-2/B (bibliographic) codes.
> lang_term is preferred over lang_lib codes for locale names.
> There are 20 specific ISO 639-2/B codes.
> 
> Reference: https://en.wikipedia.org/wiki/ISO_639-2

thanks, the Library of Congress hosts ISO 639-2 online [1,2], so I was
able to run a (scripted) check against glibc, which revealed a few
inconsistencies.

How does the attached patch look?

1) http://www.loc.gov/standards/iso639-2/langhome.html
2) http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt
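
The script used for the check was not posted. A minimal sketch of such
a check in Python follows, assuming the pipe-delimited layout of the
text file in [2] (bibliographic code, terminologic code or empty if
identical, alpha-2 code, English name, French name). The sample rows
and the names load_codes and check are illustrative assumptions; a real
run would read the downloaded file instead of SAMPLE.

```python
# Sample rows in the assumed layout of ISO-639-2_utf-8.txt:
# bibliographic|terminologic (empty if same)|alpha-2|English|French
SAMPLE = """\
baq|eus|eu|Basque|basque
bur|mya|my|Burmese|birman
tur||tr|Turkish|turc
"""


def load_codes(text):
    """Map an alpha-2 code to its (lang_term, lang_lib) pair."""
    codes = {}
    # Strip a leading byte order mark in case the file carries one.
    for line in text.lstrip("\ufeff").splitlines():
        fields = line.split("|")
        if len(fields) < 3 or not fields[2]:
            continue  # malformed row, or language without an alpha-2 code
        bib, term, alpha2 = fields[0], fields[1], fields[2]
        # An empty terminologic field means /T is the same as /B.
        codes[alpha2] = (term or bib, bib)
    return codes


def check(codes, lang_ab, lang_term, lang_lib):
    """Return a list of mismatch descriptions for one locale."""
    want_term, want_lib = codes[lang_ab]
    problems = []
    if lang_term != want_term:
        problems.append(f"lang_term: got {lang_term!r}, expected {want_term!r}")
    if lang_lib != want_lib:
        problems.append(f"lang_lib: got {lang_lib!r}, expected {want_lib!r}")
    return problems


if __name__ == "__main__":
    codes = load_codes(SAMPLE)
    # Values as they stood in eu_ES and tr_TR before the patch.
    print(check(codes, "eu", "eus", "eus"))
    print(check(codes, "tr", "tur", "tr"))
```

Keying the table on lang_ab assumes the locale's two-letter code is
itself correct, so a locale with a wrong lang_ab would be compared
against the wrong row.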

Thanks,

-- 
Marko Myllynen

[-- Attachment #2: 0001-Fix-lang_lib-lang_term-as-per-ISO-639-2.patch --]
[-- Type: text/plain, Size: 5377 bytes --]

From d9873716e6436720a33324f3776d93f65aea87f2 Mon Sep 17 00:00:00 2001
From: Marko Myllynen <myllynen@redhat.com>
Date: Tue, 6 May 2014 16:54:19 +0300
Subject: [PATCH] Fix lang_lib/lang_term as per ISO 639-2

lang_lib (which reflects ISO 639-2/B (bibliographic) codes) and
lang_term (which reflects ISO 639-2/T (terminology) codes) should
be identical except for those languages for which ISO 639-2
specifies separate bibliographic/terminology values.

Source: http://www.loc.gov/standards/iso639-2/langhome.html.
---
 localedata/locales/eu_ES       |    4 ++--
 localedata/locales/km_KH       |    6 ++++--
 localedata/locales/lb_LU       |    4 +++-
 localedata/locales/lo_LA       |    2 +-
 localedata/locales/my_MM       |    4 +++-
 localedata/locales/sr_ME       |    4 +++-
 localedata/locales/sr_RS       |    4 +++-
 localedata/locales/sr_RS@latin |    4 +++-
 localedata/locales/tr_TR       |    2 +-
 9 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/localedata/locales/eu_ES b/localedata/locales/eu_ES
index 1ea5c4e..12491d2 100644
--- a/localedata/locales/eu_ES
+++ b/localedata/locales/eu_ES
@@ -175,6 +175,6 @@ lang_name    "<U0045><U0075><U0073><U006B><U0061><U0072><U0061>"
 lang_ab      "<U0065><U0075>"
 % eus
 lang_term    "<U0065><U0075><U0073>"
-% eus
-lang_lib    "<U0065><U0075><U0073>"
+% baq
+lang_lib    "<U0062><U0061><U0071>"
 END LC_ADDRESS
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 5563659..24d3b62 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -1901,6 +1901,8 @@ country_car   "<U004C><U0041><U004F>"
 % ភាសាខ្មែរ (Khmer)
 lang_name     "<U1797><U17B6><U179F><U17B6><U1781><U17D2><U1798><U17C2><U179A>"
 lang_ab       "<U006C><U006F>"
-lang_term     "<U006c><U0061><U006F>"
-lang_lib      "<U006C><U0061><U006F>"
+% khm
+lang_term     "<U006B><U0068><U006D>"
+% khm
+lang_lib      "<U006B><U0068><U006D>"
 END LC_ADDRESS
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index a74e162..1140979 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -179,8 +179,10 @@ country_isbn  2
 lang_name     "<U004C><U00EB><U0074><U007A><U0065><U0062><U0075><U0065>/
 <U0072><U0067><U0065><U0073><U0063><U0068>"
 lang_ab       "<U006C><U0062>"
+% ltz
 lang_term     "<U006C><U0074><U007A>"
-lang_lib      "<U006C><U0075><U0078>"
+% ltz
+lang_lib      "<U006C><U0074><U007A>"
 END LC_ADDRESS
 
 LC_TELEPHONE
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index c584877..a57fb28 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -778,6 +778,6 @@ country_car   "<U004C><U0041><U004F>"
 %country_isbn  ""
 lang_name     "<U0EA5><U0EB2><U0EA7>"
 lang_ab       "<U006C><U006F>"
-lang_term     "<U006c><U0061><U006F>"
+lang_term     "<U006C><U0061><U006F>"
 lang_lib      "<U006C><U0061><U006F>"
 END LC_ADDRESS
diff --git a/localedata/locales/my_MM b/localedata/locales/my_MM
index d9a2db1..de67b80 100644
--- a/localedata/locales/my_MM
+++ b/localedata/locales/my_MM
@@ -317,6 +317,8 @@ country_ab2     "<U004D><U004D>"
 country_car  "<U0042><U0041>"
 lang_ab         "<U006D><U0079>"
 lang_name       "<U1017><U1019><U102C>"
+% mya
 lang_term   "<U006D><U0079><U0061>"
-lang_lib    "<U006D><U0079><U0061>"
+% bur
+lang_lib    "<U0062><U0075><U0072>"
 END LC_ADDRESS
diff --git a/localedata/locales/sr_ME b/localedata/locales/sr_ME
index c0aa4a4..4f243dc 100644
--- a/localedata/locales/sr_ME
+++ b/localedata/locales/sr_ME
@@ -151,8 +151,10 @@ country_num   499
 country_car   "<U004D><U004E><U0045>"
 country_isbn  "<U0038><U0036>"
 lang_name     "<U0441><U0440><U043F><U0441><U043A><U0438>"
+% srp
 lang_term     "<U0073><U0072><U0070>"
-lang_lib      "<U0073><U0063><U0063>"
+% srp
+lang_lib      "<U0073><U0072><U0070>"
 lang_ab	      "<U0073><U0072>"
 END LC_ADDRESS
 
diff --git a/localedata/locales/sr_RS b/localedata/locales/sr_RS
index b2b8577..2ae085b 100644
--- a/localedata/locales/sr_RS
+++ b/localedata/locales/sr_RS
@@ -342,8 +342,10 @@ country_car   "<U0053><U0052><U0042>"
 % FIXME: ISBN code is what? "86" that preceedes all the numbers?
 country_isbn  "<U0038><U0036>"
 lang_name     "<U0441><U0440><U043F><U0441><U043A><U0438>"
+% srp
 lang_term     "<U0073><U0072><U0070>"
-lang_lib      "<U0073><U0063><U0063>"
+% srp
+lang_lib      "<U0073><U0072><U0070>"
 lang_ab	      "<U0073><U0072>"
 END LC_ADDRESS
 
diff --git a/localedata/locales/sr_RS@latin b/localedata/locales/sr_RS@latin
index 7b28302..da6628b 100644
--- a/localedata/locales/sr_RS@latin
+++ b/localedata/locales/sr_RS@latin
@@ -160,8 +160,10 @@ country_num   688
 country_car   "<U0053><U0052><U0042>"
 country_isbn  "<U0038><U0036>"
 lang_name     "<U0073><U0072><U0070><U0073><U006B><U0069>"
+% srp
 lang_term     "<U0073><U0072><U0070>"
-lang_lib      "<U0073><U0063><U0063>"
+% srp
+lang_lib      "<U0073><U0072><U0070>"
 lang_ab	      "<U0073><U0072>"
 END LC_ADDRESS
 
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f54be2c..c1fb36f 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -3598,7 +3598,7 @@ lang_name	"<U0054><U0075><U0072><U006B><U0069><U0073><U0068>"
 % tur
 lang_term	"<U0074><U0075><U0072>"
 % tur
-lang_lib	"<U0074><U0072>"
+lang_lib	"<U0074><U0075><U0072>"
 %tr
 lang_ab		"<U0074><U0072>"
 END LC_ADDRESS
-- 
1.7.1



* Re: lang_term and lang_lib in LC_ADDRESS
  2014-05-06 14:06   ` Marko Myllynen
@ 2014-05-06 15:51     ` Chris Leonard
  2014-05-06 16:03       ` Marko Myllynen
From: Chris Leonard @ 2014-05-06 15:51 UTC
  To: myllynen; +Cc: Keld Simonsen, libc-locales, mtk.manpages

Shouldn't this patch also change the commented entries and not just
the Unicode lines?

On Tue, May 6, 2014 at 10:06 AM, Marko Myllynen <myllynen@redhat.com> wrote:
> Hi,
>
> On 2014-04-29 18:02, Keld Simonsen wrote:
>> On Tue, Apr 29, 2014 at 12:33:36PM +0300, Marko Myllynen wrote:
>>>
>>> do you happen to know/remember the story behind lang_term and lang_lib
>>> in LC_ADDRESS? Some sources say they both should be three-letter ISO
>>> 639-2 codes but some of the more dependable glibc locales (e.g. de_DE)
>>> use "deu" and "ger" for them.
>>
>> lang_term reflects ISO 639-2/T (terminology) codes, while
>> lang_lib reflects ISO 639-2/B (bibliographic) codes.
>> lang_term is preferred over lang_lib codes for locale names.
>> There are 20 specific ISO 639-2/B codes.
>>
>> Reference: https://en.wikipedia.org/wiki/ISO_639-2
>
> thanks, the Library of Congress hosts ISO 639-2 online [1,2], so I was
> able to run a (scripted) check against glibc, which revealed a few
> inconsistencies.
>
> How does the attached patch look?
>
> 1) http://www.loc.gov/standards/iso639-2/langhome.html
> 2) http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt
>
> Thanks,
>
> --
> Marko Myllynen


* Re: lang_term and lang_lib in LC_ADDRESS
  2014-05-06 15:51     ` Chris Leonard
@ 2014-05-06 16:03       ` Marko Myllynen
  2014-05-06 16:23         ` Chris Leonard
From: Marko Myllynen @ 2014-05-06 16:03 UTC
  To: Chris Leonard; +Cc: Keld Simonsen, libc-locales, mtk.manpages

Hi,

On 2014-05-06 18:51, Chris Leonard wrote:
> Shouldn't this patch also change the commented entries and not just
> the Unicode lines?

sorry, you will have to be more specific. I added comments where I made
changes, and at least in the files I touched the comments match the
Unicode lines.
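
One way to confirm that a comment matches its Unicode line is to decode
the <Uxxxx> notation and compare. A minimal Python sketch, assuming
that "%" is the comment character (as in the files touched by the
patch) and that the comment sits on the line directly above the
keyword; decode and comment_mismatches are hypothetical helper names.

```python
import re

# Matches one <Uxxxx> symbol; hex digits may be upper or lower case.
UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(value):
    """Turn '<U0062><U0061><U0071>' into 'baq'."""
    return "".join(chr(int(h, 16)) for h in UCS.findall(value))


def comment_mismatches(lines, keys=("lang_term", "lang_lib")):
    """Yield (key, comment, decoded) where the '%' comment directly
    above a keyword line disagrees with the decoded value."""
    prev = ""
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in keys and prev.startswith("%"):
            comment = prev.lstrip("%").strip()
            decoded = decode(parts[1])
            if comment != decoded:
                yield parts[0], comment, decoded
        prev = line.strip()


if __name__ == "__main__":
    # tr_TR as it stood before the patch.
    before = [
        "% tur",
        'lang_term\t"<U0074><U0075><U0072>"',
        "% tur",
        'lang_lib\t"<U0074><U0072>"',
    ]
    for key, comment, decoded in comment_mismatches(before):
        print(f"{key}: comment says {comment!r}, value is {decoded!r}")
```

Keyword lines without a comment directly above them are skipped rather
than reported, so this only catches disagreements, not missing
comments.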

Cheers,

-- 
Marko Myllynen


* Re: lang_term and lang_lib in LC_ADDRESS
  2014-05-06 16:03       ` Marko Myllynen
@ 2014-05-06 16:23         ` Chris Leonard
  2014-05-22  8:28           ` Marko Myllynen
From: Chris Leonard @ 2014-05-06 16:23 UTC
  To: myllynen; +Cc: Keld Simonsen, libc-locales, mtk.manpages

Sorry, my error in reading the patch markup.

cjl

On Tue, May 6, 2014 at 12:03 PM, Marko Myllynen <myllynen@redhat.com> wrote:
> Hi,
>
> On 2014-05-06 18:51, Chris Leonard wrote:
>> Shouldn't this patch also change the commented entries and not just
>> the Unicode lines?
>
> sorry, you have to be more specific. I added comments where I did
> changes and at least in the files I touched comments match the Unicode
> lines.
>
> Cheers,
>
> --
> Marko Myllynen


* Re: lang_term and lang_lib in LC_ADDRESS
  2014-05-06 16:23         ` Chris Leonard
@ 2014-05-22  8:28           ` Marko Myllynen
From: Marko Myllynen @ 2014-05-22  8:28 UTC
  To: Chris Leonard; +Cc: Keld Simonsen, libc-locales, mtk.manpages

Hi,

since there were no objections I filed a bug to get this fix included:

https://sourceware.org/bugzilla/show_bug.cgi?id=16973

Thanks,

On 2014-05-06 19:23, Chris Leonard wrote:
> Sorry my error in reading the patch markup.
> 
> cjl
> 
> On Tue, May 6, 2014 at 12:03 PM, Marko Myllynen <myllynen@redhat.com> wrote:
>> Hi,
>>
>> On 2014-05-06 18:51, Chris Leonard wrote:
>>> Shouldn't this patch also change the commented entries and not just
>>> the Unicode lines?
>>
>> sorry, you have to be more specific. I added comments where I did
>> changes and at least in the files I touched comments match the Unicode
>> lines.
>>
>> Cheers,
>>
>> --
>> Marko Myllynen


-- 
Marko Myllynen

