From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 106175 invoked by alias); 4 May 2015 10:42:17 -0000
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: glibc-bugs-owner@sourceware.org
Received: (qmail 106120 invoked by uid 48); 4 May 2015 10:42:13 -0000
From: "maiku.fabian at gmail dot com"
To: glibc-bugs@sourceware.org
Subject: [Bug localedata/16061] Review / update transliteration data
Date: Mon, 04 May 2015 10:42:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Version: 2.18
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: maiku.fabian at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: security-
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-05/txt/msg00005.txt.bz2

https://sourceware.org/bugzilla/show_bug.cgi?id=16061

--- Comment #5 from Mike FABIAN ---
(In reply to Marko Myllynen from comment #4)
> (In reply to Mike FABIAN from comment #2)
> > (In reply to Marko Myllynen from comment #0)
> >
> > C-translit.h.in seems to be manually edited and not generated from
> > Unicode data.
>
> Based on earlier changelog comments it seems that C-translit.h.in was
> updated manually for Unicode 3.2.0, should it now be updated for Unicode
> 7.0.0 by some means?

Probably, but how?

> > is apparently manually edited and not generated.
> >
> > locales/translit_cjk_variants
> >
> > is not generated from Unicode data either but from a UniVariants.Z
> > file which can still be found here:
> >
> > http://kanji.zinbun.kyoto-u.ac.jp/~yasuoka/ftp/CJKtable/UniVariants.Z
> >
> > It is from 2002-08-15 and I have no idea how it has been created.
> > So I did not touch /translit_cjk_variants.
>
> Perhaps we could add a note about its origins to the file.

There is already a note in the comment section of that file.

> Also, shouldn't Ø and Æ be handled in the same way?

What do you mean by “handled in the same way”?

> Looking at translit_neutral in more detail, I think it's actually wrong
> place for letters, it should contain non-letters only and if specific rules
> are needed for letters like Ø or Æ, those should be added directly in locale
> files (so the patch discussed in bug 15593 should have not been applied to
> translit_neutral after all). This would also mean that the special rules in
> the generator for cases like EM DASH and EN DASH should probably end up to
> translit_neutral not translit_combining.

My guess is that the purpose of translit_neutral is to contain
transliterations which are locale “neutral”, i.e. are the same for
all locales. So I see no reason not to include letters.

> > > but some characters (like U+00D6, Ö) have decomposition defined in
> > > Unicode but not in glibc.
> >
> > glibc had this already in translit_combining:
> >
> > (was already there, not added by my patch, it is generated from
> > UnicodeData.txt by decomposing to U+004F U+0308 and then stripping the
> > combining character U+0308).
>
> Yes, I think what I meant to say was that the decomposition to U+004F U+0308
> was missing but as you point out it is defined in some locales where it
> would be needed. Btw, I wonder should U+00D6 actually decompose to U+004F
> U+00A8 after U+004F U+0308 in those locales?

Ö -> O¨

Why? Is that a reasonable transliteration? It throws away less
information but I think it is common practice to transliterate Ö
just as O in English for example.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
>>From glibc-bugs-return-28141-listarch-glibc-bugs=sources.redhat.com@sourceware.org Mon May 04 11:37:33 2015
Return-Path:
Delivered-To: listarch-glibc-bugs@sources.redhat.com
Received: (qmail 13399 invoked by alias); 4 May 2015 11:37:33 -0000
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: glibc-bugs-owner@sourceware.org
Delivered-To: mailing list glibc-bugs@sourceware.org
Received: (qmail 13319 invoked by uid 48); 4 May 2015 11:37:29 -0000
From: "myllynen at redhat dot com"
To: glibc-bugs@sourceware.org
Subject: [Bug localedata/16061] Review / update transliteration data
Date: Mon, 04 May 2015 11:37:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Version: 2.18
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: myllynen at redhat dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: security-
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-05/txt/msg00006.txt.bz2
Content-length: 3660

https://sourceware.org/bugzilla/show_bug.cgi?id=16061

--- Comment #6 from Marko Myllynen ---
(In reply to Mike FABIAN from comment #5)
> (In reply to Marko Myllynen from comment #4)
> > (In reply to Mike FABIAN from comment #2)
> > > (In reply to Marko Myllynen from
comment #0)
> > >
> > > C-translit.h.in seems to be manually edited and not generated from
> > > Unicode data.
> >
> > Based on earlier changelog comments it seems that C-translit.h.in was
> > updated manually for Unicode 3.2.0, should it now be updated for Unicode
> > 7.0.0 by some means?
>
> Probably, but how?

Good question - do you see it feasible to use the generator to also produce
C-translit.h.in (sans the previous individual additions)?

> > Perhaps we could add a note about its origins to the file.
>
> There is already a note in the comment section of that file.

Ah, not sure how I missed that.

> > Also, shouldn't Ø and Æ be handled in the same way?
>
> What do you mean by “handled in the same way”?

After applying the patch we would have different kinds of rules for Ø (U+00D8)
and Æ (U+00C6):

locales/translit_combining:
locales/translit_neutral: ""
locales/translit_combining: ""
locales/translit_neutral: ""

> > Looking at translit_neutral in more detail, I think it's actually wrong
> > place for letters, it should contain non-letters only and if specific rules
> > are needed for letters like Ø or Æ, those should be added directly in locale
> > files (so the patch discussed in bug 15593 should have not been applied to
> > translit_neutral after all). This would also mean that the special rules in
> > the generator for cases like EM DASH and EN DASH should probably end up to
> > translit_neutral not translit_combining.
>
> My guess is that the purpose of translit_neutral is to contain
> transliterations which are locale “neutral”, i.e. are the same for
> all locales. So I see no reason not to include letters.

Yeah, outright excluding *all* letters might be too harsh for cases where it's
clear what the result should be but from the discussion in bug 15593 and the
above handling of Ø I got an impression translit_neutral is probably not the
right place for it?
If a letter is being added to translit_combining by the
generator isn't it then better to have it there than in the manually created
translit_neutral? I see that i18n includes translit_neutral, not sure does that
impose some requirements in any way.

> > > > but some characters (like U+00D6, Ö) have decomposition defined in
> > > > Unicode but not in glibc.
> > >
> > > glibc had this already in translit_combining:
> > >
> > > (was already there, not added by my patch, it is generated from
> > > UnicodeData.txt by decomposing to U+004F U+0308 and then stripping the
> > > combining character U+0308).
> >
> > Yes, I think what I meant to say was that the decomposition to U+004F U+0308
> > was missing but as you point out it is defined in some locales where it
> > would be needed. Btw, I wonder should U+00D6 actually decompose to U+004F
> > U+00A8 after U+004F U+0308 in those locales?
>
> Ö -> O¨
>
> Why? Is that a reasonable transliteration? It throws away less
> information but I think it is common practice to transliterate Ö
> just as O in English for example.

I was merely speculating on this, perhaps we can forget this part.

Thanks.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
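
The generation step described in comments #5 and #6 (decompose a character
via its Unicode data, then strip the combining character) can be sketched
roughly as follows. This is only an illustration built on Python's standard
unicodedata module, not the generator that produces translit_combining; the
helper names strip_combining and codepoints are made up for this example.

```python
import unicodedata


def strip_combining(text: str) -> str:
    """Canonically decompose text and drop every combining mark.

    Hypothetical helper for illustration only; it is not part of glibc.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def codepoints(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


# U+00D6 (Ö) has a canonical decomposition, so the base letter survives.
print(codepoints(unicodedata.normalize("NFD", "\u00d6")))  # U+004F U+0308
print(strip_combining("\u00d6"))                           # O

# U+00D8 (Ø) and U+00C6 (Æ) have no decomposition in the Unicode data,
# so this approach leaves them unchanged.
print(codepoints(strip_combining("\u00d8")))               # U+00D8
print(codepoints(strip_combining("\u00c6")))               # U+00C6
```

Under these assumptions, Ö reduces to O automatically, while Ø and Æ pass
through untouched, which is consistent with the thread treating them as
needing separate rules.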
>>From glibc-bugs-return-28142-listarch-glibc-bugs=sources.redhat.com@sourceware.org Mon May 04 12:06:58 2015
Return-Path:
Delivered-To: listarch-glibc-bugs@sources.redhat.com
Received: (qmail 56503 invoked by alias); 4 May 2015 12:06:58 -0000
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: glibc-bugs-owner@sourceware.org
Delivered-To: mailing list glibc-bugs@sourceware.org
Received: (qmail 56442 invoked by uid 48); 4 May 2015 12:06:54 -0000
From: "polymorphm at gmail dot com"
To: glibc-bugs@sourceware.org
Subject: [Bug libc/13502] SEGFAULT in fork() when pthread_atfork() was called from a library loaded/unloaded with dlopen/dlclose
Date: Mon, 04 May 2015 12:06:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: libc
X-Bugzilla-Version: 2.12
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: polymorphm at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: security-
X-Bugzilla-Changed-Fields: cc
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-05/txt/msg00007.txt.bz2
Content-length: 398

https://sourceware.org/bugzilla/show_bug.cgi?id=13502

Andrej Antonov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |polymorphm at gmail dot com

-- 
You are receiving this mail because:
You are on the CC list for the bug.