From: "keld at keldix dot com"
To: libc-locales@sourceware.org
Subject: [Bug localedata/17750] wrong collation order of diacritics in most locales
Date: Sun, 03 Dec 2017 13:16:00 -0000
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Version: unspecified
X-Bugzilla-Severity: normal
X-Bugzilla-Who: keld at keldix dot com
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: FIXED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: aoliva at sourceware dot org
X-Bugzilla-Target-Milestone: 2.27
X-Bugzilla-URL: http://sourceware.org/bugzilla/

https://sourceware.org/bugzilla/show_bug.cgi?id=17750

--- Comment #25 from keld at keldix dot com ---
On Thu, Nov 30, 2017 at 09:09:25AM +0000, egmont at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> --- Comment #24 from Egmont Koblinger ---
> (In reply to keld@keldix.com from comment #21)
>
> > Well, in Finnish and other Nordic languages like Danish, Swedish and
> > Norwegian, ö and ä etc
> > are not considered accented letters, but genuine separated letters, so that
> > is why
> > there are few strings with more than one accented letter.
>
> To clarify: If they sort German words containing ä and ö, they're sorted among
> the same letters of their own language, right? And what about French accents,
> are they on the other hand mixed together with their unaccented counterparts?

Yes, German ö and ä are treated exactly as the Swedish letters.
And French accented letters like é and è are treated as 'e' but with
an accent. é is actually much used in Swedish proper.

> > German umlaut letters are much the same in Finnish (and Swedish) and ä and ö
> > are
> > then the same as the genuine Finnish/Swedish letters.
>
> What about German ü?

ü is treated as a y AFAIK, but as if it had an accent. Danish æ and ø are
treated as ä and ö, but as if they had an accent.

> (In reply to keld@keldix.com from comment #22)
>
> > [...] Then there is a spec from Danish Standard
> > that is more elaborate [...] with the backwards diacrit spec.
>
> I'm shocked to hear that there's not only one language but more languages that
> use backwards diacritics, something that IMO no sane man with any tiny bit of
> common sense would ever decide on :-)

Well, it is because the last accented character in French is more important
when pronounced. I agree that it is a bit counter-intuitive, but I do favour
the actual habits in the real world over what is logical.

> (In reply to keld@keldix.com from comment #23)
>
> > That is what I am suggesting, at least for Canada.
> > The same reasoning could be done for Dutch in Belgium, and then also the
> > Netherlands.
>
> If this is indeed what's correct for these languages / what people living there
> prefer then it's okay for me. I'm just hoping that the kinda de-facto standard
> en_US will stay with forward diacrits.
> I _guess_ Spanish is more frequently
> used there than French, plus again, I can't imagine how anyone ever could have
> come up with this braindamaged idea of backward diacrit sorting so I'd
> personally prefer en_US not to have this craziness :-)

The kind-of de facto i18n locale has forward diacrits. i18n is the standard
locale of ISO TR 30112. I think both Spanish and German need forward diacrits,
and Spanish being a bigger language than French would suggest that we should
use forward diacrits as the default.

Best regards
Keld
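
For illustration, here is a minimal sketch of the difference between the forward and backward diacritic ordering discussed in this thread. It is not how glibc or ISO TR 30112 implement collation: the helper names (`split_accents`, `sort_key`) are hypothetical, and ranking accents by the code point of their combining mark is an assumption made only to keep the example short. Real locale definitions assign explicit weights. Base letters are compared first; ties are then broken on the accents, read left to right (forward) or right to left (backward).

```python
import unicodedata


def split_accents(word):
    """Split a word into base letters and the accent marks on each letter."""
    bases, accents = [], []
    # Normalize to NFC first so each accented letter is a single character.
    for ch in unicodedata.normalize("NFC", word):
        decomposed = unicodedata.normalize("NFD", ch)
        bases.append(decomposed[0].lower())
        # Empty string for an unaccented letter; it sorts before any accent.
        accents.append(decomposed[1:])
    return bases, accents


def sort_key(word, backward=False):
    """Two-level key: base letters first, then accents as a tie-breaker."""
    bases, accents = split_accents(word)
    if backward:
        # Backward diacritics: the accent nearest the END of the word decides.
        accents = accents[::-1]
    return (bases, accents)


words = ["côté", "cote", "coté", "côte"]

forward = sorted(words, key=sort_key)
backward = sorted(words, key=lambda w: sort_key(w, backward=True))

print("forward: ", forward)
print("backward:", backward)
```

With these four words, the forward rule places "coté" before "côte" (the first differing accent from the left decides), while the backward rule places "côte" before "coté" (the accent on the final letter decides), which is the behaviour debated above for French.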