public inbox for libc-alpha@sourceware.org
From: Carlos O'Donell <carlos@redhat.com>
To: libc-alpha@sourceware.org, Sebastian Rasmussen <sebras@gmail.com>,
	Mike FABIAN <mfabian@redhat.com>
Subject: [PATCH] Update sv_SE to treat 'W' as a distinct character (Bug 25036)
Date: Thu, 18 Mar 2021 21:43:18 -0400
Message-ID: <20210319014318.2565491-1-carlos@redhat.com>

From: Sebastian Rasmussen <sebras@gmail.com>

The 13th edition of Svenska Akademiens ordlista lists 'W' as a
distinct letter that sorts after 'V'. Adjust the sv_SE locale
(and its tests) to follow this "reformed" ordering. This matches
the sv_SE sorting of the letter 'W' in CLDR 1.5.0 (2007).

No regressions on x86_64, and locale sorting tests all pass.

Co-authored-by: Carlos O'Donell <carlos@redhat.com>
---
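For illustration only (not part of the patch): a minimal Python sketch
of how the expected test ordering changes. It assumes a simplified
three-level key (base letter, then variant, then case) for ASCII
letters and does not call glibc or use the real sv_SE collation
tables. The names sort_key, order_words and w_is_distinct are
hypothetical.

```python
# Simplified model of the sv_SE change for 'w'; NOT glibc's algorithm.
# Assumption: ASCII input only, three comparison levels.

def sort_key(word, w_is_distinct):
    """Build a (primary, secondary, tertiary) key for one word."""
    primary, secondary, tertiary = [], [], []
    for ch in word:
        low = ch.lower()
        if low == "w" and not w_is_distinct:
            # Old rule: 'w' is a variant of 'v', told apart only at level 2.
            primary.append("v")
            secondary.append(1)
        else:
            # New rule (and every other letter): the letter stands alone.
            primary.append(low)
            secondary.append(0)
        # Lowercase sorts before uppercase at the last level.
        tertiary.append(1 if ch.isupper() else 0)
    return (tuple(primary), tuple(secondary), tuple(tertiary))


def order_words(words, w_is_distinct):
    """Sort words under the old (False) or reformed (True) rule."""
    return sorted(words, key=lambda w: sort_key(w, w_is_distinct))


WORDS = ["W", "Va", "w", "va", "V", "v"]

print(order_words(WORDS, w_is_distinct=False))
# ['v', 'V', 'w', 'W', 'va', 'Va']
print(order_words(WORDS, w_is_distinct=True))
# ['v', 'V', 'va', 'Va', 'w', 'W']
```

The two outputs correspond to the old and new positions of "w"/"W"
relative to "va"/"Va" in the .in test files below.
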
 localedata/locales/sv_SE       | 26 +++++++++-----------------
 localedata/sv_SE.ISO-8859-1.in |  4 ++--
 localedata/sv_SE.UTF-8.in      |  4 ++--
 3 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index b0901726db..f54c73226d 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -61,22 +61,25 @@ LC_COLLATE
 copy "iso14651_t1"
 
 % CLDR collation rules for Swedish:
-% (see: https://unicode.org/cldr/trac/browser/trunk/common/collation/sv.xml)
+% (https://github.com/unicode-org/cldr/blob/master/common/collation/sv.xml)
 %
-% <collation type="standard">
+% We use the new "reformed" rules from the 13th edition of Svenska Akademiens
+% ordlista where 'W' is considered a distinct character sorting after 'V'.
+% This matches CLDR 1.5.0 released in 2007.
+%
+% <defaultCollation>reformed</defaultCollation>
+% <collation type="reformed">
 %   <cr><![CDATA[
 %     &D<<đ<<<Đ<<ð<<<Ð
 %     &t<<<þ/h
 %     &T<<<Þ/H
-%     &v<<<V<<w<<<W
 %     &Y<<ü<<<Ü<<ű<<<Ű
 %     &[before 1]ǀ<å<<<Å<ä<<<Ä<<æ<<<Æ<<ę<<<Ę<ö<<<Ö<<ø<<<Ø<<ő<<<Ő<<œ<<<Œ<<ô<<<Ô
 %   ]]></cr>
 % </collation>
 %
-% And CLDR also lists the following
-% index characters:
-% (see: https://unicode.org/cldr/trac/browser/trunk/common/main/sv.xml)
+% And CLDR also lists the following index characters:
+% (https://github.com/unicode-org/cldr/blob/master/common/main/sv.xml)
 %
 % <exemplarCharacters type="index">[A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Å Ä Ö]</exemplarCharacters>
 %
@@ -103,17 +106,6 @@ reorder-after <AFTER-Z>
 <U00DE> "<S0074><S0068>";"<BASE><BASE>";"<COMPATCAP><COMPATCAP>";IGNORE % Þ
 <U00FE> "<S0074><S0068>";"<BASE><BASE>";"<COMPAT><COMPAT>";IGNORE % þ
 
-% The letter w is normally not present in the Swedish alphabet. It
-% exists in some names in Swedish and foreign words, but is accounted
-% for as a variant of 'v'.  Words and names with 'w' are in Swedish
-% ordered alphabetically among the words and names with 'v'. If two
-% words or names are only to be distinguished by 'v' or % 'w', 'v' is
-% placed before 'w'.
-
-% &v<<<V<<w<<<W
-<U0057> <S0076>;"<BASE><VRNT1>";"<CAP><MIN>";IGNORE % W
-<U0077> <S0076>;"<BASE><VRNT1>";"<MIN><MIN>";IGNORE % w
-
 % &Y<<ü<<<Ü<<ű<<<Ű
 <U00DC> <S0079>;"<BASE><TREMA>";"<CAP><MIN>";IGNORE % Ü
 <U00FC> <S0079>;"<BASE><TREMA>";"<MIN><MIN>";IGNORE % ü
diff --git a/localedata/sv_SE.ISO-8859-1.in b/localedata/sv_SE.ISO-8859-1.in
index 967c761370..94552ea80a 100644
--- a/localedata/sv_SE.ISO-8859-1.in
+++ b/localedata/sv_SE.ISO-8859-1.in
@@ -42,10 +42,10 @@ u
 U
 v
 V
-w
-W
 va
 Va
+w
+W
 x
 X
 y
diff --git a/localedata/sv_SE.UTF-8.in b/localedata/sv_SE.UTF-8.in
index 6db46e6271..80a093e709 100644
--- a/localedata/sv_SE.UTF-8.in
+++ b/localedata/sv_SE.UTF-8.in
@@ -65,10 +65,10 @@ U
 Ů
 v
 V
-w
-W
 va
 Va
+w
+W
 x
 X
 y
-- 
2.26.2

