public inbox for libc-alpha@sourceware.org
* [PATCH] Update sv_SE to treat 'W' as a distinct character (Bug 25036)
From: Carlos O'Donell @ 2021-03-19  1:43 UTC
  To: libc-alpha, Sebastian Rasmussen, Mike FABIAN

From: Sebastian Rasmussen <sebras@gmail.com>

The 13th edition of Svenska Akademiens ordlista lists 'W' as a
distinct letter that sorts after 'V'. Adjust the sv_SE locale
(and its tests) to follow this "reformed" ordering. This matches
CLDR 1.5.0 (2007) for sv_SE sorting of the letter 'W'.

No regressions on x86_64, and locale sorting tests all pass.

Co-authored-by: Carlos O'Donell <carlos@redhat.com>
---
 localedata/locales/sv_SE       | 26 +++++++++-----------------
 localedata/sv_SE.ISO-8859-1.in |  4 ++--
 localedata/sv_SE.UTF-8.in      |  4 ++--
 3 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index b0901726db..f54c73226d 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -61,22 +61,25 @@ LC_COLLATE
 copy "iso14651_t1"
 
 % CLDR collation rules for Swedish:
-% (see: https://unicode.org/cldr/trac/browser/trunk/common/collation/sv.xml)
+% (https://github.com/unicode-org/cldr/blob/master/common/collation/sv.xml)
 %
-% <collation type="standard">
+% We use the new "reformed" rules from the 13th edition of Svenska Akademiens
+% ordlista where 'W' is considered a distinct character sorting after 'V'.
+% This matches CLDR 1.5.0 released in 2007.
+%
+% <defaultCollation>reformed</defaultCollation>
+% <collation type="reformed">
 %   <cr><![CDATA[
 %     &D<<đ<<<Đ<<ð<<<Ð
 %     &t<<<þ/h
 %     &T<<<Þ/H
-%     &v<<<V<<w<<<W
 %     &Y<<ü<<<Ü<<ű<<<Ű
 %     &[before 1]ǀ<å<<<Å<ä<<<Ä<<æ<<<Æ<<ę<<<Ę<ö<<<Ö<<ø<<<Ø<<ő<<<Ő<<œ<<<Œ<<ô<<<Ô
 %   ]]></cr>
 % </collation>
 %
-% And CLDR also lists the following
-% index characters:
-% (see: https://unicode.org/cldr/trac/browser/trunk/common/main/sv.xml)
+% And CLDR also lists the following index characters:
+% (https://github.com/unicode-org/cldr/blob/master/common/main/sv.xml)
 %
 % <exemplarCharacters type="index">[A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Å Ä Ö]</exemplarCharacters>
 %
@@ -103,17 +106,6 @@ reorder-after <AFTER-Z>
 <U00DE> "<S0074><S0068>";"<BASE><BASE>";"<COMPATCAP><COMPATCAP>";IGNORE % Þ
 <U00FE> "<S0074><S0068>";"<BASE><BASE>";"<COMPAT><COMPAT>";IGNORE % þ
 
-% The letter w is normally not present in the Swedish alphabet. It
-% exists in some names in Swedish and foreign words, but is accounted
-% for as a variant of 'v'.  Words and names with 'w' are in Swedish
-% ordered alphabetically among the words and names with 'v'. If two
-% words or names are only to be distinguished by 'v' or % 'w', 'v' is
-% placed before 'w'.
-
-% &v<<<V<<w<<<W
-<U0057> <S0076>;"<BASE><VRNT1>";"<CAP><MIN>";IGNORE % W
-<U0077> <S0076>;"<BASE><VRNT1>";"<MIN><MIN>";IGNORE % w
-
 % &Y<<ü<<<Ü<<ű<<<Ű
 <U00DC> <S0079>;"<BASE><TREMA>";"<CAP><MIN>";IGNORE % Ü
 <U00FC> <S0079>;"<BASE><TREMA>";"<MIN><MIN>";IGNORE % ü
diff --git a/localedata/sv_SE.ISO-8859-1.in b/localedata/sv_SE.ISO-8859-1.in
index 967c761370..94552ea80a 100644
--- a/localedata/sv_SE.ISO-8859-1.in
+++ b/localedata/sv_SE.ISO-8859-1.in
@@ -42,10 +42,10 @@ u
 U
 v
 V
-w
-W
 va
 Va
+w
+W
 x
 X
 y
diff --git a/localedata/sv_SE.UTF-8.in b/localedata/sv_SE.UTF-8.in
index 6db46e6271..80a093e709 100644
--- a/localedata/sv_SE.UTF-8.in
+++ b/localedata/sv_SE.UTF-8.in
@@ -65,10 +65,10 @@ U
 Ů
 v
 V
-w
-W
 va
 Va
+w
+W
 x
 X
 y
-- 
2.26.2
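
The ordering change made by the patch can be illustrated with a small
stand-alone sketch. It does not call glibc or use the sv_SE locale; the
sort_key helper and its w_is_distinct flag are hypothetical names made up
for this illustration, and the model covers only ASCII letters. It assumes
a simplified three-level comparison (primary letter, then variant, then
case), which is enough to reproduce the before/after order of the v/w
lines in the two .in test files.

```python
# Simplified model of the old and new sv_SE treatment of 'w'.
# Assumption: three comparison levels, ASCII letters only. This is an
# illustration, not the glibc collation implementation.

def sort_key(word, w_is_distinct):
    """Return a (primary, secondary, tertiary) key for `word`.

    w_is_distinct=False models the removed rule (&v<<<V<<w<<<W), where
    'w' shares the primary weight of 'v' and differs only as a variant.
    w_is_distinct=True models the reformed rule, where 'w' is its own
    letter sorting after 'v'.
    """
    primary, secondary, tertiary = [], [], []
    for ch in word:
        lower = ch.lower()
        if lower == "w" and not w_is_distinct:
            primary.append(ord("v"))   # same base letter as 'v'
            secondary.append(1)        # but marked as a variant
        else:
            primary.append(ord(lower))
            secondary.append(0)
        tertiary.append(1 if ch.isupper() else 0)  # lowercase first
    return (tuple(primary), tuple(secondary), tuple(tertiary))


words = ["W", "Va", "w", "va", "V", "v"]

old_order = sorted(words, key=lambda s: sort_key(s, w_is_distinct=False))
new_order = sorted(words, key=lambda s: sort_key(s, w_is_distinct=True))

print("old:", " ".join(old_order))
print("new:", " ".join(new_order))
```

With the old rule, "w" and "W" land before "va" because they compare
equal to "v" at the primary level; with the reformed rule they move after
"Va", which is the same move the test-file hunks make.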

