Subject: Re: [PATCH] Update sv_SE to treate 'W' as a distinct character (Bug 25036)
To: libc-alpha@sourceware.org, Sebastian Rasmussen, Mike FABIAN
References: <20210319014318.2565491-1-carlos@redhat.com>
From: Carlos O'Donell
Organization: Red Hat
Message-ID: <9cf1e9c6-dc26-4005-9d4d-d48170cd745e@redhat.com>
Date: Tue, 6 Apr 2021 10:23:40 -0400
In-Reply-To: <20210319014318.2565491-1-carlos@redhat.com>

On 3/18/21 9:43 PM, Carlos O'Donell wrote:
> From: Sebastian Rasmussen
>
> The 13th edition of Svenska Akademiens ordlista lists 'W' as a
> distinct letter that sorts after 'V'. We adjust the sv_SE locale
> (and tests) to match this updated and "reformed" language change.
> This harmonizes us with CLDR 1.5.0 (2007) for sv_SE sorting of
> the letter 'W'.

I will be committing this patch shortly to resolve this issue.
I haven't seen any objections and the general consensus is to harmonize
with CLDR which has already made these changes. General feedback from
native speakers is that this is the correct way forward for the sv_SE
locale.

> No regressions on x86_64, and locale sorting tests all pass.
>
> Co-authored-by: Carlos O'Donell
> ---
>  localedata/locales/sv_SE       | 26 +++++++++-----------------
>  localedata/sv_SE.ISO-8859-1.in |  4 ++--
>  localedata/sv_SE.UTF-8.in      |  4 ++--
>  3 files changed, 13 insertions(+), 21 deletions(-)
>
> diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
> index b0901726db..f54c73226d 100644
> --- a/localedata/locales/sv_SE
> +++ b/localedata/locales/sv_SE
> @@ -61,22 +61,25 @@ LC_COLLATE
>  copy "iso14651_t1"
>
>  % CLDR collation rules for Swedish:
> -% (see: https://unicode.org/cldr/trac/browser/trunk/common/collation/sv.xml)
> +% (https://github.com/unicode-org/cldr/blob/master/common/collation/sv.xml)
>  %
> -%
> +% We use the new "reformed" rules from the 13th edition of Svenska Akademiens
> +% ordlista where 'W' is considered a distinct character sorting after 'V'.
> +% This matches CLDR 1.5.0 released in 2007.
> +%
> +% reformed
> +%
>  % % &D<<đ<<<Đ<<ð<<<Ð
>  % &t<<<þ/h
>  % &T<<<Þ/H
> -% &v<< % &Y< % &[before 1]ǀ<å<<<Å<ä<<<Ä<<æ<<<Æ< % ]]>
>  %
>  %
> -% And CLDR also lists the following
> -% index characters:
> -% (see: https://unicode.org/cldr/trac/browser/trunk/common/main/sv.xml)
> +% And CLDR also lists the following index characters:
> +% (https://github.com/unicode-org/cldr/blob/master/common/main/sv.xml)
>  %
>  % [A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Å Ä Ö]
>  %
> @@ -103,17 +106,6 @@ reorder-after
>  "";"";"";IGNORE % Þ
>  "";"";"";IGNORE % þ
>
> -% The letter w is normally not present in the Swedish alphabet. It
> -% exists in some names in Swedish and foreign words, but is accounted
> -% for as a variant of 'v'. Words and names with 'w' are in Swedish
> -% ordered alphabetically among the words and names with 'v'. If two
> -% words or names are only to be distinguished by 'v' or % 'w', 'v' is
> -% placed before 'w'.
> -
> -% &v<< - ;"";"";IGNORE % W
> - ;"";"";IGNORE % w
> -
>  % &Y< ;"";"";IGNORE % Ü
>  ;"";"";IGNORE % ü
> diff --git a/localedata/sv_SE.ISO-8859-1.in b/localedata/sv_SE.ISO-8859-1.in
> index 967c761370..94552ea80a 100644
> --- a/localedata/sv_SE.ISO-8859-1.in
> +++ b/localedata/sv_SE.ISO-8859-1.in
> @@ -42,10 +42,10 @@ u
>  U
>  v
>  V
> -w
> -W
>  va
>  Va
> +w
> +W
>  x
>  X
>  y
> diff --git a/localedata/sv_SE.UTF-8.in b/localedata/sv_SE.UTF-8.in
> index 6db46e6271..80a093e709 100644
> --- a/localedata/sv_SE.UTF-8.in
> +++ b/localedata/sv_SE.UTF-8.in
> @@ -65,10 +65,10 @@ U
>  Ů
>  v
>  V
> -w
> -W
>  va
>  Va
> +w
> +W
>  x
>  X
>  y
>

-- 
Cheers,
Carlos.
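
For anyone who wants to see the effect of the change without building the
locale, here is a rough Python sketch of the two orderings. It is only an
illustration: old_key and new_key are hypothetical helpers, they handle
plain ASCII letters only, and they model just the v/w rule described in
the patch, not glibc's actual multi-level collation tables.

```python
# Illustrative model only; old_key/new_key are hypothetical names and
# cover ASCII letters, not the full sv_SE collation (no å, ä, ö, etc.).

def old_key(word):
    """Old rule: 'w' is a variant of 'v'; v/w only breaks ties."""
    # Primary level: fold case and treat w as v.
    primary = [ord("v") if c in "wW" else ord(c.lower()) for c in word]
    # Variant level: 'v' is placed before 'w' when words are otherwise equal.
    variant = [1 if c in "wW" else 0 for c in word]
    # Case level: lowercase before uppercase, as in the test files.
    case = [1 if c.isupper() else 0 for c in word]
    return (primary, variant, case)


def new_key(word):
    """Reformed rule: 'w' is a distinct letter sorting after 'v'."""
    primary = [ord(c.lower()) for c in word]
    case = [1 if c.isupper() else 0 for c in word]
    return (primary, case)


# The six entries that move in sv_SE.UTF-8.in and sv_SE.ISO-8859-1.in.
words = ["w", "Va", "V", "W", "va", "v"]
print(sorted(words, key=old_key))  # ['v', 'V', 'w', 'W', 'va', 'Va']
print(sorted(words, key=new_key))  # ['v', 'V', 'va', 'Va', 'w', 'W']
```

The first line of output matches the order removed from the test files and
the second matches the order the patch adds. Under the old model a name
such as "Wagner" sorts before "Vilhelm" because the w is compared as a v;
under the reformed model "Vilhelm" comes first.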