From: Florian Weimer
To: Carlos O'Donell
Cc: Paul Eggert, libc-alpha@sourceware.org
Subject: Re: C.UTF-8 review
Date: Mon, 19 Jul 2021 17:37:04 +0200
In-Reply-To: <3a2dab05-ee68-2593-cf9b-707eb1552ec5@redhat.com> (Carlos O'Donell's message of "Mon, 19 Jul 2021 11:33:08 -0400")
References: <87o8b5ds5q.fsf@oldenburg.str.redhat.com> <5edf60ff-1c3e-7086-b78f-5707c20e33d2@cs.ucla.edu> <878s287y6o.fsf@oldenburg.str.redhat.com> <3a2dab05-ee68-2593-cf9b-707eb1552ec5@redhat.com>
Message-ID: <87eebuuxgf.fsf@oldenburg.str.redhat.com>

* Carlos O'Donell:

> On 7/15/21 4:56 AM, Florian Weimer wrote:
>> * Paul Eggert:
>>
>>> Dumb question. Will C.UTF-8 have the same worst-case strcoll
>>> performance that en_US.UTF-8 does? I'm asking because I wonder whether
>>> we can recommend C.UTF-8 as a workaround for the strcoll performance
>>> bug, in cases where plain C is not appropriate.
>>>
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=18441
>>>
>>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=49340
>>
>> I'm not sure if that advice is correct …
>
> Could you expand on this a bit more please?

Using C.UTF-8 with collation rules is unlikely to provide the intended
speed benefit.

Thanks,
Florian