On 1/11/22 11:05, Martin Liška wrote:
> On 1/11/22 10:38, Jakub Jelinek wrote:
>> On Tue, Jan 11, 2022 at 10:27:19AM +0100, Martin Liška wrote:
>>> On 1/10/22 17:14, Martin Liška wrote:
>>>> Are you fine with the suggested changes?
>>>
>>> Hello.
>>>
>>> Jakub had comments so I'm sending v2 where I added few parsing
>>> exceptions. Now it reports:
>>
>> I'm still surprised by what the sort is doing,
>> ( echo Chene; echo Chêne; echo Chfne ) | LC_ALL=en_US.UTF-8 sort
>> Chene
>> Chêne
>> Chfne
>> That is on glibc 2.32.  On glibc 2.34.9000 I get a different order though,
>> Chêne last.
>> That partly ruins the idea of the checking script when the sorting isn't
>> the same for many people, either the script will be failing for many people
>> or various people will be changing the order there and back all the time.
>>
>>     Jakub
>>
>
> Or we can utilize https://pypi.org/project/Unidecode python package that provides:
>
> In [7]: unidecode.unidecode('Jääskeläinen')
> Out[7]: 'Jaaskelainen'
>
> and sort it by that.
>
> Martin

I'm going to push the change and re-order 2 names and we should be done.

Martin
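
For reference, a minimal sketch of the "transliterate, then sort" idea from this thread. It is not the code that was pushed. To stay dependency-free it uses the standard-library unicodedata module as a rough stand-in for Unidecode, and the names ascii_key and sort_names are hypothetical. One known gap: letters with no Unicode decomposition (for example ø or ł) are left unchanged here, whereas Unidecode would transliterate them.

```python
import unicodedata


def ascii_key(name: str) -> str:
    """Return a locale-independent sort key (hypothetical helper).

    Decompose accented letters (NFKD), drop the combining marks,
    and casefold so that case does not affect the order.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def sort_names(names):
    """Sort names by their stripped key, independent of LC_COLLATE/glibc."""
    # The original string is the tie-breaker, so names that collapse to
    # the same key (Chene / Chêne) still get one deterministic order.
    return sorted(names, key=lambda n: (ascii_key(n), n))


if __name__ == "__main__":
    for name in sort_names(["Chfne", "Chêne", "Chene"]):
        print(name)
```

Because the key is computed in Python rather than by the C library's collation tables, the result should not depend on the glibc version or on the locale in effect.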