From: Rafal Luzynski
To: libc-alpha@sourceware.org
Date: Wed, 01 Jun 2016 21:51:00 -0000
Subject: Re: [RFC][PATCH v2 3/6] Implement the %OB specifier - alternative month names (bug 10871)
Message-ID: <323322572.685262.92369107-bdae-4a8b-b71f-99b919bc0cf0.open-xchange@poczta.nazwa.pl>
In-Reply-To: <20160601104220.GA1077@altlinux.org>

Hello Dmitry,

Thank you for your feedback. Please find my answers below:

1.06.2016 12:42 "Dmitry V. Levin" wrote:
>
> On Wed, Mar 30, 2016 at 01:31:21AM +0200, Rafal Luzynski wrote:
> > 29.03.2016 16:31 "Dmitry V. Levin" wrote:
> > > On Fri, Mar 25, 2016 at 01:55:13AM +0100, Rafal Luzynski wrote:
> > > [...]
> > > > This means that all applications using %B to retrieve the month
> > > > name standalone should use %OB from now.
> > > >
> > > Such applications as cal(1) would not be able to print month names
> > > properly in a way that would work with different glibc versions.
> > > Looks like this is a change incompatible in both ways.
> >
> > Yes, that's exactly what will happen. Such applications must be
> > updated. They must start using strftime("%OB") and nl_langinfo(ALTMON_...).
> > They must either detect the glibc version at runtime and choose
> > the correct format specifier or require the minimum glibc version
> > at build time. I'm willing to contact the upstream developers and
> > provide the instructions how to change their applications.
>
> In glibc, we don't make changes this way. If an incompatible ABI change
> is introduced, the old ABI remains for compatibility with software linked
> with it.

That's why I was thinking [1] about other solutions (this link discusses
all pros and cons but is a little outdated). In short:

1. The simplest way to ensure full compatibility would be to leave
   strftime("%B",...) and nl_langinfo(MON_...) unchanged and provide
   the genitive forms via the new symbols strftime("%OB",...) and
   nl_langinfo(ALTMON_...). Sounds nice, but this means that glibc
   would never be compatible with *BSD [2] (you may say you don't care)
   or with POSIX [3] (I guess glibc would rather choose to be
   compatible). It would also mean that programs such as cal(1) would
   not be broken but all other programs would be. I don't mean that
   the patch would break them; I mean they are broken already and such
   a fix would not fix them: all these programs would have to switch
   to "%OB" and ALTMON_. Just think how many programs display month
   names standalone, how many display a full date (day + month at
   least), and how many don't use dates and so don't care. Worse, the
   meanings of "%OB" and ALTMON_ in *BSD and Linux would be reversed.

2. I also thought about a smart heuristic that detects whether "%B" is
   used together with a day number (so a genitive form is needed) or
   standalone (so a nominative form is needed). This leads to
   questions like: What does it mean for a day number and a month name
   to be close to each other? What order (day-month vs. month-day)
   does the language require? What should happen if software uses a
   day-month order that goes against the language rules? That led me
   to propose a day-month-order locale parameter whose meaning is
   difficult to explain and whose implementation is tricky. Also, this
   solution would not work for programs like date(1), which iterate
   over the whole format string, split it, and call strftime() with
   each format specifier separately, out of context.

I'm afraid that providing full backward compatibility is impossible.
If an application calls strftime("%B",...) or nl_langinfo(MON_...) we
have no way to tell whether this particular application actually meant
"%OB" and ALTMON_... or whether "%B" and MON_ are indeed correct for
it. However, please note that in most cases the current calls to
strftime("%B",...) and nl_langinfo(MON_...) produce incorrect results
and will automagically start producing correct results if glibc
accepts my recent solution. I believe there are fewer programs where
strftime("%B") and nl_langinfo(MON_...) are correct now and will
become incorrect; those will need to switch to "%OB" and ALTMON_.

> With regards to runtime checks, could you give an example of such a check?
I meant something like:

  #include <gnu/libc-version.h>

  const char *ver_string = gnu_get_libc_version();
  /* Parse ver_string into ver_major and ver_minor */
  if (ver_major > 2 || (ver_major == 2 && ver_minor >= 24))
    {
      /* New glibc including my patch */
      strftime("%B",...);
        /* output: month name in a genitive form, or nominative for
           those languages which don't need a genitive form */
      strftime("%OB",...);
        /* output: month name in a nominative form */
      strftime("%Om",...);
        /* not useful in any European language */
      strftime("%d %B",...);
        /* usually the correct date format */
    }
  else
    {
      /* Old glibc */
      strftime("%B",...);
        /* output: month name in a nominative form even if it should
           be genitive */
      strftime("%OB",...);
        /* output: "%OB" string literally */
      strftime("%Om",...);
        /* output: correct genitive form but only in Ukrainian locale
           (dirty hack) */
      strftime("%d %B",...);
        /* many programs do it but the result is incorrect in many
           languages */
    }

But this is a solution only for closed source software distributed in
binary form. For software distributed in source form one may expect
that the current glibc version can be detected at compile time
(__GLIBC__ and __GLIBC_MINOR__) and that a binary package is provided
for every specific distro containing a specific glibc version.

In case of nl_langinfo it's easier:

  /* In case we have the old headers but want to support new glibc */
  #ifndef ALTMON_1
  #define ALTMON_1 ((nl_item) (((int) _NL_TIME_CODESET) + 1))
  #endif
  /* The same for all other months */

  char *june = nl_langinfo(ALTMON_6);
  if (!*june)
    june = nl_langinfo(MON_6);

This fallback will always be valid and always required, because
nl_langinfo(ALTMON_...) will return an empty string both on old
systems where ALTMON_... is not valid and on new systems for the
locales which do not need to distinguish between nominative and
genitive forms (like English) or have not yet updated their locale
data.

I'm aware that none of my solutions is perfect; I'm just trying to
minimize the fallout.
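For illustration only (this is not part of the patch): the "Parse
ver_string" step of the runtime check above could be sketched roughly
as follows. The helper names parse_libc_version() and
standalone_month_format() are hypothetical, and 2.24 is simply the
cutoff assumed in the example above, not a confirmed release number.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: split a version string such as "2.23" into its
   numeric parts.  Returns 1 on success, 0 if the string does not
   start with "MAJOR.MINOR". */
static int parse_libc_version(const char *s, int *major, int *minor)
{
    assert(s != NULL && major != NULL && minor != NULL);
    return sscanf(s, "%d.%d", major, minor) == 2;
}

/* Hypothetical helper: choose the strftime() specifier that yields a
   standalone (nominative) month name for the given glibc version. */
static const char *standalone_month_format(const char *ver_string)
{
    int major, minor;

    if (!parse_libc_version(ver_string, &major, &minor))
        return "%B";   /* unknown version: keep the pre-patch behaviour */

    /* Assumed cutoff: 2.24, as in the example above. */
    if (major > 2 || (major == 2 && minor >= 24))
        return "%OB";

    return "%B";
}
```

A caller would pass the result of gnu_get_libc_version() (declared in
<gnu/libc-version.h>) as ver_string and hand the returned specifier to
strftime(). Falling back to "%B" when the string cannot be parsed is a
design choice of this sketch; it keeps the behaviour programs have
today.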
I'm open to any other solution which leads to correct results, is
backward compatible, portable, and (preferably but not obligatorily)
simple.

Regards,

Rafal Luzynski

Links:
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=10871#c7
[2] https://www.freebsd.org/cgi/man.cgi?query=strftime&sektion=3
[3] http://austingroupbugs.net/view.php?id=258