From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 24261 invoked by alias); 28 Oct 2015 02:57:12 -0000
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: glibc-bugs-owner@sourceware.org
Received: (qmail 24204 invoked by uid 48); 28 Oct 2015 02:57:08 -0000
From: "digitalfreak at lingonborough dot com"
To: glibc-bugs@sourceware.org
Subject: [Bug localedata/10871] ru_RU: 'mon' array should contain both nominative and genitive cases
Date: Wed, 28 Oct 2015 02:57:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Version: unspecified
X-Bugzilla-Keywords:
X-Bugzilla-Severity: critical
X-Bugzilla-Who: digitalfreak at lingonborough dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: libc-locales at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: security-
X-Bugzilla-Changed-Fields: cc
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-10/txt/msg00314.txt.bz2

https://sourceware.org/bugzilla/show_bug.cgi?id=10871

Rafal Luzynski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |digitalfreak@lingonborough.
                   |                            |com

--- Comment #7 from Rafal Luzynski ---
I'll be happy to provide a complete solution for this problem, but some API
design questions must be answered first.

Please note that CLDR mentions only a "standalone" form of the month name,
which is probably always nominative, and a "format" form, which may be the
same as "standalone" (e.g., in English) but may be genitive in some languages,
or yet another case in others.
For simplicity I will refer to these forms as nominative/genitive, keeping in
mind that CLDR refers to them as standalone/format. There may also be languages
which use forms other than nominative/genitive, but I think there are probably
always at most two forms, since CLDR has decided to consider only two.

I. strftime() - http://linux.die.net/man/3/strftime

This function supports only one conversion specifier which provides the full
month name: %B. At the moment there is no way for this function to provide
multiple forms of the full month name. Here are the API designs which would
provide the full month name in the proper form:

1. Do not change the API; implement an internal algorithm which would analyze
the full format string and determine whether %B should format the month name in
the nominative or the genitive case. The simplest algorithm would check whether
the %d or %e conversion specifiers are also present in the same format string,
and retrieve the genitive case if they are, the nominative otherwise. A more
advanced version could check:
- whether the day and month conversion specifiers are adjacent,
- whether they are separated by other conversion specifiers or by
  space/punctuation/other characters,
- whether there are other letters concatenated with %B (which would mean that
  the caller already tries to provide a workaround for this bug),
- whether the day/month order is correct (this check is valid only if
  day/month is the correct order and month/day the incorrect one in all these
  languages).

Pros:
- once implemented correctly it will automagically fix all affected
  applications,
- even if the implementation is not perfect for some languages, the result
  will not be worse than the current one: it will not break any currently
  correct application,
- if it turns out that this solution is completely wrong, it will be easy to
  revert it and provide another one because we don't change the API.
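The "simplest algorithm" from option 1 could look roughly like the sketch
below. This is only an illustration, not glibc code: the function name
format_has_day_of_month() is hypothetical, and the set of flag characters it
skips is an assumption based on the glibc strftime() extensions.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of glibc.  Returns true if FMT contains
   a day-of-month conversion (%d or %e).  Under the simplest heuristic
   from option 1, %B would then use the genitive month name.  */
bool
format_has_day_of_month (const char *fmt)
{
  for (const char *p = fmt; *p != '\0'; ++p)
    {
      if (*p != '%')
        continue;
      ++p;
      /* Skip flag characters (assumed: the glibc extensions).  */
      while (*p == '_' || *p == '-' || *p == '0' || *p == '^' || *p == '#')
        ++p;
      /* Skip an optional field width.  */
      while (*p >= '0' && *p <= '9')
        ++p;
      /* Skip the E and O modifiers.  */
      if (*p == 'E' || *p == 'O')
        ++p;
      if (*p == '\0')
        break;                  /* Truncated conversion at end of string.  */
      if (*p == 'd' || *p == 'e')
        return true;
      /* Any other conversion, including "%%", is consumed here.  */
    }
  return false;
}
```

With this sketch, "%d %B %Y" and "%-e %B" would select the genitive case,
while "%B %Y" would keep the nominative. None of the more advanced checks
(adjacency, ordering, caller workarounds) are attempted here.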
Cons:
- may be difficult to implement,
- it is questionable whether a perfect algorithm exists for all affected
  languages; even if we check it for all languages mentioned in comment 6,
  there may be other languages which we don't know about and which also
  require the nominative/genitive case but use different rules,
- it is questionable how to handle format strings which are incorrect from a
  grammatical point of view: please note that the strftime() API does not and
  should not say that there are illegal combinations of conversion
  specifiers.

2. Follow the specification already used in the *BSD family (which also
includes OS X and iOS):
https://www.freebsd.org/cgi/man.cgi?query=strftime&sektion=3. They implement
the %OB conversion specifier, which retrieves the nominative case, while the
%B specifier retrieves the genitive case (sic!)

Pros:
- full portability between glibc and *BSD,
- simple and deterministic implementation,
- full programmer's control over whether they want the nominative or the
  genitive case,
- will automagically fix all dates which use the %B conversion specifier and
  display the nominative case where it is incorrect (full dates).

Cons:
- at the same time it will break formatting of all dates which use the %B
  conversion specifier where the nominative case is required and is correctly
  provided now; the application developer may not even be aware that the
  application became broken in some languages,
- therefore it will require urgent intervention from some application
  developers,
- it will be difficult or even impossible to provide a backward compatible
  solution which would detect whether the current runtime version of glibc
  requires %OB or %B for the month name in the nominative case,
- one may question whether the *BSD decision to retrieve the genitive case
  from %B is correct, since it causes so much trouble.

3. Mimic the *BSD specification but implement it conversely: let %B retrieve
the nominative case (as it currently does) and let the new %OB specifier
retrieve the genitive case.
See also: http://austingroupbugs.net/view.php?id=258 - it seems this solution
has been accepted there.

Pros:
- simple and deterministic implementation,
- full programmer's control over whether they want the nominative or the
  genitive case,
- full backward compatibility,
- will not break any existing application.

Cons:
- portability with the *BSD family will never be possible (format specifiers
  war),
- will require intervention from the application developers, but it will not
  be urgent because it will apply only to the cases where they use %B
  explicitly and this is already incorrect.

I would choose the first solution: not to change the API and try to provide a
smart algorithm which would determine whether the month name retrieved by %B
should be nominative or genitive, but I will listen to your opinion.

II. nl_langinfo() - http://linux.die.net/man/3/nl_langinfo

Although strftime() does not call nl_langinfo() directly, both these functions
use the same backend database. We will need new constants to be defined in
langinfo.h, for example ALTMON_{1-12} and their wide-character equivalents
_NL_WALTMON_{1-12}. This means it will affect the API of nl_langinfo() by
adding new valid argument values. Please note that I am talking in the context
of https://bugzilla.gnome.org/show_bug.cgi?id=749206, and the implementation of
g_date_time_printf() does call nl_langinfo() to retrieve the month names. I
hope it is valid to add these new symbols after _NL_TIME_CODESET and name them
ALTMON_{1-12} and _NL_WALTMON_{1-12}.

-- 
You are receiving this mail because:
You are on the CC list for the bug.