From: Brian Inglis
Organization: Systematic Software
Reply-To: newlib@sourceware.org
To: newlib@sourceware.org
Subject: Re: Support non-POSIX TZ strings
Date: Mon, 14 Feb 2022 10:10:23 -0700
Message-ID: <758cfb47-ac13-fb88-877e-63a1d4327429@SystematicSw.ab.ca>

On 2022-02-14 06:21, jdoubleu wrote:
> Hello,
>
> I stumbled upon an issue with some TZ strings not handled as expected by newlib's tzset() function.
> The tzset function expects the string stored in the TZ environment variable to follow the POSIX format as described here: https://sourceware.org/newlib/libc.html#tzset (or https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html ).
>
> However, the glibc implementation extends the format and additionally allows '<[+|-]hh[:mm[:ss]]>' in the format (compare https://www.man7.org/linux/man-pages/man3/tzset.3.html ). It seems like the timezone database (zoneinfo) provided by the IANA (https://www.iana.org/time-zones ) adopted that format; or at least the zic compiler generates these strings in the zoneinfo files for most systems.
>
> That leads to the timezone for "America/Argentina/Buenos_Aires" being "<-03>3", as can be seen in this dump https://raw.githubusercontent.com/nayarsystems/posix_tz_db/master/zones.csv or on a Linux system: `tail -n 1 /usr/share/zoneinfo/America/Argentina/Buenos_Aires`.
>
> Some more background information can be found here https://github.com/esp8266/Arduino/issues/8423 and here https://github.com/esp8266/Arduino/issues/7690 .
>
> One way to approach this is for the user to just replace the incompatible part of the string with a valid timezone identifier, as proposed by https://github.com/esp8266/Arduino/pull/7699 .
> Since the timezone identifier (e.g. `PST`, `PDT`, `CET`, …) is not really used elsewhere by newlib, this should not be a problem, as far as I can imagine.
>
> On the other hand, some ports implemented a proper parsing: https://github.com/earlephilhower/newlib-xtensa/pull/14 .
>
> Now my question is whether the extended format should be supported by newlib? Is this desired behaviour and would you accept code contributions for that matter?

Not sure what point you are trying to make, and your terminology is non-standard, but we should start with the actual POSIX spec under TZ:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03

and the current implementation:

https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c

which does not handle the "<" ">" quoted POSIX +/-numeric time zone *abbreviations* now common in the TZ database.

The BSD or TZcode implementations could probably be adapted to update newlib tzset and avoid reinvention, e.g.

https://github.com/eggert/tz/blob/main/newtzset.3
https://github.com/eggert/tz/blob/main/localtime.c#L1081 thru
https://github.com/eggert/tz/blob/main/localtime.c#L1400

[The original (American) English language time zone abbreviations were often made up by the (American) TZ database maintainers and mailing list users, and were never used or published in the locale, only by (American) English speakers and mailing list users (e.g. Germany used German language time zone abbreviations like MEZ/MESZ, not MET, and similarly for other European countries; see CLDR time zone abbreviations).
These made-up (American) English language time zone abbreviations were tracked down and replaced by the current TZ database maintainers after the POSIX spec was expanded, but none are considered canonical, and CLDR locale time zone abbreviations, as supported by ICU, are preferred (see announcements on the home page https://unicode.org/).
ICU4X (https://github.com/unicode-org/icu4x) is being developed to support "resource constrained" environments, but as its language bindings include Rust, Objective C, and C++, whether it will be usable with embedded libraries such as newlib, musl, uclibc, dietlibc, or picolibc might best be ascertained by starting a discussion, as encouraged on the project site.]

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains too much technical detail.
Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
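
To make the "<" ">" quoted form concrete, here is a minimal C sketch of parsing the leading std abbreviation and offset of a POSIX TZ string such as "<-03>3". It is only an illustration under stated assumptions, not the newlib or TZcode code: the function names (parse_tz_abbrev, parse_tz_offset, parse_tz_std) are hypothetical, the caller chooses the abbreviation buffer size, and the DST name, DST offset, and transition rules are left unparsed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* All names below are hypothetical; none of this is newlib or tzcode API. */

/* Parse a std/dst abbreviation: either unquoted alphabetic characters, or a
 * "<...>" quoted run of alphanumerics, '+' and '-'. At least 3 characters
 * are required. Returns a pointer just past the abbreviation, or NULL. */
static const char *parse_tz_abbrev(const char *s, char *out, size_t outsz)
{
    size_t n = 0;

    if (*s == '<') {
        for (++s; *s != '\0' && *s != '>'; ++s) {
            unsigned char c = (unsigned char)*s;
            if (!isalnum(c) && c != '+' && c != '-')
                return NULL;            /* character not allowed in quotes */
            if (n + 1 >= outsz)
                return NULL;            /* caller's buffer is too small */
            out[n++] = (char)c;
        }
        if (*s != '>')
            return NULL;                /* missing closing '>' */
        ++s;
    } else {
        for (; isalpha((unsigned char)*s); ++s) {
            if (n + 1 >= outsz)
                return NULL;
            out[n++] = *s;
        }
    }
    if (n < 3)
        return NULL;                    /* too short (also catches outsz 0) */
    out[n] = '\0';
    return s;
}

/* Parse [+|-]hh[:mm[:ss]] into seconds; positive means west of Greenwich.
 * Returns a pointer just past the offset, or NULL on error. */
static const char *parse_tz_offset(const char *s, long *secs)
{
    long part[3] = {0, 0, 0};
    int sign = 1;
    int i;

    if (*s == '+' || *s == '-') {
        if (*s == '-')
            sign = -1;
        ++s;
    }
    for (i = 0; i < 3; ++i) {
        int digits = 0;
        while (digits < 2 && isdigit((unsigned char)*s)) {
            part[i] = part[i] * 10 + (*s - '0');
            ++s;
            ++digits;
        }
        if (digits == 0)
            return NULL;                /* a field must have 1 or 2 digits */
        if (i == 2 || *s != ':')
            break;
        ++s;                            /* skip ':' before the next field */
    }
    if (part[0] > 24 || part[1] > 59 || part[2] > 59)
        return NULL;
    *secs = sign * (part[0] * 3600L + part[1] * 60L + part[2]);
    return s;
}

/* Parse "std offset" from the start of a TZ string; returns the unparsed
 * remainder (DST name, rules), or NULL if the prefix is malformed. */
const char *parse_tz_std(const char *tz, char *abbrev, size_t abbrevsz,
                         long *west_secs)
{
    const char *p;

    assert(tz != NULL && abbrev != NULL && west_secs != NULL);
    p = parse_tz_abbrev(tz, abbrev, abbrevsz);
    return p != NULL ? parse_tz_offset(p, west_secs) : NULL;
}
```

Called on "<-03>3" with an 8-byte buffer, parse_tz_std stores the abbreviation "-03" and an offset of 10800 seconds west of UTC, since POSIX offsets are positive west of Greenwich. An unquoted string such as "PST8PDT" goes through the same path and returns a pointer to the remaining "PDT".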