From: Brian Inglis
Organization: Systematic Software
Date: Thu, 14 Apr 2022 10:31:26 -0600
To: newlib@sourceware.org
Reply-To: newlib@sourceware.org
Subject: Re: [PATCH] add tests for tzset(3)
Message-ID: <15976fc5-2260-5fa7-ff4d-48ca8a640d59@SystematicSw.ab.ca>
In-Reply-To: <25cfc7a2-2c66-f9fe-581b-8d3cec5d3bd9@jdoubleu.de>
References: <822e81a0-ed9f-200e-3318-0495456ad67e@SystematicSw.ab.ca>
 <20220407233425.2012-1-Brian.Inglis@SystematicSW.ab.ca>
 <9603e3aa-bd7d-6740-c710-27ace1d80397@SystematicSw.ab.ca>
 <25cfc7a2-2c66-f9fe-581b-8d3cec5d3bd9@jdoubleu.de>
List-Id: Newlib mailing list

I am still not hearing from where the requirement originates to set
UTC/GMT/etc or do anything other than leave everything as is.
Is this glibc behaviour, and why not /etc/localtime or /etc/timezone?

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]

On 2022-04-14 02:59, jdoubleu wrote:
> On 2022-04-13 14:33, Jeff Johnston wrote:
>> Looking at the glibc tzset code I have locally (not latest/greatest, but
>> does support angle brackets):
>
> I can confirm the behavior with glibc[1]. As it turns out, glibc does
> not directly impose a character limit on the timezone name, but requires
> at least 3 characters. From the man page[2]:
>
>> The std string specifies an abbreviation for the timezone and must be
>> three or more alphabetic characters.
>
> To my misunderstanding, they don't even ignore remaining characters, but
> keep all of them, as you can see in the output[1] and Jeff Johnston
> explained.
>
>> but you imply that glibc in fact uses the equivalent of the scanf
>> "%m[...]" (malloc) modifier, and I think using that would be against
>> the newlib philosophy to keep things limited and under control to
>> support small targets.
>
> I agree, newlib SHOULD impose a limit. Especially, since the POSIX
> standard[3] already introduces an upper limit, though unspecified.
>
> The current limit is 11 characters, if I'm not mistaken. The longest
> name from the tzdb[4] is "<+1030>" i.e. 5 chars (see all extracted
> names[5]). All others usually are 3 or 4 chars long.
>
> That said, I think 11 is reasonably large enough.
>
> However, it could be helpful to get the limit from user-code, because
> there is no error reporting mechanism used. Right now, the limit is only
> defined in tzset_r.c[6]. So maybe move it to limits.h? One thing to not
> forget here is to keep limit in sync with the sscanf format's maximum
> field width[7].
>
>
> To summarize, the following cases are errors:
> 1. name is too short (less than 3 chars)
> 2. name is too long (more than TZNAME_MAX)
> 3. name includes arbitrary chars (not <>+-ALPHANUM)
> In all of these error cases, the time should be set back to UTC, right?
>
>
> I'm going to prepare some test cases for the test suite to check for the
> errors as well.
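
For illustration, the three error cases summarized above could be checked
along these lines. This is only a sketch under stated assumptions, not the
code in tzset_r.c: the names sketch_check_tzname, SKETCH_TZNAME_MIN and
SKETCH_TZNAME_MAX are hypothetical, the maximum of 10 is an assumption
taken from the limits mentioned in this thread, and what the caller does
on failure (fall back to UTC or leave the zone unchanged) is left open.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical limits for this sketch; the real limit lives in tzset_r.c. */
#define SKETCH_TZNAME_MIN 3
#define SKETCH_TZNAME_MAX 10  /* assumption, see the discussion in the thread */

enum sketch_tz_status
{
  SKETCH_TZ_OK,
  SKETCH_TZ_TOO_SHORT,   /* case 1: fewer than SKETCH_TZNAME_MIN chars */
  SKETCH_TZ_TOO_LONG,    /* case 2: more than SKETCH_TZNAME_MAX chars */
  SKETCH_TZ_BAD_CHAR     /* case 3: char not allowed inside <...> */
};

/* Check the std/dst name at the start of a TZ string.
   On success, *consumed (if non-NULL) receives the number of characters
   used, including the angle brackets of a quoted name. */
static enum sketch_tz_status
sketch_check_tzname (const char *s, size_t *consumed)
{
  const char *p = s;
  size_t n = 0;                 /* length of the name itself */
  int quoted = (*p == '<');

  if (quoted)
    p++;
  for (;; p++)
    {
      unsigned char c = (unsigned char) *p;

      if (quoted)
        {
          if (c == '>')
            {
              p++;              /* consume the closing bracket */
              break;
            }
          /* Also rejects '\0', i.e. a missing closing bracket. */
          if (!(isalnum (c) || c == '+' || c == '-'))
            return SKETCH_TZ_BAD_CHAR;
        }
      else if (!isalpha (c))
        break;                  /* unquoted name ends at first non-letter */
      n++;
    }

  if (n < SKETCH_TZNAME_MIN)
    return SKETCH_TZ_TOO_SHORT;
  if (n > SKETCH_TZNAME_MAX)
    return SKETCH_TZ_TOO_LONG;
  if (consumed != NULL)
    *consumed = (size_t) (p - s);
  return SKETCH_TZ_OK;
}
```

With this sketch, "EST5EDT" is accepted with 3 characters consumed and
"<+1030>-10:30" with 7, while "AB3" is too short, a 12-letter name is too
long, and "<A B>" is rejected for the embedded space. An unquoted name
simply ends at the first non-letter, so a stray character there shows up
as a short name or as a failure in the offset parse that follows.
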
>
>
> [1]: https://godbolt.org/z/o93zo3qxv
> [2]: https://www.man7.org/linux/man-pages/man3/tzset.3.html
> [3]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
> [4]: https://github.com/eggert/tz
> [5]: https://raw.githubusercontent.com/nayarsystems/posix_tz_db/master/zones.csv
> [6]: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l13
> [7]: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57
>
> Cheers
> ---
> 🙎🏻‍♂️ jdoubleu
> On 4/14/2022 12:19 AM, Brian Inglis wrote:
>> On 2022-04-13 14:33, Jeff Johnston wrote:
>>> Looking at the glibc tzset code I have locally (not latest/greatest, but
>>> does support angle brackets):
>>>
>>> If there are any parse failures, UTC is defaulted.
>>
>> We currently leave the time zone info unchanged.
>>
>>> Extraneous characters inside brackets or less than 3 characters is a
>>> parse failure.
>> ✔ Check    ✔ Check
>>
>>> Glibc parses the tz name string char by char and allocates space for
>>> the name strings so there is no max size.
>>
>> The suggestion was that glibc ignores the remaining characters, but
>> you imply that glibc in fact uses the equivalent of the scanf
>> "%m[...]" (malloc) modifier, and I think using that would be against
>> the newlib philosophy to keep things limited and under control to
>> support small targets.  Larger targets like Cygwin (do our own thing
>> including zoneinfo), and perhaps RTEMS, can supply their own
>> enhancements.
>>
>>> the name strings so there is no max size.  I am fine if you want to
>>> mandate a maximum, but if you do, then too many chars should be
>>> treated as a failure.  If you aren't certain of the limit, make the
>>> limit higher than you expect.
>> Current limits are 3-10 allowing for e.g. which is the
>> most ever likely to be used.
>> It might be reasonable to bump it up to say 15.
>>
>>> If people run into max limit with reasonable timezone format strings,
>>> then we can up the limit.
>>
>> The conditions are more or less what is implemented, but we could do
>> with a couple more tweaks to improve things, like check for more or
>> extraneous chars within the bracket quotes, and that no characters
>> remain unconsumed at the end of the parse.
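
As a rough sketch of the tweaks mentioned above, a bounded sscanf scanset
plus %n can reject extraneous characters inside the bracket quotes and
detect unconsumed input, while the field width is generated from the same
macro as the buffer size so the two cannot drift apart. Everything named
SKETCH_* and the function sketch_parse_quoted_name are hypothetical, the
limit of 10 is an assumption, and this is not the format string currently
used in tzset_r.c. For brevity it parses a string that holds only a quoted
name; in a full parser the end-of-input check would come after the last
field.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_NAME_MIN 3
#define SKETCH_NAME_MAX 10   /* assumption; one place to change the limit */

/* Two-step stringification so the macro value, not its name, is pasted
   into the format string as the maximum field width. */
#define SKETCH_STR_(x) #x
#define SKETCH_STR(x) SKETCH_STR_(x)

/* Return 1 if tz is exactly "<name>" with a valid name, else 0.
   name must have room for SKETCH_NAME_MAX + 1 bytes. */
static int
sketch_parse_quoted_name (const char *tz, char *name)
{
  int n = 0;   /* stays 0 unless the closing '>' was matched */

  /* Expands to "<%10[-+0-9A-Za-z]>%n". Ranges inside a scanset are
     implementation-defined in ISO C, though commonly supported. */
  if (sscanf (tz, "<%" SKETCH_STR(SKETCH_NAME_MAX) "[-+0-9A-Za-z]>%n",
              name, &n) != 1)
    return 0;                 /* no '<' or empty name */
  if (n == 0)
    return 0;                 /* bad char in brackets, or name too long */
  if (strlen (name) < SKETCH_NAME_MIN)
    return 0;                 /* name too short */
  return tz[n] == '\0';       /* fail if anything is left unconsumed */
}
```

Here "<+1030>" parses to the name "+1030", while "<+10 30>", "<+1030>x",
"<AB>" and a 12-character name in brackets are all rejected. A name that
exceeds the field width fails because the scanset stops at the limit and
the literal '>' then does not match, which treats too many characters as
a failure rather than silently truncating.
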