From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from omta001.cacentral1.a.cloudfilter.net (omta001.cacentral1.a.cloudfilter.net [3.97.99.32]) by sourceware.org (Postfix) with ESMTPS id 0ABAF3858C2C for ; Wed, 13 Apr 2022 17:53:29 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0ABAF3858C2C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=SystematicSw.ab.ca Authentication-Results: sourceware.org; spf=none smtp.mailfrom=systematicsw.ab.ca Received: from shw-obgw-4004a.ext.cloudfilter.net ([10.228.9.227]) by cmsmtp with ESMTP id eg5mnlRaY43SgehBInwXA9; Wed, 13 Apr 2022 17:53:28 +0000 Received: from [10.0.0.5] ([184.64.124.72]) by cmsmtp with ESMTP id ehBHnJxaGd7RfehBHnDC3P; Wed, 13 Apr 2022 17:53:28 +0000 X-Authority-Analysis: v=2.4 cv=XrLphHJ9 c=1 sm=1 tr=0 ts=62570e18 a=oHm12aVswOWz6TMtn9zYKg==:117 a=oHm12aVswOWz6TMtn9zYKg==:17 a=IkcTkHD0fZMA:10 a=GcyzOjIWAAAA:8 a=CCpqsmhAAAAA:8 a=uZvujYp8AAAA:8 a=bFa4T7R0XJYlUjNNVqgA:9 a=QEXdDO2ut3YA:10 a=JWtY7oSOxG8A:10 a=O-xZXXFxz3YA:10 a=7ipR1gm3pxYA:10 a=AomDDAB-fV4A:10 a=hQL3dl6oAZ8NdCsdz28n:22 a=ul9cdbp4aOFLsgKbc677:22 a=SLzB8X_8jTLwj6mN0q5r:22 Message-ID: <9603e3aa-bd7d-6740-c710-27ace1d80397@SystematicSw.ab.ca> Date: Wed, 13 Apr 2022 11:53:27 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0 Reply-To: Newlib Subject: Re: [PATCH] add tests for tzset(3) Content-Language: en-CA From: Brian Inglis To: Newlib References: <822e81a0-ed9f-200e-3318-0495456ad67e@SystematicSw.ab.ca> <20220407233425.2012-1-Brian.Inglis@SystematicSW.ab.ca> Organization: Systematic Software In-Reply-To: <20220407233425.2012-1-Brian.Inglis@SystematicSW.ab.ca> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-CMAE-Envelope: MS4xfJfq5OSU9UZwNOfwme0ixz33SD3lVFrMHbid52xQ1tpXtHgik8qHFMZfC4poxu6bDrSl1YMyezoRk4kjG+1eZQPASFgN3u3XaTlwmeNdrcIuZvCVA9be 1R6fGvcEN6Ff6eSPD5kCN/+Ll8uOcX3eJt2iWhRiPkqt8AuWNYfnaxZ72y8RwD5T5dNzLG6Ga4EH0TP3tBKfEW0gN8Fn+s6Hp7c= X-Spam-Status: No, score=-1163.1 required=5.0 tests=BAYES_00, BODY_8BITS, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, NICE_REPLY_A, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: newlib@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Newlib mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Apr 2022 17:53:31 -0000 On 2022-04-12 12:33, Brian Inglis wrote: > On 2022-04-12 05:19, jdoubleu wrote: >> On 4/11/2022 7:27 PM, Dimitar Dimitrov wrote: >>> On Mon, Apr 11, 2022 at 01:17:16PM +0200, jdoubleu wrote: >>>> looks like I'm running the testsuite against glibc and not newlib >>>> (for target x86_64-pc-linux-gnu). I'm not even sure whether >>>> there's a backend for linux. >>>> I'm currently trying to run only the tzset code against the test >>>> vectors (like Brian Inglis did[1]). >>>> At least it show, that the newlib implementation differs from glibc. >>>> Maybe the test case is flawed and it should indeed fail. > >>>>> 6:20:12 is the timezone string of the previous test case, whose tzset >>>>> call was successful. Looking at the current code, this is expected >>>>> behaviour. > >>>> Okay. Looks like the condition[2] fails. The question is, which >>>> part of it does? >>> I believe it is the ('>' != tzenv[n]) check that fails, because the >>> maximum parsed string limit of 10 has been reached and n=10. > >>>> I've appended a patch, which prints all variables when the condition >>>> fails. >>>> Could you please apply the patch and then recompile and re-run the >>>> tests >>>> again? > >>> Here is the result for arm-none-eabi: >>> parsing name: tzenv="+0123456789ABCDEF>3:33:33", res=1, n=10, >>> tzenv[n] = 9 >>> Assertion failed! Expected 1647906533 to equal 1647916532. winter >>> time, timezone = "<+0123456789ABCDEF>3:33:33" >>> I assume you no longer need assembly output from compiler? > >>>> I've previously noticed something with the sscanf format[3]. >>>>> Perhaps TZ should be reset to UTC before the bail out? > >>>> I don't think the implementation should fall back to UTC >>>> whenever parsing failed. It apparently doesn't in glibc. I'm not >>>> sure if the behavior is specified somewhere. > >>> The tzset manual page says that UTC is used if TZ cannot be parsed: >>> https://man7.org/linux/man-pages/man3/tzset.3.html > >>>> Maybe resetting it before each test case is a good idea, though. >>>> That makes it clearer, why the test case failed. >>>>> With that chunk removed, as shown below: >>>>>      {"3:33:33",               IN_SECONDS(3, 33, 33), >>>>> NO_TIME},                 // truncates the name (17 + 1) >>>>> I still get: >>>>>    Assertion failed! Expected 1647906533 to equal 1647916532. >>>>> winter time, timezone = "3:33:33" > >>>> My bad; "3:33:33" isn't a valid timezone string. It has to be >>>> prefixed by a >>>> name e.g. "MESZ" or "<+00>", as you tried. That explains why it is also >>>> failing. > >>>> [1]: https://sourceware.org/pipermail/newlib/2022/019529.html >>>> [2]: >>>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57 >>>> >>>> [3]: https://sourceware.org/pipermail/newlib/2022/019535.html > >>> I believe it is the ('>' != tzenv[n]) check that fails, because the >>> maximum parsed string limit of 10 has been reached and n=10. > >> Okay, so the newlib implementation actually fails, when the name is >> too long. >> The POSIX standard leaves it up to the implementation how to handle >> names[1]: > >>> The interpretation of these fields is unspecified if either field is >>> less than three bytes (except for the case when dst is missing), more >>> than {TZNAME_MAX} bytes, or if they contain characters other than >>> those specified. > >> I'm not sure where to go with this. In both cases, the current >> implementation needs a fix. Looping in Brian Inglis, the writer of the >> current implementation and Jeff Johnston, a maintainer. >> As I see it, there are two solutions: >> 1. Treat long names as an error case. The failed condition[2] needs to >> apply UTC as current time, like Dimitar Dimitrov previously suggested. >> The test needs to be changed to check that it actually fails. >> 2. Mimic glibc's behavior and ignore the remaining characters from the >> name. In that case, the condition[2] and following code needs to be >> updated as well. >> I am happy to provide a patch, when we agreed on one solution. > >>> I assume you no longer need assembly output from compiler? > >> No, thanks. We already found the issue. It would only be helpful, if >> different configurations generate different code (through macros, >> optimization, etc.). >> [1]:https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03 >> [2]:https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;hb=HEAD#l57 > Does anyone know what glibc or the BSDs do on short < 3 char > abbreviations, or the BSDs do on > LIMIT abbreviations? > What do they do on errors - nothing or revert to UTC? > > Should tzset_r allow 1...2 char abbreviations or treat them as errors? > > Should tzset_r allow > 10 char abbreviations, ignore and skip extra > chars, or treat them as errors? Should tzset_r accept characters within quotes not in the allowed character set, ignore and skip them, or reject such abbreviations? > If it ignores and skips extra or accepts too few characters, then > presumably it should process the remaining fields if present? > > POSIX tzset_r says only that it overrides the default time zone, so is > there a spec for the default time zone, or is it implementation defined > (e.g. /etc/timezone /etc/localtime)? > > Should tzset_r revert to UTC on errors, or just not change the TZ, and > leave it up to the application to handle the error? -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada This email may be disturbing to some readers as it contains too much technical detail. Reader discretion is advised. [Data in binary units and prefixes, physical quantities in SI.]