From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <brian.inglis@systematicsw.ab.ca>
Received: from omta001.cacentral1.a.cloudfilter.net
 (omta001.cacentral1.a.cloudfilter.net [3.97.99.32])
 by sourceware.org (Postfix) with ESMTPS id 0ABAF3858C2C
 for <newlib@sourceware.org>; Wed, 13 Apr 2022 17:53:29 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0ABAF3858C2C
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none)
 header.from=SystematicSw.ab.ca
Authentication-Results: sourceware.org;
 spf=none smtp.mailfrom=systematicsw.ab.ca
Received: from shw-obgw-4004a.ext.cloudfilter.net ([10.228.9.227])
 by cmsmtp with ESMTP
 id eg5mnlRaY43SgehBInwXA9; Wed, 13 Apr 2022 17:53:28 +0000
Received: from [10.0.0.5] ([184.64.124.72]) by cmsmtp with ESMTP
 id ehBHnJxaGd7RfehBHnDC3P; Wed, 13 Apr 2022 17:53:28 +0000
X-Authority-Analysis: v=2.4 cv=XrLphHJ9 c=1 sm=1 tr=0 ts=62570e18
 a=oHm12aVswOWz6TMtn9zYKg==:117 a=oHm12aVswOWz6TMtn9zYKg==:17
 a=IkcTkHD0fZMA:10 a=GcyzOjIWAAAA:8 a=CCpqsmhAAAAA:8 a=uZvujYp8AAAA:8
 a=bFa4T7R0XJYlUjNNVqgA:9 a=QEXdDO2ut3YA:10 a=JWtY7oSOxG8A:10
 a=O-xZXXFxz3YA:10 a=7ipR1gm3pxYA:10 a=AomDDAB-fV4A:10
 a=hQL3dl6oAZ8NdCsdz28n:22 a=ul9cdbp4aOFLsgKbc677:22 a=SLzB8X_8jTLwj6mN0q5r:22
Message-ID: <9603e3aa-bd7d-6740-c710-27ace1d80397@SystematicSw.ab.ca>
Date: Wed, 13 Apr 2022 11:53:27 -0600
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
 Thunderbird/91.8.0
Reply-To: Newlib <newlib@sourceware.org>
Subject: Re: [PATCH] add tests for tzset(3)
Content-Language: en-CA
From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca>
To: Newlib <newlib@sourceware.org>
References: <822e81a0-ed9f-200e-3318-0495456ad67e@SystematicSw.ab.ca>
 <20220407233425.2012-1-Brian.Inglis@SystematicSW.ab.ca>
Organization: Systematic Software
In-Reply-To: <20220407233425.2012-1-Brian.Inglis@SystematicSW.ab.ca>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-CMAE-Envelope: MS4xfJfq5OSU9UZwNOfwme0ixz33SD3lVFrMHbid52xQ1tpXtHgik8qHFMZfC4poxu6bDrSl1YMyezoRk4kjG+1eZQPASFgN3u3XaTlwmeNdrcIuZvCVA9be
 1R6fGvcEN6Ff6eSPD5kCN/+Ll8uOcX3eJt2iWhRiPkqt8AuWNYfnaxZ72y8RwD5T5dNzLG6Ga4EH0TP3tBKfEW0gN8Fn+s6Hp7c=
X-Spam-Status: No, score=-1163.1 required=5.0 tests=BAYES_00, BODY_8BITS,
 KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, NICE_REPLY_A,
 RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: newlib@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Newlib mailing list <newlib.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/newlib>,
 <mailto:newlib-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/newlib/>
List-Post: <mailto:newlib@sourceware.org>
List-Help: <mailto:newlib-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/newlib>,
 <mailto:newlib-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Wed, 13 Apr 2022 17:53:31 -0000

On 2022-04-12 12:33, Brian Inglis wrote:
> On 2022-04-12 05:19, jdoubleu wrote:
>> On 4/11/2022 7:27 PM, Dimitar Dimitrov wrote:
>>> On Mon, Apr 11, 2022 at 01:17:16PM +0200, jdoubleu wrote:
>>>> looks like I'm running the testsuite against glibc and not newlib
>>>> (for target x86_64-pc-linux-gnu). I'm not even sure whether
>>>> there's a backend for linux.
>>>> I'm currently trying to run only the tzset code against the test
>>>> vectors (like Brian Inglis did[1]).
>>>> At least it show, that the newlib implementation differs from glibc. 
>>>> Maybe the test case is flawed and it should indeed fail.
> 
>>>>> 6:20:12 is the timezone string of the previous test case, whose tzset
>>>>> call was successful. Looking at the current code, this is expected
>>>>> behaviour.
> 
>>>> Okay. Looks like the condition[2] fails. The question is, which
>>>> part of it does?
>>> I believe it is the ('>' != tzenv[n]) check that fails, because the
>>> maximum parsed string limit of 10 has been reached and n=10.
> 
>>>> I've appended a patch, which prints all variables when the condition 
>>>> fails.
>>>> Could you please apply the patch and then recompile and re-run the 
>>>> tests
>>>> again?
> 
>>> Here is the result for arm-none-eabi:
>>> parsing name: tzenv="+0123456789ABCDEF>3:33:33", res=1, n=10, 
>>> tzenv[n] = 9
>>> Assertion failed! Expected 1647906533 to equal 1647916532. winter 
>>> time, timezone = "<+0123456789ABCDEF>3:33:33"
>>> I assume you no longer need assembly output from compiler?
> 
>>>> I've previously noticed something with the sscanf format[3].
>>>>> Perhaps TZ should be reset to UTC before the bail out?
> 
>>>> I don't think the implementation should fall back to UTC
>>>> whenever parsing failed. It apparently doesn't in glibc. I'm not
>>>> sure if the behavior is specified somewhere.
> 
>>> The tzset manual page says that UTC is used if TZ cannot be parsed:
>>> https://man7.org/linux/man-pages/man3/tzset.3.html
> 
>>>> Maybe resetting it before each test case is a good idea, though.
>>>> That makes it clearer, why the test case failed.
>>>>> With that chunk removed, as shown below:
>>>>>      {"3:33:33",               IN_SECONDS(3, 33, 33), 
>>>>> NO_TIME},                 // truncates the name (17 + 1)
>>>>> I still get:
>>>>>    Assertion failed! Expected 1647906533 to equal 1647916532. 
>>>>> winter time, timezone = "3:33:33"
> 
>>>> My bad; "3:33:33" isn't a valid timezone string. It has to be 
>>>> prefixed by a
>>>> name e.g. "MESZ" or "<+00>", as you tried. That explains why it is also
>>>> failing.
> 
>>>> [1]: https://sourceware.org/pipermail/newlib/2022/019529.html
>>>> [2]: 
>>>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57 
>>>>
>>>> [3]: https://sourceware.org/pipermail/newlib/2022/019535.html
> 
>>> I believe it is the ('>' != tzenv[n]) check that fails, because the
>>> maximum parsed string limit of 10 has been reached and n=10.
> 
>> Okay, so the newlib implementation actually fails, when the name is
>> too long.
>> The POSIX standard leaves it up to the implementation how to handle 
>> names[1]:
> 
>>> The interpretation of these fields is unspecified if either field is 
>>> less than three bytes (except for the case when dst is missing), more 
>>> than {TZNAME_MAX} bytes, or if they contain characters other than 
>>> those specified.
> 
>> I'm not sure where to go with this. In both cases, the current 
>> implementation needs a fix. Looping in Brian Inglis, the writer of the 
>> current implementation and Jeff Johnston, a maintainer.
>> As I see it, there are two solutions:
>> 1. Treat long names as an error case. The failed condition[2] needs to 
>> apply UTC as current time, like Dimitar Dimitrov previously suggested. 
>> The test needs to be changed to check that it actually fails.
>> 2. Mimic glibc's behavior and ignore the remaining characters from the 
>> name. In that case, the condition[2] and following code needs to be 
>> updated as well.
>> I am happy to provide a patch, when we agreed on one solution.
> 
>>> I assume you no longer need assembly output from compiler?
> 
>> No, thanks. We already found the issue. It would only be helpful, if 
>> different configurations generate different code (through macros, 
>> optimization, etc.).
>> [1]:https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03 
>> [2]:https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;hb=HEAD#l57 

> Does anyone know what glibc or the BSDs do on short < 3 char 
> abbreviations, or the BSDs do on > LIMIT abbreviations?
> What do they do on errors - nothing or revert to UTC?
> 
> Should tzset_r allow 1...2 char abbreviations or treat them as errors?
> 
> Should tzset_r allow > 10 char abbreviations, ignore and skip extra 
> chars, or treat them as errors?

Should tzset_r accept characters within quotes not in the allowed 
character set, ignore and skip them, or reject such abbreviations?

> If it ignores and skips extra or accepts too few characters, then 
> presumably it should process the remaining fields if present?
> 
> POSIX tzset_r says only that it overrides the default time zone, so is 
> there a spec for the default time zone, or is it implementation defined 
> (e.g. /etc/timezone /etc/localtime)?
> 
> Should tzset_r revert to UTC on errors, or just not change the TZ, and 
> leave it up to the application to handle the error?

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]