public inbox for newlib@sourceware.org
 help / color / mirror / Atom feed
From: jdoubleu <hi@jdoubleu.de>
To: newlib@sourceware.org
Subject: Re: [PATCH] add tests for tzset(3)
Date: Thu, 14 Apr 2022 10:59:22 +0200	[thread overview]
Message-ID: <25cfc7a2-2c66-f9fe-581b-8d3cec5d3bd9@jdoubleu.de> (raw)
In-Reply-To: <a2bd00ae-897a-01f9-2fc9-33df0b0ee08b@SystematicSw.ab.ca>

On 2022-04-13 14:33, Jeff Johnston wrote:
> Looking at the glibc tzset code I have locally (not latest/greatest, but
> does support angle brackets): 

I can confirm the behavior with glibc[1]. As it turns out, glibc does 
not directly impose a character limit on the timezone name, but requires 
at least 3 characters. From the man page[2]:

> The std string specifies an abbreviation for the timezone and must be
> three or more alphabetic characters.

To my misunderstanding, they don't even ignore remaining characters, but 
keep all of them, as you can see in the output[1] and Jeff Johnston 
explained.

> but you imply that glibc in fact uses the equivalent of the scanf "%m[...]" (malloc) > modifier, and I think using that would be against the newlib 
philosophy to keep things
> limited and under control to support small targets.

I agree, newlib SHOULD impose a limit. Especially, since the POSIX 
standard[3] already introduces an upper limit, though unspecified.

The current limit is 11 characters, if I'm not mistaken. The longest 
name from the tzdb[4] is "<+1030>" i.e. 5 chars (see all extracted 
names[5]). All others usually are 3 or 4 chars long.

That said, I think 11 is reasonably large enough.

However, it could be helpful to get the limit from user-code, because 
there is no error reporting mechanism used. Right now, the limit is only 
defined in tzset_r.c[6]. So maybe move it to limits.h? One thing to not 
forget here is to keep limit in sync with the sscanf format's maximum 
field width[7].


To summarize, the following cases are errors:
1. name is too short (less than 3 chars)
2. name is too long (more than TZNAME_MAX)
3. name includes arbitrary chars (not <>+-ALPHANUM)
In all of these error cases, the time should be set back to UTC, right?


I'm going to prepare some test cases for the test suite to check for the 
errors as well.


[1]: https://godbolt.org/z/o93zo3qxv
[2]: https://www.man7.org/linux/man-pages/man3/tzset.3.html
[3]: 
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
[4]: https://github.com/eggert/tz
[5]: 
https://raw.githubusercontent.com/nayarsystems/posix_tz_db/master/zones.csv
[6]: 
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l13
[7]: 
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57



Cheers
---
🙎🏻‍♂️ jdoubleu
On 4/14/2022 12:19 AM, Brian Inglis wrote:
> On 2022-04-13 14:33, Jeff Johnston wrote:
>> Looking at the glibc tzset code I have locally (not latest/greatest, but
>> does support angle brackets):
>>
>> If there any parse failures, UTC is defaulted.
> 
> We currently leave the time zone info unchanged.
> 
>> Extraneous characters inside brackets or less than 3 characters is a
>> parse failure.
> ✔ Check    ✔ Check
> 
>> Glibc parses the tz name string char by char and allocates space for
>> the name strings so there is no max size.
> 
> The suggestion was that glibc ignores the remaining characters, but you 
> imply that glibc in fact uses the equivalent of the scanf "%m[...]" 
> (malloc) modifier, and I think using that would be against the newlib 
> philosophy to keep things limited and under control to support small 
> targets.  Larger targets like Cygwin (do our own thing including 
> zoneinfo), and perhaps RTEMS, can supply their own enhancements.
> 
>> the name strings so there is no max size.  I am fine if you want to 
>> mandate a maximum, but if you do, then too many chars should be 
>> treated as a failure.  If you aren't certain of the limit, make the 
>> limit higher than you expect.
> Current limits are 3-10 allowing for e.g. <MESZ+03:30> which is the most 
> ever likely to be used. It might be reasonable to bump it up to say 15.
> 
>> If people run into max limit with reasonable timezone format strings, 
>> then
>> we can up the limit.
> 
> The conditions are more or less what is implemented, but we could do 
> with a couple more tweaks to improve things, like check for more or 
> extraneous chars within the bracket quotes, and that no characters 
> remain unconsumed at the end of the parse.

  reply	other threads:[~2022-04-14  8:59 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-07 15:58 jdoubleu
2022-04-08 21:21 ` Jeff Johnston
2022-04-10  8:43 ` Dimitar Dimitrov
2022-04-10 17:55   ` jdoubleu
2022-04-10 21:00     ` Dimitar Dimitrov
2022-04-11 11:17       ` jdoubleu
2022-04-11 17:27         ` Dimitar Dimitrov
2022-04-12 11:19           ` jdoubleu
2022-04-12 18:33             ` Brian Inglis
2022-04-07 23:34               ` [PATCH v2 0/2] add tzset/_r POSIX angle bracket <> support in TZ env var Brian Inglis
2022-04-07 23:34                 ` [PATCH v2 1/2] newlib/libc/time/tzset.c: doc update POSIX angle bracket <> support Brian Inglis
2022-04-07 23:34                 ` [PATCH v2 2/2] newlib/libc/time/tzset_r.c(_tzset_unlocked_r): " Brian Inglis
2022-04-08 19:11                 ` [PATCH v2 0/2] add tzset/_r POSIX angle bracket <> support in TZ env var Jeff Johnston
2022-04-13 17:53                 ` [PATCH] add tests for tzset(3) Brian Inglis
2022-04-13 20:33                   ` Jeff Johnston
2022-04-13 22:19                     ` Brian Inglis
2022-04-14  8:59                       ` jdoubleu [this message]
2022-04-14 16:31                         ` Brian Inglis
2022-04-14 19:23                           ` Jeff Johnston
2022-04-15 10:10                             ` jdoubleu
2022-04-27 19:30                               ` Jeff Johnston
2022-05-14 14:39                         ` jdoubleu
2022-05-16 16:05                           ` Dimitar Dimitrov
2022-05-16 17:38                             ` Jeff Johnston
2022-05-17  8:45                           ` [PATCH] update tzset tests jdoubleu
2022-05-18 18:48                             ` Dimitar Dimitrov
2022-05-18 20:56                               ` Keith Packard
2022-05-19  8:47                                 ` jdoubleu
2022-05-22  9:51                                   ` [PATCH v2] " jdoubleu
2022-05-22 21:02                                     ` Dimitar Dimitrov
2022-05-27 15:46                                       ` Jeff Johnston
2022-04-13 22:21               ` [PATCH] add tests for tzset(3) Brian Inglis
2022-04-29 15:46 Jeff Johnston
2022-05-12 18:35 ` jdoubleu
2022-05-16 17:49   ` Jeff Johnston

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=25cfc7a2-2c66-f9fe-581b-8d3cec5d3bd9@jdoubleu.de \
    --to=hi@jdoubleu.de \
    --cc=newlib@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).