From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff Johnston
Date: Thu, 14 Apr 2022 15:23:12 -0400
Subject: Re: [PATCH] add tests for tzset(3)
To: Newlib
In-Reply-To: <15976fc5-2260-5fa7-ff4d-48ca8a640d59@SystematicSw.ab.ca>
List-Id: Newlib mailing list

On Thu, Apr 14, 2022 at 12:31 PM Brian Inglis <Brian.Inglis@systematicsw.ab.ca> wrote:

> I am still not hearing from where the requirement originates to set
> UTC/GMT/etc or do anything other than leave everything as is.
> Is this glibc behaviour, and why not /etc/localtime or /etc/timezone?
>

It is glibc behaviour, as I mentioned in my note. The following is also
from man tzset:

    If the TZ variable does not appear in the environment, the tzname
    variable is initialized with the best approximation of local wall
    clock time, as specified by the tzfile(5)-format file localtime
    found in the system timezone directory (see below). (One also often
    sees /etc/localtime used here, a symlink to the right file in the
    system timezone directory.)

    If the TZ variable does appear in the environment but its value is
    empty or its value cannot be interpreted using any of the formats
    specified below, Coordinated Universal Time (UTC) is used.

Note the clause about a value that is specified but cannot be interpreted
using any of the formats specified. If there is an error, then that clause
applies. In glibc's case, the errors are a name of less than 3 chars and
invalid chars. In our case, exceeding the max limit would also apply.

From glibc: tzset.c

  /* Clear out old state and reset to unnamed UTC.  */
  memset (tz_rules, '\0', sizeof tz_rules);
  tz_rules[0].name = tz_rules[1].name = "";

  /* Get the standard timezone name.  */
  if (parse_tzname (&tz, 0) && parse_offset (&tz, 0))

If parse_tzname fails or parsing the dst name fails, unnamed UTC is
used.

-- Jeff J.

> --
> Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
>
> This email may be disturbing to some readers as it contains
> too much technical detail. Reader discretion is advised.
> [Data in binary units and prefixes, physical quantities in SI.]
>
>
> On 2022-04-14 02:59, jdoubleu wrote:
> > On 2022-04-13 14:33, Jeff Johnston wrote:
> >> Looking at the glibc tzset code I have locally (not latest/greatest, but
> >> does support angle brackets):
> >
> > I can confirm the behavior with glibc[1]. As it turns out, glibc does
> > not directly impose a character limit on the timezone name, but requires
> > at least 3 characters. From the man page[2]:
> >
> >> The std string specifies an abbreviation for the timezone and must be
> >> three or more alphabetic characters.
> >
> > To my misunderstanding, they don't even ignore remaining characters, but
> > keep all of them, as you can see in the output[1] and Jeff Johnston
> > explained.
> >
> >> but you imply that glibc in fact uses the equivalent of the scanf
> >> "%m[...]" (malloc) modifier, and I think using that would be against
> >> the newlib philosophy to keep things limited and under control to
> >> support small targets.
> >
> > I agree, newlib SHOULD impose a limit. Especially, since the POSIX
> > standard[3] already introduces an upper limit, though unspecified.
> >
> > The current limit is 11 characters, if I'm not mistaken. The longest
> > name from the tzdb[4] is "<+1030>" i.e. 5 chars (see all extracted
> > names[5]). All others usually are 3 or 4 chars long.
> >
> > That said, I think 11 is reasonably large enough.
> >
> > However, it could be helpful to get the limit from user-code, because
> > there is no error reporting mechanism used. Right now, the limit is only
> > defined in tzset_r.c[6]. So maybe move it to limits.h? One thing to not
> > forget here is to keep limit in sync with the sscanf format's maximum
> > field width[7].
> >
> >
> > To summarize, the following cases are errors:
> > 1. name is too short (less than 3 chars)
> > 2. name is too long (more than TZNAME_MAX)
> > 3. name includes arbitrary chars (not <>+-ALPHANUM)
> > In all of these error cases, the time should be set back to UTC, right?
> >
> >
> > I'm going to prepare some test cases for the test suite to check for the
> > errors as well.
> >
> >
> > [1]: https://godbolt.org/z/o93zo3qxv
> > [2]: https://www.man7.org/linux/man-pages/man3/tzset.3.html
> > [3]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
> > [4]: https://github.com/eggert/tz
> > [5]: https://raw.githubusercontent.com/nayarsystems/posix_tz_db/master/zones.csv
> > [6]: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l13
> > [7]: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57
> >
> >
> > Cheers
> > ---
> > 🙎🏻‍♂️ jdoubleu
> > On 4/14/2022 12:19 AM, Brian Inglis wrote:
> >> On 2022-04-13 14:33, Jeff Johnston wrote:
> >>> Looking at the glibc tzset code I have locally (not latest/greatest, but
> >>> does support angle brackets):
> >>>
> >>> If there are any parse failures, UTC is defaulted.
> >>
> >> We currently leave the time zone info unchanged.
> >>
> >>> Extraneous characters inside brackets or less than 3 characters is a
> >>> parse failure.
> >> ✔ Check ✔ Check
> >>
> >>> Glibc parses the tz name string char by char and allocates space for
> >>> the name strings so there is no max size.
> >>
> >> The suggestion was that glibc ignores the remaining characters, but
> >> you imply that glibc in fact uses the equivalent of the scanf
> >> "%m[...]" (malloc) modifier, and I think using that would be against
> >> the newlib philosophy to keep things limited and under control to
> >> support small targets. Larger targets like Cygwin (do our own thing
> >> including zoneinfo), and perhaps RTEMS, can supply their own
> >> enhancements.
> >>
> >>> the name strings so there is no max size.
> >>> I am fine if you want to mandate a maximum, but if you do, then too
> >>> many chars should be treated as a failure. If you aren't certain of
> >>> the limit, make the limit higher than you expect.
> >>
> >> Current limits are 3-10 allowing for e.g. which is the
> >> most ever likely to be used. It might be reasonable to bump it up to
> >> say 15.
> >>
> >>> If people run into max limit with reasonable timezone format strings,
> >>> then we can up the limit.
> >>
> >> The conditions are more or less what is implemented, but we could do
> >> with a couple more tweaks to improve things, like check for more or
> >> extraneous chars within the bracket quotes, and that no characters
> >> remain unconsumed at the end of the parse.
> >
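
For illustration only, the three error cases summarized in the quoted text
(name too short, name too long, invalid characters) could be checked roughly
as in the sketch below. This is not newlib or glibc code. The names
parse_tz_abbr and tz_abbr_or_utc, the status enum, and the TZ_ABBR_MIN and
TZ_ABBR_MAX values are all hypothetical, and reporting the fallback as the
string "UTC" is an assumption made for the example (glibc, per the excerpt
above, resets to an unnamed UTC).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative limits only; assumptions, not newlib's actual values. */
#define TZ_ABBR_MIN 3
#define TZ_ABBR_MAX 10

enum tz_abbr_status
{
  TZ_ABBR_OK,
  TZ_ABBR_TOO_SHORT,
  TZ_ABBR_TOO_LONG,
  TZ_ABBR_BAD_CHAR
};

/* Hypothetical helper: parse the leading std/dst abbreviation of a
   POSIX-style TZ string.  On success the name is copied into OUT (at
   least TZ_ABBR_MAX + 1 bytes) and *CONSUMED is the number of input
   characters used, including any angle brackets.  */
enum tz_abbr_status
parse_tz_abbr (const char *s, char *out, size_t *consumed)
{
  assert (s != NULL && out != NULL && consumed != NULL);

  int quoted = (*s == '<');
  const char *start = quoted ? s + 1 : s;
  const char *p = start;

  for (;; p++)
    {
      unsigned char c = (unsigned char) *p;
      if (quoted)
        {
          if (c == '>')
            break;
          /* Inside <...> only alphanumerics, '+' and '-' are accepted;
             an unterminated quote hits '\0' and is rejected here too.  */
          if (!(isalnum (c) || c == '+' || c == '-'))
            return TZ_ABBR_BAD_CHAR;
        }
      else if (!isalpha (c))
        break;              /* Unquoted names are alphabetic only.  */
    }

  size_t n = (size_t) (p - start);
  if (n < TZ_ABBR_MIN)
    return TZ_ABBR_TOO_SHORT;
  if (n > TZ_ABBR_MAX)
    return TZ_ABBR_TOO_LONG; /* Too long is an error, not truncated.  */

  memcpy (out, start, n);
  out[n] = '\0';
  *consumed = (size_t) (p - s) + (quoted ? 1 : 0);
  return TZ_ABBR_OK;
}

/* Fallback discussed in the thread: any parse error resets to UTC.
   Returns 1 if the name parsed, 0 if the fallback was applied.  */
int
tz_abbr_or_utc (const char *tz, char *out)
{
  size_t used;
  if (parse_tz_abbr (tz, out, &used) == TZ_ABBR_OK)
    return 1;
  strcpy (out, "UTC");
  return 0;
}
```

With these assumed limits, "EST5EDT" yields the name "EST", "<+1030>-10:30"
yields "+1030", and inputs such as "AB5" or "<A$C>0" fall back to UTC. A real
implementation would also need to parse the offset and DST rules and, as
noted in the quoted text, confirm that no characters remain unconsumed at the
end of the parse.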