public inbox for cygwin-patches@cygwin.com
 help / color / mirror / Atom feed
From: Brian Inglis <Brian.Inglis@Shaw.ca>
To: cygwin-patches@cygwin.com
Cc: Pedro Luis Castedo Cepeda <pedroluis.castedo@upm.es>
Subject: Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
Date: Fri, 10 Nov 2023 22:57:28 -0700	[thread overview]
Message-ID: <5cd4b96f-cad1-456c-b4d9-a6a649d36e3a@Shaw.ca> (raw)
In-Reply-To: <27a7257d-1e06-40ff-89ec-f100b8734802@upm.es>

[-- Attachment #1: Type: text/plain, Size: 3739 bytes --]

On 2023-11-10 10:44, Pedro Luis Castedo Cepeda wrote:
> El 10/11/2023 a las 11:16, Corinna Vinschen escribió:
>> On Nov  9 23:17, Brian Inglis wrote:
>>> On 2023-11-09 12:04, Pedro Luis Castedo Cepeda wrote:
>>>> - Prevent strftime to parsing format string beyond its end when
>>>>     it finish with "%E" or "%O".
>>>> ---
>>>>    newlib/libc/time/strftime.c | 2 ++
>>>>    1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/newlib/libc/time/strftime.c b/newlib/libc/time/strftime.c
>>>> index 56f227c5f..c4e9e45a9 100644
>>>> --- a/newlib/libc/time/strftime.c
>>>> +++ b/newlib/libc/time/strftime.c
>>>> @@ -754,6 +754,8 @@ __strftime (CHAR *s, size_t maxsize, const CHAR *format,
>>>>          switch (*format)
>>>>        {
>>>> +    case CQ('\0'):
>>>> +      break;
>>>>        case CQ('a'):
>>>>          _ctloc (wday[tim_p->tm_wday]);
>>>>          for (i = 0; i < ctloclen; i++)
>>>
>>> These cases appear to already be taken care of by setting and using
>>> (depending on the config parameters) the "alt" variable for those modifiers,
>>> and the default: return 0; for the format *character* (possibly wide) not
>>> matching following any modifiers.
>>>
>>> Patches to newlib should go to the newlib mailing list at sourceware dot org.
>>
>> Also, a simple reproducer would be nice.

> My first contribution. Sorry about posting to wrong mail list and, at best, 
> minimalistic patch motivation reasoning. First time with git send-mail, too.
> 
> I came across this newlib "feature" trying to update GLib port to 2.78.1. When 
> trying to find out why test_strftime (glib/test/date.c)
> was failing I discovered that one of the test format strings, "%E" was 
> triggering a loop in g_date_strftime (glib/gdate.c) requiring more and more 
> memory till it was stopped by a fortunate maximum size check in function.
> 
> The problem is that __strftime  (newlib/libc/time/strftime.c) doesn't check for 
> '\0' after a terminal "%E" and it continues parsing the format string. Finally 
> (not sure if intentionally), this triggers a direct return 0 from __strftime 
> instead breaking the loop, preventing it from add '\0' to the end of returned 
> string. Same for "%O", I think (not tested).
> 
> It seems that this trailing '\0' allows to differentiate returning an empty 
> string from needing more space (at least, in Glib).
> 
> So, is it a newlib bug? Not really, I think this format string is bad-formed (%E 
> should modify something, shouldn't it?) So undefined behaviour is OK. I could 
> patch-out these format strings from the port.
> 
> But... from Glib tests, it seems that, at least:
> 
> - If G_OS_WIN32, terminal "%E" & "%O" are silently discarded.
> - If __FreeBSD__ || __OpenBSD__ || __APPLE__ they are transformed to E & O, 
> respectively.
> - And if #else the same thing is expected.
> 
> So it seems that returning 0-terminated string is a common practice and I also 
> think that this is more deterministic and, potentially, safer. That's why I sent 
> the patch. It tries to be the shortest addition to check for end of string after 
> %E & %O modifiers and takes G_OS_WIN32 approach (only cause it's the simplest).

Not seeing any issue with any format - see attached source and log output, built 
under Cygwin.
[Derived from a bash script using printf %(...)T to do the same thing.]

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                 -- Antoine de Saint-Exupéry

[-- Attachment #2: strftime-formats.c --]
[-- Type: text/plain, Size: 7576 bytes --]

/*!gcc -o strftime-formats strftime-formats.c
 * strftime-formats.c - enumerate strftime formats and values using bash printf %(%...)T
 *
 * %a          locale abbreviated weekday name [Sun..Sat]  Fri
 * %A              locale weekday name [Sunday..Saturday]  Friday
 * %b            locale abbreviated month name [Jan..Dec]  Aug
 * %B               locale month name [January..December]  August
 * %c                                    locale date time  2022 Aug 26 Fri 12:45:19
 * %C                                    century [-99...]  20
 * %d                  day in month, zero-padded [01..31]  26
 * %D                                  US date [%m/%d/%y]  08/26/22
 * %e                 day in month, blank-padded [ 1..31]  26
 * %E                     locale era extensions [ignored]  
 * %f                          invalid format character ?  
 * %F             ISO 8601 date year-month-day [%Y-%m-%d]  2022-08-26
 * %g        ISO 8601 week based year in century [00..99]  22
 * %G    ISO 8601 week based year with century [-9999...]  2022
 * %h            locale abbreviated month name [Jan..Dec]  Aug
 * %H                        hour, 24-hour clock [00..23]  12
 * %i                            year in century [00..99]  
 * %I                        hour, 12-hour clock [01..12]  12
 * %j                              day in year [001..366]  238
 * %J                          invalid format character ?  
 * %k             hour, 24-hour clock, blank pad [ 0..23]  12
 * %K                          invalid format character ?  
 * %l             hour, 12-hour clock, blank pad [ 0..12]  12
 * %L                          invalid format character ?  
 * %m                                      month [01..12]  08
 * %M                                     minute [00..59]  45
 * %n                                             newline '
 * '
 * %N                           nanoseconds [0-999999999] 123456789 
 * %N                                    Emperor/Era Name  
 * %o                                    Emperor/Era Year  
 * %O         locale alternate digit extensions [ignored]  
 * %p                          locale AM/PM 12-hour clock  PM
 * %P                locale lowercase am/pm 12-hour clock  pm
 * %q                                 quarter year [1..4]  
 * %Q                          invalid format character ?  
 * %r                            time, 12-hour [%I:%M:%S]  12:45:19
 * %R                      ISO 8601 time, 24-hour [%H:%M]  12:45
 * %s                    seconds since the epoch -...0...  1661539519
 * %S                                     second [00..60]  19
 * %t                                                 tab '	'
 * %T                   ISO 8601 time, 24-hour [%H:%M:%S]  12:45:19
 * %u            ISO 8601 day of week [1..7, Monday == 1]  5
 * %U        week in year, starting first Sunday [00..53]  34
 * %v                          VMS/Oracle date [%e-%b-%Y]  26-Aug-2022
 * %V           ISO 8601 week in week based year [01..53]  34
 * %w                     day of week [0..6, Sunday == 0]  5
 * %W        week in year, starting first Monday [00..53]  34
 * %x                          locale date representation  2022-08-26
 * %X               locale time representation [%H:%M:%S]  12:45:19
 * %y                            year in century [00..99]  22
 * %Y                        year with century [-9999...]  2022
 * %z          timezone offset east of GMT [-2300..+2300]  -0600
 * %Z  timezone abbreviation, blank if undetermined [XXX]  MDT
 * %+      default locale date format [%a %b %e %T %Z %Y]  Fri Aug 26 21:34:07 MDT 2022
 * %                                no format character ?
 */


#include <locale.h>
#include <stdio.h>
#include <time.h>


const char *tfmt[] = {
	"%a          locale abbreviated weekday name [Sun..Sat]",
	"%A              locale weekday name [Sunday..Saturday]",
	"%b            locale abbreviated month name [Jan..Dec]",
	"%B               locale month name [January..December]",
	"%c                                    locale date time",
	"%C                                    century [-99...]",
	"%d                  day in month, zero-padded [01..31]",
	"%D                                  US date [%m/%d/%y]",
	"%e                 day in month, blank-padded [ 1..31]",
	"%E                      locale alternate era [ignored]",
	"%f                          invalid format character ?",
	"%F             ISO 8601 date year-month-day [%Y-%m-%d]",
	"%g        ISO 8601 week based year in century [00..99]",
	"%G    ISO 8601 week based year with century [-9999...]",
	"%h            locale abbreviated month name [Jan..Dec]",
	"%H                        hour, 24-hour clock [00..23]",
	"%i                            year in century [00..99]",
	"%I                        hour, 12-hour clock [01..12]",
	"%j                              day in year [001..366]",
	"%J                          invalid format character ?",
	"%k             hour, 24-hour clock, blank pad [ 0..23]",
	"%K                          invalid format character ?",
	"%l             hour, 12-hour clock, blank pad [ 0..12]",
	"%L                          invalid format character ?",
	"%m                                      month [01..12]",
	"%M                                     minute [00..59]",
	"%n                                             newline",
	"%N                           nanoseconds [0-999999999]",
	"%N                   locale extension Emperor/Era Name",
	"%o                   locale extension Emperor/Era Year",
	"%O                   locale alternate digits [ignored]",
	"%p                          locale 12-hour clock AM/PM",
	"%P                locale 12-hour clock lowercase am/pm",
	"%q                                 quarter year [1..4]",
	"%Q                          invalid format character ?",
	"%r                            time, 12-hour [%I:%M:%S]",
	"%R                      ISO 8601 time, 24-hour [%H:%M]",
	"%s                    seconds since the epoch -...0...",
	"%S                                     second [00..60]",
	"%t                                                 tab",
	"%T                   ISO 8601 time, 24-hour [%H:%M:%S]",
	"%u            ISO 8601 day of week [1..7, Monday == 1]",
	"%U        week in year, starting first Sunday [00..53]",
	"%v             BSD/OSX/Ruby VMS/Oracle date [%e-%b-%Y]",
	"%V           ISO 8601 week in week based year [01..53]",
	"%w                     day of week [0..6, Sunday == 0]",
	"%W        week in year, starting first Monday [00..53]",
	"%x                          locale date representation",
	"%X               locale time representation [%H:%M:%S]",
	"%y                            year in century [00..99]",
	"%Y                        year with century [-9999...]",
	"%z          timezone offset east of GMT [-2300..+2300]",
	"%Z  timezone abbreviation, blank if undetermined [XXX]",
	"%+      default locale date format [%a %b %e %T %Z %Y]",
	"%                                no format character ?",
    };


int
main( void )
{
    char	out[64]		= "";
    char	fmt[4]		= "% ";
    char *	f		= &fmt[1];
    time_t	now		= time( &now );
    struct tm *	local		= localtime( &now );
    char *	locale		= setlocale( LC_ALL, "");
    size_t	len;
    int		rc;


    for (size_t tf = 0; tf < sizeof( tfmt )/sizeof( *tfmt ); ++tf)
    {
	*f	= tfmt[tf][1];
	*out	= '\0';
	len	= strftime( out, sizeof( out ), fmt, local);

	if (!(rc = printf( "%s\t'%s'\t%2zi\n", tfmt[tf], out, len)))
	{
	    perror( "strftime-formats: printf" );
	}
    }

    locale	= setlocale( LC_ALL, locale);
} /* main() */

[-- Attachment #3: strftime-formats.log --]
[-- Type: text/plain, Size: 3538 bytes --]

%a          locale abbreviated weekday name [Sun..Sat]	'Fri'	 3
%A              locale weekday name [Sunday..Saturday]	'Friday'	 6
%b            locale abbreviated month name [Jan..Dec]	'Nov'	 3
%B               locale month name [January..December]	'November'	 8
%c                                    locale date time	'2023 Nov 10 Fri 22:52:46'	24
%C                                    century [-99...]	'20'	 2
%d                  day in month, zero-padded [01..31]	'10'	 2
%D                                  US date [%m/%d/%y]	'11/10/23'	 8
%e                 day in month, blank-padded [ 1..31]	'10'	 2
%E                      locale alternate era [ignored]	''	 0
%f                          invalid format character ?	''	 0
%F             ISO 8601 date year-month-day [%Y-%m-%d]	'2023-11-10'	10
%g        ISO 8601 week based year in century [00..99]	'23'	 2
%G    ISO 8601 week based year with century [-9999...]	'2023'	 4
%h            locale abbreviated month name [Jan..Dec]	'Nov'	 3
%H                        hour, 24-hour clock [00..23]	'22'	 2
%i                            year in century [00..99]	''	 0
%I                        hour, 12-hour clock [01..12]	'10'	 2
%j                              day in year [001..366]	'314'	 3
%J                          invalid format character ?	''	 0
%k             hour, 24-hour clock, blank pad [ 0..23]	'22'	 2
%K                          invalid format character ?	''	 0
%l             hour, 12-hour clock, blank pad [ 0..12]	'10'	 2
%L                          invalid format character ?	''	 0
%m                                      month [01..12]	'11'	 2
%M                                     minute [00..59]	'52'	 2
%n                                             newline	'
'	 1
%N                           nanoseconds [0-999999999]	''	 0
%N                   locale extension Emperor/Era Name	''	 0
%o                   locale extension Emperor/Era Year	''	 0
%O                   locale alternate digits [ignored]	''	 0
%p                          locale 12-hour clock AM/PM	'PM'	 2
%P                locale 12-hour clock lowercase am/pm	'pm'	 2
%q                                 quarter year [1..4]	'4'	 1
%Q                          invalid format character ?	''	 0
%r                            time, 12-hour [%I:%M:%S]	'10:52:46'	 8
%R                      ISO 8601 time, 24-hour [%H:%M]	'22:52'	 5
%s                    seconds since the epoch -...0...	'1699681966'	10
%S                                     second [00..60]	'46'	 2
%t                                                 tab	'	'	 1
%T                   ISO 8601 time, 24-hour [%H:%M:%S]	'22:52:46'	 8
%u            ISO 8601 day of week [1..7, Monday == 1]	'5'	 1
%U        week in year, starting first Sunday [00..53]	'45'	 2
%v             BSD/OSX/Ruby VMS/Oracle date [%e-%b-%Y]	'10-Nov-2023'	11
%V           ISO 8601 week in week based year [01..53]	'45'	 2
%w                     day of week [0..6, Sunday == 0]	'5'	 1
%W        week in year, starting first Monday [00..53]	'45'	 2
%x                          locale date representation	'2023-11-10'	10
%X               locale time representation [%H:%M:%S]	'22:52:46'	 8
%y                            year in century [00..99]	'23'	 2
%Y                        year with century [-9999...]	'2023'	 4
%z          timezone offset east of GMT [-2300..+2300]	'-0700'	 5
%Z  timezone abbreviation, blank if undetermined [XXX]	'MST'	 3
%+      default locale date format [%a %b %e %T %Z %Y]	''	 0
%                                no format character ?	''	 0

  reply	other threads:[~2023-11-11  5:57 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-09 19:04 Pedro Luis Castedo Cepeda
2023-11-10  6:17 ` Brian Inglis
2023-11-10 10:16   ` Corinna Vinschen
2023-11-10 17:44     ` Pedro Luis Castedo Cepeda
2023-11-11  5:57       ` Brian Inglis [this message]
2023-11-11 17:29         ` Pedro Luis Castedo Cepeda
2023-11-13 16:42           ` Corinna Vinschen
2023-11-13 18:46             ` Brian Inglis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5cd4b96f-cad1-456c-b4d9-a6a649d36e3a@Shaw.ca \
    --to=brian.inglis@shaw.ca \
    --cc=cygwin-patches@cygwin.com \
    --cc=pedroluis.castedo@upm.es \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).