From: Brian Inglis <Brian.Inglis@Shaw.ca>
To: cygwin-patches@cygwin.com
Cc: Pedro Luis Castedo Cepeda <pedroluis.castedo@upm.es>
Subject: Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
Date: Fri, 10 Nov 2023 22:57:28 -0700 [thread overview]
Message-ID: <5cd4b96f-cad1-456c-b4d9-a6a649d36e3a@Shaw.ca> (raw)
In-Reply-To: <27a7257d-1e06-40ff-89ec-f100b8734802@upm.es>
[-- Attachment #1: Type: text/plain, Size: 3739 bytes --]
On 2023-11-10 10:44, Pedro Luis Castedo Cepeda wrote:
> El 10/11/2023 a las 11:16, Corinna Vinschen escribió:
>> On Nov 9 23:17, Brian Inglis wrote:
>>> On 2023-11-09 12:04, Pedro Luis Castedo Cepeda wrote:
>>>> - Prevent strftime to parsing format string beyond its end when
>>>> it finish with "%E" or "%O".
>>>> ---
>>>> newlib/libc/time/strftime.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/newlib/libc/time/strftime.c b/newlib/libc/time/strftime.c
>>>> index 56f227c5f..c4e9e45a9 100644
>>>> --- a/newlib/libc/time/strftime.c
>>>> +++ b/newlib/libc/time/strftime.c
>>>> @@ -754,6 +754,8 @@ __strftime (CHAR *s, size_t maxsize, const CHAR *format,
>>>> switch (*format)
>>>> {
>>>> + case CQ('\0'):
>>>> + break;
>>>> case CQ('a'):
>>>> _ctloc (wday[tim_p->tm_wday]);
>>>> for (i = 0; i < ctloclen; i++)
>>>
>>> These cases appear to already be taken care of by setting and using
>>> (depending on the config parameters) the "alt" variable for those modifiers,
>>> and the default: return 0; for the format *character* (possibly wide) not
>>> matching following any modifiers.
>>>
>>> Patches to newlib should go to the newlib mailing list at sourceware dot org.
>>
>> Also, a simple reproducer would be nice.
> My first contribution. Sorry about posting to wrong mail list and, at best,
> minimalistic patch motivation reasoning. First time with git send-mail, too.
>
> I came across this newlib "feature" trying to update GLib port to 2.78.1. When
> trying to find out why test_strftime (glib/test/date.c)
> was failing I discovered that one of the test format strings, "%E" was
> triggering a loop in g_date_strftime (glib/gdate.c) requiring more and more
> memory till it was stopped by a fortunate maximum size check in function.
>
> The problem is that __strftime (newlib/libc/time/strftime.c) doesn't check for
> '\0' after a terminal "%E" and it continues parsing the format string. Finally
> (not sure if intentionally), this triggers a direct return 0 from __strftime
> instead breaking the loop, preventing it from add '\0' to the end of returned
> string. Same for "%O", I think (not tested).
>
> It seems that this trailing '\0' allows to differentiate returning an empty
> string from needing more space (at least, in Glib).
>
> So, is it a newlib bug? Not really, I think this format string is bad-formed (%E
> should modify something, shouldn't it?) So undefined behaviour is OK. I could
> patch-out these format strings from the port.
>
> But... from Glib tests, it seems that, at least:
>
> - If G_OS_WIN32, terminal "%E" & "%O" are silently discarded.
> - If __FreeBSD__ || __OpenBSD__ || __APPLE__ they are transformed to E & O,
> respectively.
> - And if #else the same thing is expected.
>
> So it seems that returning 0-terminated string is a common practice and I also
> think that this is more deterministic and, potentially, safer. That's why I sent
> the patch. It tries to be the shortest addition to check for end of string after
> %E & %O modifiers and takes G_OS_WIN32 approach (only cause it's the simplest).
Not seeing any issue with any format - see attached source and log output, built
under Cygwin.
[Derived from a bash script using printf %(...)T to do the same thing.]
--
Take care. Thanks, Brian Inglis Calgary, Alberta, Canada
La perfection est atteinte Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut
-- Antoine de Saint-Exupéry
[-- Attachment #2: strftime-formats.c --]
[-- Type: text/plain, Size: 7576 bytes --]
/*!gcc -o strftime-formats strftime-formats.c
* strftime-formats.c - enumerate strftime formats and values using bash printf %(%...)T
*
* %a locale abbreviated weekday name [Sun..Sat] Fri
* %A locale weekday name [Sunday..Saturday] Friday
* %b locale abbreviated month name [Jan..Dec] Aug
* %B locale month name [January..December] August
* %c locale date time 2022 Aug 26 Fri 12:45:19
* %C century [-99...] 20
* %d day in month, zero-padded [01..31] 26
* %D US date [%m/%d/%y] 08/26/22
* %e day in month, blank-padded [ 1..31] 26
* %E locale era extensions [ignored]
* %f invalid format character ?
* %F ISO 8601 date year-month-day [%Y-%m-%d] 2022-08-26
* %g ISO 8601 week based year in century [00..99] 22
* %G ISO 8601 week based year with century [-9999...] 2022
* %h locale abbreviated month name [Jan..Dec] Aug
* %H hour, 24-hour clock [00..23] 12
* %i year in century [00..99]
* %I hour, 12-hour clock [01..12] 12
* %j day in year [001..366] 238
* %J invalid format character ?
* %k hour, 24-hour clock, blank pad [ 0..23] 12
* %K invalid format character ?
* %l hour, 12-hour clock, blank pad [ 0..12] 12
* %L invalid format character ?
* %m month [01..12] 08
* %M minute [00..59] 45
* %n newline '
* '
* %N nanoseconds [0-999999999] 123456789
* %N Emperor/Era Name
* %o Emperor/Era Year
* %O locale alternate digit extensions [ignored]
* %p locale AM/PM 12-hour clock PM
* %P locale lowercase am/pm 12-hour clock pm
* %q quarter year [1..4]
* %Q invalid format character ?
* %r time, 12-hour [%I:%M:%S] 12:45:19
* %R ISO 8601 time, 24-hour [%H:%M] 12:45
* %s seconds since the epoch -...0... 1661539519
* %S second [00..60] 19
* %t tab ' '
* %T ISO 8601 time, 24-hour [%H:%M:%S] 12:45:19
* %u ISO 8601 day of week [1..7, Monday == 1] 5
* %U week in year, starting first Sunday [00..53] 34
* %v VMS/Oracle date [%e-%b-%Y] 26-Aug-2022
* %V ISO 8601 week in week based year [01..53] 34
* %w day of week [0..6, Sunday == 0] 5
* %W week in year, starting first Monday [00..53] 34
* %x locale date representation 2022-08-26
* %X locale time representation [%H:%M:%S] 12:45:19
* %y year in century [00..99] 22
* %Y year with century [-9999...] 2022
* %z timezone offset east of GMT [-2300..+2300] -0600
* %Z timezone abbreviation, blank if undetermined [XXX] MDT
* %+ default locale date format [%a %b %e %T %Z %Y] Fri Aug 26 21:34:07 MDT 2022
* % no format character ?
*/
#include <locale.h>
#include <stdio.h>
#include <time.h>
const char *tfmt[] = {
"%a locale abbreviated weekday name [Sun..Sat]",
"%A locale weekday name [Sunday..Saturday]",
"%b locale abbreviated month name [Jan..Dec]",
"%B locale month name [January..December]",
"%c locale date time",
"%C century [-99...]",
"%d day in month, zero-padded [01..31]",
"%D US date [%m/%d/%y]",
"%e day in month, blank-padded [ 1..31]",
"%E locale alternate era [ignored]",
"%f invalid format character ?",
"%F ISO 8601 date year-month-day [%Y-%m-%d]",
"%g ISO 8601 week based year in century [00..99]",
"%G ISO 8601 week based year with century [-9999...]",
"%h locale abbreviated month name [Jan..Dec]",
"%H hour, 24-hour clock [00..23]",
"%i year in century [00..99]",
"%I hour, 12-hour clock [01..12]",
"%j day in year [001..366]",
"%J invalid format character ?",
"%k hour, 24-hour clock, blank pad [ 0..23]",
"%K invalid format character ?",
"%l hour, 12-hour clock, blank pad [ 0..12]",
"%L invalid format character ?",
"%m month [01..12]",
"%M minute [00..59]",
"%n newline",
"%N nanoseconds [0-999999999]",
"%N locale extension Emperor/Era Name",
"%o locale extension Emperor/Era Year",
"%O locale alternate digits [ignored]",
"%p locale 12-hour clock AM/PM",
"%P locale 12-hour clock lowercase am/pm",
"%q quarter year [1..4]",
"%Q invalid format character ?",
"%r time, 12-hour [%I:%M:%S]",
"%R ISO 8601 time, 24-hour [%H:%M]",
"%s seconds since the epoch -...0...",
"%S second [00..60]",
"%t tab",
"%T ISO 8601 time, 24-hour [%H:%M:%S]",
"%u ISO 8601 day of week [1..7, Monday == 1]",
"%U week in year, starting first Sunday [00..53]",
"%v BSD/OSX/Ruby VMS/Oracle date [%e-%b-%Y]",
"%V ISO 8601 week in week based year [01..53]",
"%w day of week [0..6, Sunday == 0]",
"%W week in year, starting first Monday [00..53]",
"%x locale date representation",
"%X locale time representation [%H:%M:%S]",
"%y year in century [00..99]",
"%Y year with century [-9999...]",
"%z timezone offset east of GMT [-2300..+2300]",
"%Z timezone abbreviation, blank if undetermined [XXX]",
"%+ default locale date format [%a %b %e %T %Z %Y]",
"% no format character ?",
};
int
main( void )
{
char out[64] = "";
char fmt[4] = "% ";
char * f = &fmt[1];
time_t now = time( &now );
struct tm * local = localtime( &now );
char * locale = setlocale( LC_ALL, "");
size_t len;
int rc;
for (size_t tf = 0; tf < sizeof( tfmt )/sizeof( *tfmt ); ++tf)
{
*f = tfmt[tf][1];
*out = '\0';
len = strftime( out, sizeof( out ), fmt, local);
if (!(rc = printf( "%s\t'%s'\t%2zi\n", tfmt[tf], out, len)))
{
perror( "strftime-formats: printf" );
}
}
locale = setlocale( LC_ALL, locale);
} /* main() */
[-- Attachment #3: strftime-formats.log --]
[-- Type: text/plain, Size: 3538 bytes --]
%a locale abbreviated weekday name [Sun..Sat] 'Fri' 3
%A locale weekday name [Sunday..Saturday] 'Friday' 6
%b locale abbreviated month name [Jan..Dec] 'Nov' 3
%B locale month name [January..December] 'November' 8
%c locale date time '2023 Nov 10 Fri 22:52:46' 24
%C century [-99...] '20' 2
%d day in month, zero-padded [01..31] '10' 2
%D US date [%m/%d/%y] '11/10/23' 8
%e day in month, blank-padded [ 1..31] '10' 2
%E locale alternate era [ignored] '' 0
%f invalid format character ? '' 0
%F ISO 8601 date year-month-day [%Y-%m-%d] '2023-11-10' 10
%g ISO 8601 week based year in century [00..99] '23' 2
%G ISO 8601 week based year with century [-9999...] '2023' 4
%h locale abbreviated month name [Jan..Dec] 'Nov' 3
%H hour, 24-hour clock [00..23] '22' 2
%i year in century [00..99] '' 0
%I hour, 12-hour clock [01..12] '10' 2
%j day in year [001..366] '314' 3
%J invalid format character ? '' 0
%k hour, 24-hour clock, blank pad [ 0..23] '22' 2
%K invalid format character ? '' 0
%l hour, 12-hour clock, blank pad [ 0..12] '10' 2
%L invalid format character ? '' 0
%m month [01..12] '11' 2
%M minute [00..59] '52' 2
%n newline '
' 1
%N nanoseconds [0-999999999] '' 0
%N locale extension Emperor/Era Name '' 0
%o locale extension Emperor/Era Year '' 0
%O locale alternate digits [ignored] '' 0
%p locale 12-hour clock AM/PM 'PM' 2
%P locale 12-hour clock lowercase am/pm 'pm' 2
%q quarter year [1..4] '4' 1
%Q invalid format character ? '' 0
%r time, 12-hour [%I:%M:%S] '10:52:46' 8
%R ISO 8601 time, 24-hour [%H:%M] '22:52' 5
%s seconds since the epoch -...0... '1699681966' 10
%S second [00..60] '46' 2
%t tab ' ' 1
%T ISO 8601 time, 24-hour [%H:%M:%S] '22:52:46' 8
%u ISO 8601 day of week [1..7, Monday == 1] '5' 1
%U week in year, starting first Sunday [00..53] '45' 2
%v BSD/OSX/Ruby VMS/Oracle date [%e-%b-%Y] '10-Nov-2023' 11
%V ISO 8601 week in week based year [01..53] '45' 2
%w day of week [0..6, Sunday == 0] '5' 1
%W week in year, starting first Monday [00..53] '45' 2
%x locale date representation '2023-11-10' 10
%X locale time representation [%H:%M:%S] '22:52:46' 8
%y year in century [00..99] '23' 2
%Y year with century [-9999...] '2023' 4
%z timezone offset east of GMT [-2300..+2300] '-0700' 5
%Z timezone abbreviation, blank if undetermined [XXX] 'MST' 3
%+ default locale date format [%a %b %e %T %Z %Y] '' 0
% no format character ? '' 0
next prev parent reply other threads:[~2023-11-11 5:57 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-09 19:04 Pedro Luis Castedo Cepeda
2023-11-10 6:17 ` Brian Inglis
2023-11-10 10:16 ` Corinna Vinschen
2023-11-10 17:44 ` Pedro Luis Castedo Cepeda
2023-11-11 5:57 ` Brian Inglis [this message]
2023-11-11 17:29 ` Pedro Luis Castedo Cepeda
2023-11-13 16:42 ` Corinna Vinschen
2023-11-13 18:46 ` Brian Inglis
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5cd4b96f-cad1-456c-b4d9-a6a649d36e3a@Shaw.ca \
--to=brian.inglis@shaw.ca \
--cc=cygwin-patches@cygwin.com \
--cc=pedroluis.castedo@upm.es \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).