public inbox for cygwin-patches@cygwin.com
 help / color / mirror / Atom feed
* [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
@ 2023-11-09 19:04 Pedro Luis Castedo Cepeda
  2023-11-10  6:17 ` Brian Inglis
  0 siblings, 1 reply; 8+ messages in thread
From: Pedro Luis Castedo Cepeda @ 2023-11-09 19:04 UTC (permalink / raw)
  To: cygwin-patches; +Cc: Pedro Luis Castedo Cepeda

- Prevent strftime to parsing format string beyond its end when
  it finish with "%E" or "%O".
---
 newlib/libc/time/strftime.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/newlib/libc/time/strftime.c b/newlib/libc/time/strftime.c
index 56f227c5f..c4e9e45a9 100644
--- a/newlib/libc/time/strftime.c
+++ b/newlib/libc/time/strftime.c
@@ -754,6 +754,8 @@ __strftime (CHAR *s, size_t maxsize, const CHAR *format,
 
       switch (*format)
 	{
+	case CQ('\0'):
+	  break;
 	case CQ('a'):
 	  _ctloc (wday[tim_p->tm_wday]);
 	  for (i = 0; i < ctloclen; i++)
-- 
2.42.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
  2023-11-09 19:04 [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string Pedro Luis Castedo Cepeda
@ 2023-11-10  6:17 ` Brian Inglis
  2023-11-10 10:16   ` Corinna Vinschen
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Inglis @ 2023-11-10  6:17 UTC (permalink / raw)
  To: cygwin-patches; +Cc: Pedro Luis Castedo Cepeda

On 2023-11-09 12:04, Pedro Luis Castedo Cepeda wrote:
> - Prevent strftime to parsing format string beyond its end when
>    it finish with "%E" or "%O".
> ---
>   newlib/libc/time/strftime.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/newlib/libc/time/strftime.c b/newlib/libc/time/strftime.c
> index 56f227c5f..c4e9e45a9 100644
> --- a/newlib/libc/time/strftime.c
> +++ b/newlib/libc/time/strftime.c
> @@ -754,6 +754,8 @@ __strftime (CHAR *s, size_t maxsize, const CHAR *format,
>   
>         switch (*format)
>   	{
> +	case CQ('\0'):
> +	  break;
>   	case CQ('a'):
>   	  _ctloc (wday[tim_p->tm_wday]);
>   	  for (i = 0; i < ctloclen; i++)

These cases appear to already be taken care of by setting and using (depending 
on the config parameters) the "alt" variable for those modifiers, and the 
default: return 0; for the format *character* (possibly wide) not matching 
following any modifiers.

Patches to newlib should go to the newlib mailing list at sourceware dot org.

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                 -- Antoine de Saint-Exupéry

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
  2023-11-10  6:17 ` Brian Inglis
@ 2023-11-10 10:16   ` Corinna Vinschen
  2023-11-10 17:44     ` Pedro Luis Castedo Cepeda
  0 siblings, 1 reply; 8+ messages in thread
From: Corinna Vinschen @ 2023-11-10 10:16 UTC (permalink / raw)
  To: cygwin-patches

On Nov  9 23:17, Brian Inglis wrote:
> On 2023-11-09 12:04, Pedro Luis Castedo Cepeda wrote:
> > - Prevent strftime to parsing format string beyond its end when
> >    it finish with "%E" or "%O".
> > ---
> >   newlib/libc/time/strftime.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/newlib/libc/time/strftime.c b/newlib/libc/time/strftime.c
> > index 56f227c5f..c4e9e45a9 100644
> > --- a/newlib/libc/time/strftime.c
> > +++ b/newlib/libc/time/strftime.c
> > @@ -754,6 +754,8 @@ __strftime (CHAR *s, size_t maxsize, const CHAR *format,
> >         switch (*format)
> >   	{
> > +	case CQ('\0'):
> > +	  break;
> >   	case CQ('a'):
> >   	  _ctloc (wday[tim_p->tm_wday]);
> >   	  for (i = 0; i < ctloclen; i++)
> 
> These cases appear to already be taken care of by setting and using
> (depending on the config parameters) the "alt" variable for those modifiers,
> and the default: return 0; for the format *character* (possibly wide) not
> matching following any modifiers.
> 
> Patches to newlib should go to the newlib mailing list at sourceware dot org.

Also, a simple reproducer would be nice.


Thanks,
Corinna

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
  2023-11-10 10:16   ` Corinna Vinschen
@ 2023-11-10 17:44     ` Pedro Luis Castedo Cepeda
  2023-11-11  5:57       ` Brian Inglis
  0 siblings, 1 reply; 8+ messages in thread
From: Pedro Luis Castedo Cepeda @ 2023-11-10 17:44 UTC (permalink / raw)
  To: cygwin-patches


[-- Attachment #1.1: Type: text/plain, Size: 3031 bytes --]

El 10/11/2023 a las 11:16, Corinna Vinschen escribió:
> On Nov  9 23:17, Brian Inglis wrote:
>> On 2023-11-09 12:04, Pedro Luis Castedo Cepeda wrote:
>>> - Prevent strftime to parsing format string beyond its end when
>>>     it finish with "%E" or "%O".
>>> ---
>>>    newlib/libc/time/strftime.c | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/newlib/libc/time/strftime.c b/newlib/libc/time/strftime.c
>>> index 56f227c5f..c4e9e45a9 100644
>>> --- a/newlib/libc/time/strftime.c
>>> +++ b/newlib/libc/time/strftime.c
>>> @@ -754,6 +754,8 @@ __strftime (CHAR *s, size_t maxsize, const CHAR *format,
>>>          switch (*format)
>>>    	{
>>> +	case CQ('\0'):
>>> +	  break;
>>>    	case CQ('a'):
>>>    	  _ctloc (wday[tim_p->tm_wday]);
>>>    	  for (i = 0; i < ctloclen; i++)
>>
>> These cases appear to already be taken care of by setting and using
>> (depending on the config parameters) the "alt" variable for those modifiers,
>> and the default: return 0; for the format *character* (possibly wide) not
>> matching following any modifiers.
>>
>> Patches to newlib should go to the newlib mailing list at sourceware dot org.
> 
> Also, a simple reproducer would be nice.
> 
> 
> Thanks,
> Corinna

My first contribution. Sorry about posting to wrong mail list and, at 
best, minimalistic patch motivation reasoning. First time with git 
send-mail, too.

I came across this newlib "feature" trying to update GLib port to 
2.78.1. When trying to find out why test_strftime (glib/test/date.c)
was failing I discovered that one of the test format strings, "%E" was 
triggering a loop in g_date_strftime (glib/gdate.c) requiring more and 
more memory till it was stopped by a fortunate maximum size check in 
function.

The problem is that __strftime  (newlib/libc/time/strftime.c) doesn't 
check for '\0' after a terminal "%E" and it continues parsing the format 
string. Finally (not sure if intentionally), this triggers a direct 
return 0 from __strftime instead breaking the loop, preventing it from 
add '\0' to the end of returned string. Same for "%O", I think (not tested).

It seems that this trailing '\0' allows to differentiate returning an 
empty string from needing more space (at least, in Glib).

So, is it a newlib bug? Not really, I think this format string is 
bad-formed (%E should modify something, shouldn't it?) So undefined 
behaviour is OK. I could patch-out these format strings from the port.

But... from Glib tests, it seems that, at least:

- If G_OS_WIN32, terminal "%E" & "%O" are silently discarded.
- If __FreeBSD__ || __OpenBSD__ || __APPLE__ they are transformed to E & 
O, respectively.
- And if #else the same thing is expected.

So it seems that returning 0-terminated string is a common practice and 
I also think that this is more deterministic and, potentially, safer. 
That's why I sent the patch. It tries to be the shortest addition to 
check for end of string after %E & %O modifiers and takes G_OS_WIN32 
approach (only cause it's the simplest).

Best regards.


[-- Attachment #1.2: strftime_test.c --]
[-- Type: text/plain, Size: 738 bytes --]

#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>
#include <locale.h>
#include <assert.h>

int main(int argc, char *argv[])
{
  const struct tm date = {
    .tm_sec = 0, .tm_min = 0, .tm_hour = 0, .tm_mday = 1, .tm_mon = 0,
    .tm_year = -1899, .tm_wday = 1, .tm_yday = 0, .tm_isdst = -1,
    .tm_gmtoff = 0, .tm_zone = 0x0
  };

  setenv("LC_ALL", "en_US.utf-8", true);
  setlocale (LC_ALL, "");

  char tmpbuf[128];
  size_t tmplen;

  tmpbuf[0] = '\1';
  tmplen = strftime(tmpbuf, sizeof(tmpbuf), "%E", &date);
  assert(tmpbuf[0] = '\0' || strncmp(tmpbuf, "E", 2));
  tmplen = strftime(tmpbuf, sizeof(tmpbuf), "%O", &date);
  assert(tmpbuf[0] = '\0' || strncmp(tmpbuf, "O", 2));

  return EXIT_SUCCESS;
}

[-- Attachment #2: Firma criptográfica S/MIME --]
[-- Type: application/pkcs7-signature, Size: 5824 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
  2023-11-10 17:44     ` Pedro Luis Castedo Cepeda
@ 2023-11-11  5:57       ` Brian Inglis
  2023-11-11 17:29         ` Pedro Luis Castedo Cepeda
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Inglis @ 2023-11-11  5:57 UTC (permalink / raw)
  To: cygwin-patches; +Cc: Pedro Luis Castedo Cepeda

[-- Attachment #1: Type: text/plain, Size: 3739 bytes --]

On 2023-11-10 10:44, Pedro Luis Castedo Cepeda wrote:
> El 10/11/2023 a las 11:16, Corinna Vinschen escribió:
>> On Nov  9 23:17, Brian Inglis wrote:
>>> On 2023-11-09 12:04, Pedro Luis Castedo Cepeda wrote:
>>>> - Prevent strftime to parsing format string beyond its end when
>>>>     it finish with "%E" or "%O".
>>>> ---
>>>>    newlib/libc/time/strftime.c | 2 ++
>>>>    1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/newlib/libc/time/strftime.c b/newlib/libc/time/strftime.c
>>>> index 56f227c5f..c4e9e45a9 100644
>>>> --- a/newlib/libc/time/strftime.c
>>>> +++ b/newlib/libc/time/strftime.c
>>>> @@ -754,6 +754,8 @@ __strftime (CHAR *s, size_t maxsize, const CHAR *format,
>>>>          switch (*format)
>>>>        {
>>>> +    case CQ('\0'):
>>>> +      break;
>>>>        case CQ('a'):
>>>>          _ctloc (wday[tim_p->tm_wday]);
>>>>          for (i = 0; i < ctloclen; i++)
>>>
>>> These cases appear to already be taken care of by setting and using
>>> (depending on the config parameters) the "alt" variable for those modifiers,
>>> and the default: return 0; for the format *character* (possibly wide) not
>>> matching following any modifiers.
>>>
>>> Patches to newlib should go to the newlib mailing list at sourceware dot org.
>>
>> Also, a simple reproducer would be nice.

> My first contribution. Sorry about posting to wrong mail list and, at best, 
> minimalistic patch motivation reasoning. First time with git send-mail, too.
> 
> I came across this newlib "feature" trying to update GLib port to 2.78.1. When 
> trying to find out why test_strftime (glib/test/date.c)
> was failing I discovered that one of the test format strings, "%E" was 
> triggering a loop in g_date_strftime (glib/gdate.c) requiring more and more 
> memory till it was stopped by a fortunate maximum size check in function.
> 
> The problem is that __strftime  (newlib/libc/time/strftime.c) doesn't check for 
> '\0' after a terminal "%E" and it continues parsing the format string. Finally 
> (not sure if intentionally), this triggers a direct return 0 from __strftime 
> instead breaking the loop, preventing it from add '\0' to the end of returned 
> string. Same for "%O", I think (not tested).
> 
> It seems that this trailing '\0' allows to differentiate returning an empty 
> string from needing more space (at least, in Glib).
> 
> So, is it a newlib bug? Not really, I think this format string is bad-formed (%E 
> should modify something, shouldn't it?) So undefined behaviour is OK. I could 
> patch-out these format strings from the port.
> 
> But... from Glib tests, it seems that, at least:
> 
> - If G_OS_WIN32, terminal "%E" & "%O" are silently discarded.
> - If __FreeBSD__ || __OpenBSD__ || __APPLE__ they are transformed to E & O, 
> respectively.
> - And if #else the same thing is expected.
> 
> So it seems that returning 0-terminated string is a common practice and I also 
> think that this is more deterministic and, potentially, safer. That's why I sent 
> the patch. It tries to be the shortest addition to check for end of string after 
> %E & %O modifiers and takes G_OS_WIN32 approach (only cause it's the simplest).

Not seeing any issue with any format - see attached source and log output, built 
under Cygwin.
[Derived from a bash script using printf %(...)T to do the same thing.]

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                 -- Antoine de Saint-Exupéry

[-- Attachment #2: strftime-formats.c --]
[-- Type: text/plain, Size: 7576 bytes --]

/*!gcc -o strftime-formats strftime-formats.c
 * strftime-formats.c - enumerate strftime formats and values using bash printf %(%...)T
 *
 * %a          locale abbreviated weekday name [Sun..Sat]  Fri
 * %A              locale weekday name [Sunday..Saturday]  Friday
 * %b            locale abbreviated month name [Jan..Dec]  Aug
 * %B               locale month name [January..December]  August
 * %c                                    locale date time  2022 Aug 26 Fri 12:45:19
 * %C                                    century [-99...]  20
 * %d                  day in month, zero-padded [01..31]  26
 * %D                                  US date [%m/%d/%y]  08/26/22
 * %e                 day in month, blank-padded [ 1..31]  26
 * %E                     locale era extensions [ignored]  
 * %f                          invalid format character ?  
 * %F             ISO 8601 date year-month-day [%Y-%m-%d]  2022-08-26
 * %g        ISO 8601 week based year in century [00..99]  22
 * %G    ISO 8601 week based year with century [-9999...]  2022
 * %h            locale abbreviated month name [Jan..Dec]  Aug
 * %H                        hour, 24-hour clock [00..23]  12
 * %i                            year in century [00..99]  
 * %I                        hour, 12-hour clock [01..12]  12
 * %j                              day in year [001..366]  238
 * %J                          invalid format character ?  
 * %k             hour, 24-hour clock, blank pad [ 0..23]  12
 * %K                          invalid format character ?  
 * %l             hour, 12-hour clock, blank pad [ 0..12]  12
 * %L                          invalid format character ?  
 * %m                                      month [01..12]  08
 * %M                                     minute [00..59]  45
 * %n                                             newline '
 * '
 * %N                           nanoseconds [0-999999999] 123456789 
 * %N                                    Emperor/Era Name  
 * %o                                    Emperor/Era Year  
 * %O         locale alternate digit extensions [ignored]  
 * %p                          locale AM/PM 12-hour clock  PM
 * %P                locale lowercase am/pm 12-hour clock  pm
 * %q                                 quarter year [1..4]  
 * %Q                          invalid format character ?  
 * %r                            time, 12-hour [%I:%M:%S]  12:45:19
 * %R                      ISO 8601 time, 24-hour [%H:%M]  12:45
 * %s                    seconds since the epoch -...0...  1661539519
 * %S                                     second [00..60]  19
 * %t                                                 tab '	'
 * %T                   ISO 8601 time, 24-hour [%H:%M:%S]  12:45:19
 * %u            ISO 8601 day of week [1..7, Monday == 1]  5
 * %U        week in year, starting first Sunday [00..53]  34
 * %v                          VMS/Oracle date [%e-%b-%Y]  26-Aug-2022
 * %V           ISO 8601 week in week based year [01..53]  34
 * %w                     day of week [0..6, Sunday == 0]  5
 * %W        week in year, starting first Monday [00..53]  34
 * %x                          locale date representation  2022-08-26
 * %X               locale time representation [%H:%M:%S]  12:45:19
 * %y                            year in century [00..99]  22
 * %Y                        year with century [-9999...]  2022
 * %z          timezone offset east of GMT [-2300..+2300]  -0600
 * %Z  timezone abbreviation, blank if undetermined [XXX]  MDT
 * %+      default locale date format [%a %b %e %T %Z %Y]  Fri Aug 26 21:34:07 MDT 2022
 * %                                no format character ?
 */


#include <locale.h>
#include <stdio.h>
#include <time.h>


const char *tfmt[] = {
	"%a          locale abbreviated weekday name [Sun..Sat]",
	"%A              locale weekday name [Sunday..Saturday]",
	"%b            locale abbreviated month name [Jan..Dec]",
	"%B               locale month name [January..December]",
	"%c                                    locale date time",
	"%C                                    century [-99...]",
	"%d                  day in month, zero-padded [01..31]",
	"%D                                  US date [%m/%d/%y]",
	"%e                 day in month, blank-padded [ 1..31]",
	"%E                      locale alternate era [ignored]",
	"%f                          invalid format character ?",
	"%F             ISO 8601 date year-month-day [%Y-%m-%d]",
	"%g        ISO 8601 week based year in century [00..99]",
	"%G    ISO 8601 week based year with century [-9999...]",
	"%h            locale abbreviated month name [Jan..Dec]",
	"%H                        hour, 24-hour clock [00..23]",
	"%i                            year in century [00..99]",
	"%I                        hour, 12-hour clock [01..12]",
	"%j                              day in year [001..366]",
	"%J                          invalid format character ?",
	"%k             hour, 24-hour clock, blank pad [ 0..23]",
	"%K                          invalid format character ?",
	"%l             hour, 12-hour clock, blank pad [ 0..12]",
	"%L                          invalid format character ?",
	"%m                                      month [01..12]",
	"%M                                     minute [00..59]",
	"%n                                             newline",
	"%N                           nanoseconds [0-999999999]",
	"%N                   locale extension Emperor/Era Name",
	"%o                   locale extension Emperor/Era Year",
	"%O                   locale alternate digits [ignored]",
	"%p                          locale 12-hour clock AM/PM",
	"%P                locale 12-hour clock lowercase am/pm",
	"%q                                 quarter year [1..4]",
	"%Q                          invalid format character ?",
	"%r                            time, 12-hour [%I:%M:%S]",
	"%R                      ISO 8601 time, 24-hour [%H:%M]",
	"%s                    seconds since the epoch -...0...",
	"%S                                     second [00..60]",
	"%t                                                 tab",
	"%T                   ISO 8601 time, 24-hour [%H:%M:%S]",
	"%u            ISO 8601 day of week [1..7, Monday == 1]",
	"%U        week in year, starting first Sunday [00..53]",
	"%v             BSD/OSX/Ruby VMS/Oracle date [%e-%b-%Y]",
	"%V           ISO 8601 week in week based year [01..53]",
	"%w                     day of week [0..6, Sunday == 0]",
	"%W        week in year, starting first Monday [00..53]",
	"%x                          locale date representation",
	"%X               locale time representation [%H:%M:%S]",
	"%y                            year in century [00..99]",
	"%Y                        year with century [-9999...]",
	"%z          timezone offset east of GMT [-2300..+2300]",
	"%Z  timezone abbreviation, blank if undetermined [XXX]",
	"%+      default locale date format [%a %b %e %T %Z %Y]",
	"%                                no format character ?",
    };


int
main( void )
{
    char	out[64]		= "";
    char	fmt[4]		= "% ";
    char *	f		= &fmt[1];
    time_t	now		= time( &now );
    struct tm *	local		= localtime( &now );
    char *	locale		= setlocale( LC_ALL, "");
    size_t	len;
    int		rc;


    for (size_t tf = 0; tf < sizeof( tfmt )/sizeof( *tfmt ); ++tf)
    {
	*f	= tfmt[tf][1];
	*out	= '\0';
	len	= strftime( out, sizeof( out ), fmt, local);

	if (!(rc = printf( "%s\t'%s'\t%2zi\n", tfmt[tf], out, len)))
	{
	    perror( "strftime-formats: printf" );
	}
    }

    locale	= setlocale( LC_ALL, locale);
} /* main() */

[-- Attachment #3: strftime-formats.log --]
[-- Type: text/plain, Size: 3538 bytes --]

%a          locale abbreviated weekday name [Sun..Sat]	'Fri'	 3
%A              locale weekday name [Sunday..Saturday]	'Friday'	 6
%b            locale abbreviated month name [Jan..Dec]	'Nov'	 3
%B               locale month name [January..December]	'November'	 8
%c                                    locale date time	'2023 Nov 10 Fri 22:52:46'	24
%C                                    century [-99...]	'20'	 2
%d                  day in month, zero-padded [01..31]	'10'	 2
%D                                  US date [%m/%d/%y]	'11/10/23'	 8
%e                 day in month, blank-padded [ 1..31]	'10'	 2
%E                      locale alternate era [ignored]	''	 0
%f                          invalid format character ?	''	 0
%F             ISO 8601 date year-month-day [%Y-%m-%d]	'2023-11-10'	10
%g        ISO 8601 week based year in century [00..99]	'23'	 2
%G    ISO 8601 week based year with century [-9999...]	'2023'	 4
%h            locale abbreviated month name [Jan..Dec]	'Nov'	 3
%H                        hour, 24-hour clock [00..23]	'22'	 2
%i                            year in century [00..99]	''	 0
%I                        hour, 12-hour clock [01..12]	'10'	 2
%j                              day in year [001..366]	'314'	 3
%J                          invalid format character ?	''	 0
%k             hour, 24-hour clock, blank pad [ 0..23]	'22'	 2
%K                          invalid format character ?	''	 0
%l             hour, 12-hour clock, blank pad [ 0..12]	'10'	 2
%L                          invalid format character ?	''	 0
%m                                      month [01..12]	'11'	 2
%M                                     minute [00..59]	'52'	 2
%n                                             newline	'
'	 1
%N                           nanoseconds [0-999999999]	''	 0
%N                   locale extension Emperor/Era Name	''	 0
%o                   locale extension Emperor/Era Year	''	 0
%O                   locale alternate digits [ignored]	''	 0
%p                          locale 12-hour clock AM/PM	'PM'	 2
%P                locale 12-hour clock lowercase am/pm	'pm'	 2
%q                                 quarter year [1..4]	'4'	 1
%Q                          invalid format character ?	''	 0
%r                            time, 12-hour [%I:%M:%S]	'10:52:46'	 8
%R                      ISO 8601 time, 24-hour [%H:%M]	'22:52'	 5
%s                    seconds since the epoch -...0...	'1699681966'	10
%S                                     second [00..60]	'46'	 2
%t                                                 tab	'	'	 1
%T                   ISO 8601 time, 24-hour [%H:%M:%S]	'22:52:46'	 8
%u            ISO 8601 day of week [1..7, Monday == 1]	'5'	 1
%U        week in year, starting first Sunday [00..53]	'45'	 2
%v             BSD/OSX/Ruby VMS/Oracle date [%e-%b-%Y]	'10-Nov-2023'	11
%V           ISO 8601 week in week based year [01..53]	'45'	 2
%w                     day of week [0..6, Sunday == 0]	'5'	 1
%W        week in year, starting first Monday [00..53]	'45'	 2
%x                          locale date representation	'2023-11-10'	10
%X               locale time representation [%H:%M:%S]	'22:52:46'	 8
%y                            year in century [00..99]	'23'	 2
%Y                        year with century [-9999...]	'2023'	 4
%z          timezone offset east of GMT [-2300..+2300]	'-0700'	 5
%Z  timezone abbreviation, blank if undetermined [XXX]	'MST'	 3
%+      default locale date format [%a %b %e %T %Z %Y]	''	 0
%                                no format character ?	''	 0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
  2023-11-11  5:57       ` Brian Inglis
@ 2023-11-11 17:29         ` Pedro Luis Castedo Cepeda
  2023-11-13 16:42           ` Corinna Vinschen
  0 siblings, 1 reply; 8+ messages in thread
From: Pedro Luis Castedo Cepeda @ 2023-11-11 17:29 UTC (permalink / raw)
  To: cygwin-patches

[-- Attachment #1: Type: text/plain, Size: 3810 bytes --]

El 11/11/2023 a las 6:57, Brian Inglis escribió:
> On 2023-11-10 10:44, Pedro Luis Castedo Cepeda wrote:
>> El 10/11/2023 a las 11:16, Corinna Vinschen escribió:
>>> On Nov  9 23:17, Brian Inglis wrote:
>>>> On 2023-11-09 12:04, Pedro Luis Castedo Cepeda wrote:
>>>>> - Prevent strftime to parsing format string beyond its end when
>>>>>     it finish with "%E" or "%O".
>>>>> ---
>>>>>    newlib/libc/time/strftime.c | 2 ++
>>>>>    1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/newlib/libc/time/strftime.c b/newlib/libc/time/strftime.c
>>>>> index 56f227c5f..c4e9e45a9 100644
>>>>> --- a/newlib/libc/time/strftime.c
>>>>> +++ b/newlib/libc/time/strftime.c
>>>>> @@ -754,6 +754,8 @@ __strftime (CHAR *s, size_t maxsize, const CHAR 
>>>>> *format,
>>>>>          switch (*format)
>>>>>        {
>>>>> +    case CQ('\0'):
>>>>> +      break;
>>>>>        case CQ('a'):
>>>>>          _ctloc (wday[tim_p->tm_wday]);
>>>>>          for (i = 0; i < ctloclen; i++)
>>>>
>>>> These cases appear to already be taken care of by setting and using
>>>> (depending on the config parameters) the "alt" variable for those 
>>>> modifiers,
>>>> and the default: return 0; for the format *character* (possibly 
>>>> wide) not
>>>> matching following any modifiers.
>>>>
>>>> Patches to newlib should go to the newlib mailing list at sourceware 
>>>> dot org.
>>>
>>> Also, a simple reproducer would be nice.
> 
>> My first contribution. Sorry about posting to wrong mail list and, at 
>> best, minimalistic patch motivation reasoning. First time with git 
>> send-mail, too.
>>
>> I came across this newlib "feature" trying to update GLib port to 
>> 2.78.1. When trying to find out why test_strftime (glib/test/date.c)
>> was failing I discovered that one of the test format strings, "%E" was 
>> triggering a loop in g_date_strftime (glib/gdate.c) requiring more and 
>> more memory till it was stopped by a fortunate maximum size check in 
>> function.
>>
>> The problem is that __strftime  (newlib/libc/time/strftime.c) doesn't 
>> check for '\0' after a terminal "%E" and it continues parsing the 
>> format string. Finally (not sure if intentionally), this triggers a 
>> direct return 0 from __strftime instead breaking the loop, preventing 
>> it from add '\0' to the end of returned string. Same for "%O", I think 
>> (not tested).
>>
>> It seems that this trailing '\0' allows to differentiate returning an 
>> empty string from needing more space (at least, in Glib).
>>
>> So, is it a newlib bug? Not really, I think this format string is 
>> bad-formed (%E should modify something, shouldn't it?) So undefined 
>> behaviour is OK. I could patch-out these format strings from the port.
>>
>> But... from Glib tests, it seems that, at least:
>>
>> - If G_OS_WIN32, terminal "%E" & "%O" are silently discarded.
>> - If __FreeBSD__ || __OpenBSD__ || __APPLE__ they are transformed to E 
>> & O, respectively.
>> - And if #else the same thing is expected.
>>
>> So it seems that returning 0-terminated string is a common practice 
>> and I also think that this is more deterministic and, potentially, 
>> safer. That's why I sent the patch. It tries to be the shortest 
>> addition to check for end of string after %E & %O modifiers and takes 
>> G_OS_WIN32 approach (only cause it's the simplest).
> 
> Not seeing any issue with any format - see attached source and log 
> output, built under Cygwin.
> [Derived from a bash script using printf %(...)T to do the same thing.]
> 

OK. It's not a newlib problem but a GLib one as it is relaying on common 
but non-standard strftime implementation details.

I attach a short program more focused in g_date_strftime implementation 
so it can be evaluated if it worths addressing this corner case.

Thanks.

[-- Attachment #2: strftime_lonelyE.c --]
[-- Type: text/plain, Size: 1032 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <assert.h>

int main(int argc, char *argv[])
{
  const struct tm date = {
    .tm_sec = 0, .tm_min = 0, .tm_hour = 0, .tm_mday = 1, .tm_mon = 0,
    .tm_year = -1899, .tm_wday = 1, .tm_yday = 0, .tm_isdst = -1,
    .tm_gmtoff = 0, .tm_zone = 0x0
  };
  const char *fmt = "%E";
  const size_t maxsz = 65536u;

  size_t bufsz = 128;
  char *buf = NULL;

  // Mimic g_date_strftime (glib/gdate.c) approach
  do {
    buf = (char *)realloc(buf, bufsz);
    assert(buf);

    buf[0] = '\1'; // Mark to guess if empty or not enough space
    size_t len = strftime(buf, bufsz, fmt, &date);
    if (len != 0 || buf[0] == '\0')
      break;  // OK, done.

    bufsz *= 2;  // Assume more space needed
  } while (bufsz <= maxsz);  // .. up to a limit

  int rc;
  if (bufsz <= maxsz)
  {
    printf("Date: %s\n", buf);
    rc = EXIT_SUCCESS;
  }
  else
  {
    puts("Date: longest date ever :-\\");
    rc = EXIT_FAILURE;
  }
  free(buf);

  return rc;
}


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
  2023-11-11 17:29         ` Pedro Luis Castedo Cepeda
@ 2023-11-13 16:42           ` Corinna Vinschen
  2023-11-13 18:46             ` Brian Inglis
  0 siblings, 1 reply; 8+ messages in thread
From: Corinna Vinschen @ 2023-11-13 16:42 UTC (permalink / raw)
  To: cygwin-patches

Hi Pedro,

On Nov 11 18:29, Pedro Luis Castedo Cepeda wrote:
> OK. It's not a newlib problem but a GLib one as it is relaying on common but
> non-standard strftime implementation details.
> 
> I attach a short program more focused in g_date_strftime implementation so
> it can be evaluated if it worths addressing this corner case.

Tricky.  I wonder what the GLib test is actually trying to accomplish.

POSIX has this to say:

  RETURN VALUE
    If the total number of resulting bytes including the  terminating
    null byte  is not more than maxsize, these functions shall return
    the number of bytes placed into the array pointed to by s, not
    including the  terminating NUL character. Otherwise, 0 shall be
    returned and the contents of the array are unspecified.

  ERRORS
    No errors are defined.

But, and that's the big problem, POSIX does *not* provide for the
error case, because it doesn't allow an error like using an incorrect
format string to occur.  Using an incorrect or undefined format code
is just not part of the standard.

And the Linux man page has an interesting extension to the above
POSIX RETURN VALUE section:

    Note  that  the  return value 0 does not necessarily indicate an
    error.  For example, in many locales %p yields an empty string.  An
    empty  format string will likewise yield an empty string.

and additionally in the BUGS section:

    If the output string would exceed max bytes, errno is  not  set.
    This makes it impossible to distinguish this error case from cases
    where the format  string  legitimately  produces  a  zero-length
    output  string.  POSIX.1-2001 does not specify any errno settings
    for strftime().

So the below case tested by GLib is entirely out of scope of the
standard.


Corinna

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string
  2023-11-13 16:42           ` Corinna Vinschen
@ 2023-11-13 18:46             ` Brian Inglis
  0 siblings, 0 replies; 8+ messages in thread
From: Brian Inglis @ 2023-11-13 18:46 UTC (permalink / raw)
  To: cygwin-patches

On 2023-11-13 09:42, Corinna Vinschen wrote:
> Hi Pedro,
> 
> On Nov 11 18:29, Pedro Luis Castedo Cepeda wrote:
>> OK. It's not a newlib problem but a GLib one as it is relaying on common but
>> non-standard strftime implementation details.
>>
>> I attach a short program more focused in g_date_strftime implementation so
>> it can be evaluated if it worths addressing this corner case.
> 
> Tricky.  I wonder what the GLib test is actually trying to accomplish.
> 
> POSIX has this to say:
> 
>    RETURN VALUE
>      If the total number of resulting bytes including the  terminating
>      null byte  is not more than maxsize, these functions shall return
>      the number of bytes placed into the array pointed to by s, not
>      including the  terminating NUL character. Otherwise, 0 shall be
>      returned and the contents of the array are unspecified.
> 
>    ERRORS
>      No errors are defined.
> 
> But, and that's the big problem, POSIX does *not* provide for the
> error case, because it doesn't allow an error like using an incorrect
> format string to occur.  Using an incorrect or undefined format code
> is just not part of the standard.
> 
> And the Linux man page has an interesting extension to the above
> POSIX RETURN VALUE section:
> 
>      Note  that  the  return value 0 does not necessarily indicate an
>      error.  For example, in many locales %p yields an empty string.  An
>      empty  format string will likewise yield an empty string.
> 
> and additionally in the BUGS section:
> 
>      If the output string would exceed max bytes, errno is  not  set.
>      This makes it impossible to distinguish this error case from cases
>      where the format  string  legitimately  produces  a  zero-length
>      output  string.  POSIX.1-2001 does not specify any errno settings
>      for strftime().
> 
> So the below case tested by GLib is entirely out of scope of the
> standard.

C 2023 draft says:

"7.29.3.5 The strftime function
...
     Returns
8   If the total number of resulting characters including the terminating null
character is not more than maxsize, the strftime function returns the number of 
characters placed into the array pointed to by s not including the terminating 
null character. Otherwise, zero is returned and the members of the array have an 
indeterminate representation."

GLib is not tested under Cygwin/newlib only VS2017 and Msys2-Mingw32 native 
Windows:

https://gitlab.gnome.org/GNOME/glib/-/issues/2604

https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2480/diffs

https://gitlab.gnome.org/malureau/glib/-/pipelines/366193/test_report

[may need to navigate not always obvious "tabs"]

Should raise a followup issue there - and point out it is not a Win32 issue - it 
is another POSIX alternate libc issue - hosted under a Win64 environment.

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                 -- Antoine de Saint-Exupéry

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-11-13 18:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-09 19:04 [PATCH] fix(libc): Fix handle of %E & %O modifiers at end of format string Pedro Luis Castedo Cepeda
2023-11-10  6:17 ` Brian Inglis
2023-11-10 10:16   ` Corinna Vinschen
2023-11-10 17:44     ` Pedro Luis Castedo Cepeda
2023-11-11  5:57       ` Brian Inglis
2023-11-11 17:29         ` Pedro Luis Castedo Cepeda
2023-11-13 16:42           ` Corinna Vinschen
2023-11-13 18:46             ` Brian Inglis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).