I made several of the changes Paul suggested. And ran all the tests. This patch does not fill out the tm_zone information because the manpage for the api lists this a format character where "the corresponding fields are parsed, but no field in tm is changed." I feel that if this is the behaviour developers have expected & written around in the past, that behaviour should stay. However, if everyone here feels that a full implementation should be written, I can do so. On Tue, Jul 18, 2023 at 12:17 PM Stanley Lancaster wrote: > --- > time/strptime_l.c | 16 +++++++++++----- > time/tst-strptime.c | 2 ++ > 2 files changed, 13 insertions(+), 5 deletions(-) > > diff --git a/time/strptime_l.c b/time/strptime_l.c > index 85c3249fcc..2382defc92 100644 > --- a/time/strptime_l.c > +++ b/time/strptime_l.c > @@ -770,11 +770,17 @@ __strptime_internal (const char *rp, const char > *fmt, struct tm *tmp, > break; > case 'Z': > /* Read timezone but perform no conversion. */ > - while (ISSPACE (*rp)) > - rp++; > - while (!ISSPACE (*rp) && *rp != '\0') > - rp++; > - break; > + { > + while (ISSPACE (*rp)) > + rp++; > + /* Read time zone but perform no conversion. Recognize the format > [-+a-zA-Z0-9]{3,}. */ > + const char* start_rp = rp; > + while ((*rp >= 'A' && *rp <= 'Z') || (*rp >= 'a' && *rp <= 'z') || (*rp > >= '0' && *rp <= '9')) > + rp++; > + if (start_rp+3 < rp) > + return NULL; > + } > case 'z': > /* We recognize four formats: > 1. Two digits specify hours. > diff --git a/time/tst-strptime.c b/time/tst-strptime.c > index 3dae9e0594..40145cb109 100644 > --- a/time/tst-strptime.c > +++ b/time/tst-strptime.c > @@ -48,6 +48,8 @@ static const struct > 6, 0, 0, 1 }, > { "en_US.ISO-8859-1", "2000-01-01 08:12:21 PM", "%Y-%m-%d %I:%M:%S %p", > 6, 0, 0, 1 }, > + { "en_US.ISO-8859-1", "2000-01-01 08:12:21 AM CST/", "%Y-%m-%d %I:%M:%S > %p %Z/", > + 6, 0, 0, 1}, > { "ja_JP.EUC-JP", "2001 20 \xb7\xee", "%Y %U %a", 1, 140, 4, 21 }, > { "ja_JP.EUC-JP", "2001 21 \xb7\xee", "%Y %W %a", 1, 140, 4, 21 }, > /* Most of the languages do not need the declension of the month names > -- > 2.39.3 > > On Fri, Jul 14, 2023 at 11:11 AM Paul Eggert wrote: > >> On 2023-07-14 07:52, Stanley Lancaster via Libc-alpha wrote: >> >> > /* Read timezone but perform no conversion. */ >> > + /* we recognize the format [-+a-zA-Z0-9]{3,} */ >> >> Use GNU style in comment with active voice sentences and two spaces >> after sentence end, e.g., "/* Read time zone but perform no conversion. >> Recognize the format [-+a-zA-Z0-9]{3,}. */". >> >> >> > + const char* stop_rp = rp + 3; >> >> Again, GNU style: "char *stop_rp" not "char* stop_rp". >> >> More important, this has undefined behavior if rp + 3 overflows. >> Instead, count the number of bytes after the loop finishes, and make >> sure it's 3 or more. >> >> >> > + while (((*rp >= 'A' && *rp <= 'Z') || (*rp >= 'a' && *rp <= >> 'z') || (*rp >= '0' && *rp <= '9')) && (rp < stop_rp) && *rp != '\0') >> >> Omit "&& *rp != '\0'"; it's redundant. Reindent to 80 columns. >> >> >> > + { "C", "1999CST0502123412", "%Y%Z%m%d%H%M%S", 0, 121, 4, 2 }, >> >> I don't see how this test passes. "CST0502123412" is treated as a time >> zone abbreviation, so the only info is the year. Did you run the tests? >> If so, why did this test pass? If not, please run. >> >> Since the patch does not fix BZ#16088, it needs a commit message that >> describes what the patch does and why it's a win even though it doesn't >> fix. In particular, the patch does not set tm_zone, and there's a reason >> for that, and this should be explained. >> >