public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Carlos O'Donell <carlos@redhat.com>
To: TAMUKI Shoichi <tamuki@linet.gr.jp>, libc-alpha@sourceware.org
Subject: Re: [PATCH v4] Improve the width of alternate representation for year in strftime [BZ #23758]
Date: Thu, 03 Jan 2019 14:43:00 -0000	[thread overview]
Message-ID: <de2b2db3-184b-e37c-af3c-704969803385@redhat.com> (raw)
In-Reply-To: <201812070655.AA04129@tamuki.linet.gr.jp>

On 12/7/18 1:55 AM, TAMUKI Shoichi wrote:
> The Japanese era name is scheduled to be changed on May 1, 2019.
> Prior to this, change the alternate representation for year in
> strftime to pad the number with zero to keep it constant width, so
> that prevent the trouble we saw in the past from becoming obvious
> again from the year after the era name changes onward.
> 
> Since only one Japanese era name is used by each emperor's reign as
> lately, it is rare that the year ends in one digit or lasts more than
> three digits.  In addition, the default width of month, day, hour,
> minute, and second is 2, so adjust the default width of year the same
> as them, and then the whole display balance is improved.  Therefore,
> it would be reasonable to change the width padding with zero of %Ey
> default to 2.
> 
> Currently in glibc, besides ja_JP locale, the locales using the
> conversion specifier above are lo_LA (Laos) and th_TH (Thailand).  In
> these locales, they use the Buddhist era.  The Buddhist era is a value
> obtained by adding 543 to the Christian era, so they are not affected
> by the change of the conversion specifier %Ey.
> 
> Furthermore, for the output string of the conversion specifier %EY,
> an optional flag is given to the conversion specifier so that it can
> be also used the current non-padding format, and the padding format
> can be controlled.  To achieve this, when an optional flag is given to
> the conversion specifier %EY, the %Ey included in the combined
> conversion specifier is interpreted as if decorated with the
> appropriate flag.

Please split this patch into 2 patches as soon as possible.

The fix for "%_EY"/"%-EY" should be reviewed separately as a bug fix,
and can go into master even if we are in a frozen state (we are currently
frozen for 2.29 release).

I want to review these changes in parallel so we can get them into 2.29
and be prepared for the Era name change.

Cheers,
Carlos.

> ChangeLog:
> 
> 	[BZ #23758]
> 	* NEWS: Mention the change.
> 	* manual/time.texi (strftime): Document the desctiption for %EC, %Ey,
> 	and %EY.
> 	* time/Makefile (tests): Add tst-strftime2.
> 	* time/strftime_l.c (__strftime_internal): Add argument yr_spec to
> 	override padding for %Ey.
> 	Change the width padding with zero of %Ey default to 2.  If an
> 	optional flag ('_' or '-') is specified to %EY, the %Ey in subformat
> 	is interpreted as if decorated with the appropriate flag.
> 	* time/tst-strftime2.c: New file.
> ---
>  Changes since v3:
>  - use support/test-driver.c instead of the old-style test-skeleton.c
>    in tst-strftime2.c, also correct the copyright year
>  - slightly modify the documentation including commit message
> 
>  Remarks:
>  At last, I have fulfilled all legal requirements, so you can review
>  my patch without worrying about the copyright issues.
> 
>  In this patch, all items pointed out so far have been fixed.
> 
>  My concern is the documentation.  Please review especially wording,
>  expression, etc.  (I am not an English native speaker.)
> 
>  NEWS                 |   7 +++
>  manual/time.texi     |  15 ++++++
>  time/Makefile        |   2 +-
>  time/strftime_l.c    |  28 ++++++-----
>  time/tst-strftime2.c | 133 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 173 insertions(+), 12 deletions(-)
>  create mode 100644 time/tst-strftime2.c
> 
> diff --git a/NEWS b/NEWS
> index 8483dcf4928..40e33e228f8 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -41,6 +41,13 @@ Major new features:
>    incosistent mutex state after fork call in multithread environment.
>    In both popen and system there is no direct access to user-defined mutexes.
>  
> +* Improve the width of alternate representation for year in strftime.
> +  For %Ey conversion specifier, the default action is now to pad the
> +  number with zero to keep minimum 2 digits as well as %y.  Also, the
> +  optional flag (either _ or -) can be used for %EY, so that the
> +  internal %Ey is interpreted as if decorated with the appropriate
> +  flag.
> +
>  Deprecated and removed features, and other changes affecting compatibility:
>  
>  * The glibc.tune tunable namespace has been renamed to glibc.cpu and the
> diff --git a/manual/time.texi b/manual/time.texi
> index 4d154452eba..c1d5770f9e0 100644
> --- a/manual/time.texi
> +++ b/manual/time.texi
> @@ -1393,6 +1393,9 @@ The preferred calendar time representation for the current locale.
>  The century of the year.  This is equivalent to the greatest integer not
>  greater than the year divided by 100.
>  
> +If the @code{E} modifier is specified (@code{%EC}), the locale's
> +alternate representation for year (the era name) is used instead.
> +
>  This format was first standardized by POSIX.2-1992 and by @w{ISO C99}.
>  
>  @item %d
> @@ -1568,10 +1571,22 @@ The preferred time of day representation for the current locale.
>  The year without a century as a decimal number (range @code{00} through
>  @code{99}).  This is equivalent to the year modulo 100.
>  
> +If the @code{E} modifier is specified (@code{%Ey}), the locale's
> +alternate representation for year (the era year) is used instead.  The
> +default action is to pad the number with zero to keep minimum 2 digits
> +as well as @code{%y}.
> +
>  @item %Y
>  The year as a decimal number, using the Gregorian calendar.  Years
>  before the year @code{1} are numbered @code{0}, @code{-1}, and so on.
>  
> +If the @code{E} modifier is specified (@code{%EY}), the locale's
> +alternate representation for year (generally the combination of
> +@code{%EC} and @code{%Ey}) is used instead.  In this case, the
> +optional flag (either @code{_} or @code{-}) can be used, so that the
> +internal @code{%Ey} is interpreted as if decorated with the
> +appropriate flag.
> +
>  @item %z
>  @w{RFC 822}/@w{ISO 8601:1988} style numeric time zone (e.g.,
>  @code{-0600} or @code{+0100}), or nothing if no time zone is
> diff --git a/time/Makefile b/time/Makefile
> index 743bd99f182..9f94e07e3bc 100644
> --- a/time/Makefile
> +++ b/time/Makefile
> @@ -43,7 +43,7 @@ tests	:= test_time clocktest tst-posixtz tst-strptime tst_wcsftime \
>  	   tst-getdate tst-mktime tst-mktime2 tst-ftime_l tst-strftime \
>  	   tst-mktime3 tst-strptime2 bug-asctime bug-asctime_r bug-mktime1 \
>  	   tst-strptime3 bug-getdate1 tst-strptime-whitespace tst-ftime \
> -	   tst-tzname tst-y2039 bug-mktime4
> +	   tst-tzname tst-y2039 bug-mktime4 tst-strftime2
>  
>  include ../Rules
>  
> diff --git a/time/strftime_l.c b/time/strftime_l.c
> index c71f9f47a95..8797341ea53 100644
> --- a/time/strftime_l.c
> +++ b/time/strftime_l.c
> @@ -434,7 +434,7 @@ static CHAR_T const month_name[][10] =
>  #endif
>  
>  static size_t __strftime_internal (CHAR_T *, size_t, const CHAR_T *,
> -				   const struct tm *, bool *
> +				   const struct tm *, int *, bool *
>  				   ut_argument_spec
>  				   LOCALE_PARAM) __THROW;
>  
> @@ -456,8 +456,9 @@ my_strftime (CHAR_T *s, size_t maxsize, const CHAR_T *format,
>    tmcopy = *tp;
>    tp = &tmcopy;
>  #endif
> +  int yr_spec = 0;		/* Override padding for %Ey.  */
>    bool tzset_called = false;
> -  return __strftime_internal (s, maxsize, format, tp, &tzset_called
> +  return __strftime_internal (s, maxsize, format, tp, &yr_spec, &tzset_called
>  			      ut_argument LOCALE_ARG);
>  }
>  #ifdef _LIBC
> @@ -466,7 +467,7 @@ libc_hidden_def (my_strftime)
>  
>  static size_t
>  __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
> -		     const struct tm *tp, bool *tzset_called
> +		     const struct tm *tp, int *yr_spec, bool *tzset_called
>  		     ut_argument_spec LOCALE_PARAM)
>  {
>  #if defined _LIBC && defined USE_IN_EXTENDED_LOCALE_MODEL
> @@ -820,7 +821,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
>  	  if (modifier == L_('O'))
>  	    goto bad_format;
>  #ifdef _NL_CURRENT
> -	  if (! (modifier == 'E'
> +	  if (! (modifier == L_('E')
>  		 && (*(subfmt =
>  		       (const CHAR_T *) _NL_CURRENT (LC_TIME,
>  						     NLW(ERA_D_T_FMT)))
> @@ -838,11 +839,12 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
>  	  {
>  	    CHAR_T *old_start = p;
>  	    size_t len = __strftime_internal (NULL, (size_t) -1, subfmt,
> -					      tp, tzset_called ut_argument
> -					      LOCALE_ARG);
> +					      tp, yr_spec, tzset_called
> +					      ut_argument LOCALE_ARG);
>  	    add (len, __strftime_internal (p, maxsize - i, subfmt,
> -					   tp, tzset_called ut_argument
> -					   LOCALE_ARG));
> +					   tp, yr_spec, tzset_called
> +					   ut_argument LOCALE_ARG));
> +	    *yr_spec = 0;
>  
>  	    if (to_uppcase)
>  	      while (old_start < p)
> @@ -917,7 +919,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
>  #ifdef _NL_CURRENT
>  	  if (! (modifier == L_('E')
>  		 && (*(subfmt =
> -		       (const CHAR_T *)_NL_CURRENT (LC_TIME, NLW(ERA_D_FMT)))
> +		       (const CHAR_T *) _NL_CURRENT (LC_TIME, NLW(ERA_D_FMT)))
>  		     != L_('\0'))))
>  	    subfmt = (const CHAR_T *) _NL_CURRENT (LC_TIME, NLW(D_FMT));
>  	  goto subformat;
> @@ -1262,7 +1264,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
>  	  DO_NUMBER (1, tp->tm_wday);
>  
>  	case L_('Y'):
> -	  if (modifier == 'E')
> +	  if (modifier == L_('E'))
>  	    {
>  #if HAVE_STRUCT_ERA_ENTRY
>  	      struct era_entry *era = _nl_get_era_entry (tp HELPER_LOCALE_ARG);
> @@ -1273,6 +1275,8 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
>  # else
>  		  subfmt = era->era_format;
>  # endif
> +		  if (pad != 0)
> +		    *yr_spec = pad;
>  		  goto subformat;
>  		}
>  #else
> @@ -1294,7 +1298,9 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
>  	      if (era)
>  		{
>  		  int delta = tp->tm_year - era->start_date[0];
> -		  DO_NUMBER (1, (era->offset
> +		  if (*yr_spec != 0)
> +		    pad = *yr_spec;
> +		  DO_NUMBER (2, (era->offset
>  				 + delta * era->absolute_direction));
>  		}
>  #else
> diff --git a/time/tst-strftime2.c b/time/tst-strftime2.c
> new file mode 100644
> index 00000000000..6f2cf4ddd81
> --- /dev/null
> +++ b/time/tst-strftime2.c
> @@ -0,0 +1,133 @@
> +/* Verify the behavior of strftime on alternate representation for year.
> +
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <locale.h>
> +#include <time.h>
> +#include <stdio.h>
> +#include <string.h>
> +
> +static const char *locales[] = { "ja_JP.UTF-8", "lo_LA.UTF-8", "th_TH.UTF-8" };
> +#define nlocales (sizeof (locales) / sizeof (locales[0]))
> +
> +static const char *formats[] = { "%EY", "%_EY", "%-EY" };
> +#define nformats (sizeof (formats) / sizeof (formats[0]))
> +
> +static const struct
> +{
> +  const int d, m, y;
> +} dates[] =
> +  {
> +    { 1, 3, 88 },
> +    { 7, 0, 89 },
> +    { 8, 0, 89 },
> +    { 1, 3, 90 },
> +    { 1, 3, 97 },
> +    { 1, 3, 98 }
> +  };
> +#define ndates (sizeof (dates) / sizeof (dates[0]))
> +
> +static char ref[nlocales][nformats][ndates][100];
> +
> +static void
> +mkreftable (void)
> +{
> +  int i, j, k;
> +  char era[10];
> +  static const int yrj[] = { 63, 64, 1, 2, 9, 10 };
> +  static const int yrb[] = { 2531, 2532, 2532, 2533, 2540, 2541 };
> +
> +  for (i = 0; i < nlocales; i++)
> +    for (j = 0; j < nformats; j++)
> +      for (k = 0; k < ndates; k++)
> +	{
> +	  if (i == 0)
> +	    {
> +	      sprintf (era, "%s", (k < 2) ? "\xe6\x98\xad\xe5\x92\x8c"
> +					  : "\xe5\xb9\xb3\xe6\x88\x90");
> +	      if (yrj[k] == 1)
> +		sprintf (ref[i][j][k], "%s\xe5\x85\x83\xe5\xb9\xb4", era);
> +	      else
> +		{
> +		  if (j == 0)
> +		    sprintf (ref[i][j][k], "%s%02d\xe5\xb9\xb4", era, yrj[k]);
> +		  else if (j == 1)
> +		    sprintf (ref[i][j][k], "%s%2d\xe5\xb9\xb4", era, yrj[k]);
> +		  else
> +		    sprintf (ref[i][j][k], "%s%d\xe5\xb9\xb4", era, yrj[k]);
> +		}
> +	    }
> +	  else if (i == 1)
> +	    {
> +	      sprintf (era, "\xe0\xba\x9e\x2e\xe0\xba\xaa\x2e ");
> +	      sprintf (ref[i][j][k], "%s%d", era, yrb[k]);
> +	    }
> +	  else
> +	    {
> +	      sprintf (era, "\xe0\xb8\x9e\x2e\xe0\xb8\xa8\x2e ");
> +	      sprintf (ref[i][j][k], "%s%d", era, yrb[k]);
> +	    }
> +	}
> +}
> +
> +static int
> +do_test (void)
> +{
> +  int i, j, k, result = 0;
> +  struct tm ttm;
> +  char date[11], buf[100];
> +  size_t r, e;
> +
> +  mkreftable ();
> +  for (i = 0; i < nlocales; i++)
> +    {
> +      if (setlocale (LC_ALL, locales[i]) == NULL)
> +	{
> +	  printf ("locale %s does not exist, skipping...\n", locales[i]);
> +	  continue;
> +	}
> +      printf ("[%s]\n", locales[i]);
> +      for (j = 0; j < nformats; j++)
> +	{
> +	  for (k = 0; k < ndates; k++)
> +	    {
> +	      ttm.tm_mday = dates[k].d;
> +	      ttm.tm_mon  = dates[k].m;
> +	      ttm.tm_year = dates[k].y;
> +	      strftime (date, sizeof (date), "%F", &ttm);
> +	      r = strftime (buf, sizeof (buf), formats[j], &ttm);
> +	      e = strlen (ref[i][j][k]);
> +	      printf ("%s\t\"%s\"\t\"%s\"", date, formats[j], buf);
> +	      if (strcmp (buf, ref[i][j][k]) != 0)
> +		{
> +		  printf ("\tshould be \"%s\"", ref[i][j][k]);
> +		  if (r != e)
> +		    printf ("\tgot: %zu, expected: %zu", r, e);
> +		  result = 1;
> +		}
> +	      else
> +		printf ("\tOK");
> +	      putchar ('\n');
> +	    }
> +	  putchar ('\n');
> +	}
> +    }
> +  return result;
> +}
> +
> +#include <support/test-driver.c>
> 


-- 
Cheers,
Carlos.

  parent reply	other threads:[~2019-01-03 14:43 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-07  7:01 TAMUKI Shoichi
2018-12-14 12:44 ` TAMUKI Shoichi
2018-12-21 10:13   ` Rafal Luzynski
2018-12-22 11:16     ` TAMUKI Shoichi
2019-01-03 14:34       ` Carlos O'Donell
2018-12-28 11:18 ` Rafal Luzynski
2019-01-06  6:31   ` TAMUKI Shoichi
2019-01-07 16:23     ` Joseph Myers
2019-01-08  9:08       ` TAMUKI Shoichi
2019-01-03 14:43 ` Carlos O'Donell [this message]
2019-01-03 22:53   ` Rafal Luzynski
2019-01-12 13:38   ` TAMUKI Shoichi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=de2b2db3-184b-e37c-af3c-704969803385@redhat.com \
    --to=carlos@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=tamuki@linet.gr.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).