public inbox for glibc-bugs@sourceware.org
* [Bug stdio/28943] New: printf field width specifier is inconsistent between %d and %f for multibyte output
@ 2022-03-04  8:12 fweimer at redhat dot com
  2022-03-04  8:44 ` [Bug stdio/28943] " fweimer at redhat dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: fweimer at redhat dot com @ 2022-03-04  8:12 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28943

            Bug ID: 28943
           Summary: printf field width specifier is inconsistent between
                    %d and %f for multibyte output
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: stdio
          Assignee: unassigned at sourceware dot org
          Reporter: fweimer at redhat dot com
  Target Milestone: ---

With the ' and I flags, numeric conversions can include non-ASCII multibyte
characters. Presently, field width is handled differently for %f and %d
conversions. In %f conversions, the width is based on the number of
digits/separators before output encoding (each has width 1); in %d conversions,
the width is based on bytes.
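
As a rough illustration of the two measures (not glibc's implementation), the
sketch below counts a digit string in bytes and in characters and derives the
padding each measure would produce. The helper names width_in_bytes,
width_in_chars, and padding are hypothetical, and the input is assumed to be
valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Number of bytes in s: the measure the POSIX wording describes. */
size_t width_in_bytes(const char *s)
{
    return strlen(s);
}

/* Number of characters in s, assuming s is valid UTF-8: every byte that
   is not a continuation byte (bit pattern 10xxxxxx) starts a character. */
size_t width_in_chars(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (((unsigned char) *s & 0xC0) != 0x80)
            ++n;
    return n;
}

/* Padding needed to reach field_width for an already measured width;
   no padding is added when the value is at least as wide as the field. */
size_t padding(size_t field_width, size_t measured)
{
    return measured < field_width ? field_width - measured : 0;
}
```

For the Extended Arabic-Indic digits U+06F1 to U+06F3, encoded in UTF-8 as
"\xDB\xB1\xDB\xB2\xDB\xB3" (6 bytes, 3 characters), a field width of 8 yields
2 padding spaces when counting bytes and 5 when counting characters.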

POSIX.1-2017 says:

“
An optional minimum field width.  If the converted value has fewer bytes than
the field width, it shall be padded with <space> characters by default on the
left; it shall be padded on the right if the left-adjustment flag ('-'),
described below, is given to the field width.
”

This suggests the %d behavior. (POSIX only specifies ', but not I.) But this
makes printf inconsistent with wprintf. Counting bytes also does not seem very
useful for the column alignment of numbers, particularly combined with
localization. So I think glibc should adopt character counts, not byte counts.

This only applies to the numeric conversions; for (non-wide) strings, field
width should continue to be measured in bytes, not multibyte characters.
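
The byte-based behavior for non-wide strings can be observed with a small
sketch like the following. The wrapper name format_string_field is
hypothetical; it only forwards to snprintf with a "%*s" conversion.

```c
#include <stdio.h>
#include <string.h>

/* Format s right-aligned in a field of the given width.  Returns the
   number of bytes snprintf reports for the full result (excluding the
   terminating NUL), or 0 if snprintf fails.  The result is truncated
   if out_size is too small. */
size_t format_string_field(char *out, size_t out_size, int width,
                           const char *s)
{
    int n = snprintf(out, out_size, "%*s", width, s);
    return n < 0 ? 0 : (size_t) n;
}
```

With a width of 5 and the 2-byte UTF-8 string "\xC3\xA9" (a single character,
U+00E9), the result is 3 spaces followed by the 2 bytes: 5 bytes in total, but
only 4 characters.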

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug stdio/28943] printf field width specifier is inconsistent between %d and %f for multibyte output
  2022-03-04  8:12 [Bug stdio/28943] New: printf field width specifier is inconsistent between %d and %f for multibyte output fweimer at redhat dot com
@ 2022-03-04  8:44 ` fweimer at redhat dot com
  2022-05-23  9:08 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: fweimer at redhat dot com @ 2022-03-04  8:44 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28943

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://sourceware.org/bugz
                   |                            |illa/show_bug.cgi?id=28944


* [Bug stdio/28943] printf field width specifier is inconsistent between %d and %f for multibyte output
  2022-03-04  8:12 [Bug stdio/28943] New: printf field width specifier is inconsistent between %d and %f for multibyte output fweimer at redhat dot com
  2022-03-04  8:44 ` [Bug stdio/28943] " fweimer at redhat dot com
@ 2022-05-23  9:08 ` cvs-commit at gcc dot gnu.org
  2023-02-06 17:42 ` carlos at redhat dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-23  9:08 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28943

--- Comment #1 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Florian Weimer <fw@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=21bb8382b62f7dc20b9936bab32658e8fd5952e0

commit 21bb8382b62f7dc20b9936bab32658e8fd5952e0
Author: Florian Weimer <fweimer@redhat.com>
Date:   Mon May 23 10:08:18 2022 +0200

    stdio-common: Add tst-vfprintf-width-i18n to cover numeric field width

    Related to bug 28943 and bug 28944.

    Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>


* [Bug stdio/28943] printf field width specifier is inconsistent between %d and %f for multibyte output
  2022-03-04  8:12 [Bug stdio/28943] New: printf field width specifier is inconsistent between %d and %f for multibyte output fweimer at redhat dot com
  2022-03-04  8:44 ` [Bug stdio/28943] " fweimer at redhat dot com
  2022-05-23  9:08 ` cvs-commit at gcc dot gnu.org
@ 2023-02-06 17:42 ` carlos at redhat dot com
  2023-09-25 12:39 ` fweimer at redhat dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: carlos at redhat dot com @ 2023-02-06 17:42 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28943

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com

--- Comment #2 from Carlos O'Donell <carlos at redhat dot com> ---
I think it makes complete sense that, where we talk about multi-byte characters
(logical columns), we talk about character counts for width, just as for
wide interfaces we talk about wchar_t counts for width.

We cannot change POSIX at this point so I won't delve into the fact that width
for POSIX should also have been specified in terms of characters. The logical
column concept works if the output digits, grouping character, and <space> are
all 1 byte in width, but otherwise doesn't and requires translating to physical
columns for display (if you are going to display at all).

So the open question here is: Should we change I to make things consistent?

I think we should, because an interface based on character counts, not bytes,
is better to use and is aligned with wprintf.

In summary:
- I agree that "I" should use character counts, not byte counts, and we should
make it consistent between %f and %d.
- We should consider the security impact of making this change and evaluate
current uses of the interface in known distributions.


* [Bug stdio/28943] printf field width specifier is inconsistent between %d and %f for multibyte output
  2022-03-04  8:12 [Bug stdio/28943] New: printf field width specifier is inconsistent between %d and %f for multibyte output fweimer at redhat dot com
                   ` (2 preceding siblings ...)
  2023-02-06 17:42 ` carlos at redhat dot com
@ 2023-09-25 12:39 ` fweimer at redhat dot com
  2023-09-25 13:50 ` vincent-srcware at vinc17 dot net
  2024-03-25 14:34 ` schwab@linux-m68k.org
  5 siblings, 0 replies; 7+ messages in thread
From: fweimer at redhat dot com @ 2023-09-25 12:39 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28943

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://sourceware.org/bugz
                   |                            |illa/show_bug.cgi?id=30883


* [Bug stdio/28943] printf field width specifier is inconsistent between %d and %f for multibyte output
  2022-03-04  8:12 [Bug stdio/28943] New: printf field width specifier is inconsistent between %d and %f for multibyte output fweimer at redhat dot com
                   ` (3 preceding siblings ...)
  2023-09-25 12:39 ` fweimer at redhat dot com
@ 2023-09-25 13:50 ` vincent-srcware at vinc17 dot net
  2024-03-25 14:34 ` schwab@linux-m68k.org
  5 siblings, 0 replies; 7+ messages in thread
From: vincent-srcware at vinc17 dot net @ 2023-09-25 13:50 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28943

Vincent Lefèvre <vincent-srcware at vinc17 dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vincent-srcware at vinc17 dot net

--- Comment #3 from Vincent Lefèvre <vincent-srcware at vinc17 dot net> ---
Note that ISO C is involved too when the decimal-point character is encoded on
several bytes (e.g. in the ps_AF locale, 2 bytes in UTF-8). See bug 30883.

(In reply to Florian Weimer from comment #0)
> This only applies to the numeric conversions; for (non-wide) strings, field
> width should continue to be measured in bytes, not multibyte characters.

This alone would introduce an inconsistency.

Note also that if you want column alignment, you would need to take into
account the real width of the characters, which may take 0, 1 or 2 columns. So
the character count would not be correct.


* [Bug stdio/28943] printf field width specifier is inconsistent between %d and %f for multibyte output
  2022-03-04  8:12 [Bug stdio/28943] New: printf field width specifier is inconsistent between %d and %f for multibyte output fweimer at redhat dot com
                   ` (4 preceding siblings ...)
  2023-09-25 13:50 ` vincent-srcware at vinc17 dot net
@ 2024-03-25 14:34 ` schwab@linux-m68k.org
  5 siblings, 0 replies; 7+ messages in thread
From: schwab@linux-m68k.org @ 2024-03-25 14:34 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28943

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dreibh at simula dot no

--- Comment #4 from Andreas Schwab <schwab@linux-m68k.org> ---
*** Bug 31542 has been marked as a duplicate of this bug. ***

