public inbox for glibc-bugs@sourceware.org
* [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream
@ 2014-10-29 15:57 arjun.is at lostca dot se
  2014-10-29 18:03 ` [Bug stdio/17522] " carlos at redhat dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: arjun.is at lostca dot se @ 2014-10-29 15:57 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

            Bug ID: 17522
           Summary: `fputws' errors out when writing wide characters to
                    unbuffered stream
           Product: glibc
           Version: 2.21
            Status: NEW
          Severity: normal
          Priority: P2
         Component: stdio
          Assignee: unassigned at sourceware dot org
          Reporter: arjun.is at lostca dot se

I observed this bug when trying to modify libio/tst-fopenloc.c (which uses
fputws) to use test-skeleton.c (which un-buffers stdout).

When writing to an unbuffered stream, `fputws' seems to error out as soon as it
encounters a UTF-8 character that takes up more than one byte.

Reproducer: The below test case should print "Platform 9¾" to stdout and finish
successfully. It does not. fputws prints "Platform 9" then returns -1:

#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int
main (void)
{
  wchar_t buf[100] = L"Platform 9";

  setlocale (LC_ALL, "en_US.UTF-8");

  buf[10] = L'\xbe'; /* U+00BE VULGAR FRACTION THREE QUARTERS */
  buf[11] = L'\n';
  buf[12] = L'\0';

  /* Un-buffer stdout, as test-skeleton.c does.  */
  setvbuf (stdout, NULL, _IONBF, 0);

  /* On affected versions this prints "Platform 9" and then fails.  */
  if (fputws (buf, stdout) < 0)
    return 1;

  return 0;
}

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
From: schwab@linux-m68k.org @ 2014-10-29 16:50 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

--- Comment #1 from Andreas Schwab <schwab@linux-m68k.org> ---
The problem is that an unbuffered stream has only room for a single byte buffer
for code conversion.
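
To illustrate the size mismatch, here is a minimal sketch (not glibc code;
`encoded_length' is a hypothetical helper) that asks wcrtomb how many bytes a
wide character needs in the current locale's encoding. In UTF-8, U+00BE is
encoded as the two bytes 0xC2 0xBE, so it cannot fit in a one-byte conversion
buffer.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return the number of bytes WC occupies in the
   current locale's multibyte encoding, or -1 if it cannot be encoded.  */
int
encoded_length (wchar_t wc)
{
  char out[MB_LEN_MAX];   /* large enough for any single character */
  mbstate_t state;
  size_t n;

  memset (&state, 0, sizeof state);   /* start in the initial shift state */
  n = wcrtomb (out, wc, &state);
  return n == (size_t) -1 ? -1 : (int) n;
}
```

After a successful setlocale (LC_ALL, "en_US.UTF-8"), encoded_length (L'\xbe')
is expected to return 2 on glibc, while any ASCII character returns 1.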


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
@ 2014-10-29 18:03 ` carlos at redhat dot com
  2014-10-29 18:43 ` arjun.is at lostca dot se
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: carlos at redhat dot com @ 2014-10-29 18:03 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com

--- Comment #2 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Andreas Schwab from comment #1)
> The problem is that an unbuffered stream has only room for a single byte
> buffer for code conversion.

This is a bug in test-skeleton.c IMO, it should allocate a buffer large enough
for the test to succeed, but small enough that you still get output as quickly
as possible in the event of a crash.

Thus test-skeleton.c needs to be enhanced to allow the test to define the size
of the stdout buffer it needs and then that can be allocated and passed to
setvbuf?
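
A minimal sketch of this workaround, under stated assumptions:
`set_small_buffer' is a hypothetical helper name, and line buffering (_IOLBF)
is only one possible mode that keeps output prompt. Neither is taken from
test-skeleton.c.

```c
#include <stdio.h>

/* Hypothetical helper: replace FP's buffering with a small caller-provided
   buffer.  BUF must stay valid until FP is closed.  Returns 0 on success,
   nonzero on failure, exactly as setvbuf does.  */
int
set_small_buffer (FILE *fp, char *buf, size_t size)
{
  /* Assumption: line buffering, so complete lines still reach the
     terminal quickly if the test crashes.  */
  return setvbuf (fp, buf, _IOLBF, size);
}
```

A test could call it as `static char buf[64]; set_small_buffer (stdout, buf,
sizeof buf);' before producing any output.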


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
  2014-10-29 18:03 ` [Bug stdio/17522] " carlos at redhat dot com
@ 2014-10-29 18:43 ` arjun.is at lostca dot se
  2014-10-29 18:49 ` carlos at redhat dot com
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: arjun.is at lostca dot se @ 2014-10-29 18:43 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

--- Comment #3 from Arjun Shankar <arjun.is at lostca dot se> ---
(In reply to Carlos O'Donell from comment #2)
> This is a bug in test-skeleton.c IMO, it should allocate a buffer large
> enough for the test to succeed, but small enough that you still get output
> as quickly as possible in the event of a crash.
> 
> Thus test-skeleton.c needs to be enhanced to allow the test to define the
> size of the stdout buffer it needs and then that can be allocated and passed
> to setvbuf?

Reading the definitions of setvbuf [1] and fputws [2] didn't make it clear to
me that fputws is going to error out when writing a multi-byte character to an
unbuffered stream.

Andreas notes that there is a single byte buffer associated with unbuffered
streams. Is this single byte buffer present in accordance with some contract
offered by unbuffered streams? If not, then can I call the one byte buffer an
implementation detail? If it is an implementation detail, would it make sense
to associate each unbuffered stream with a buffer just wide enough to represent
one wide character in the chosen encoding scheme, instead of just one byte?
i.e. in the case of UTF-8, I guess this would mean a 4 byte buffer.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/setvbuf.html
[2] http://pubs.opengroup.org/onlinepubs/009695399/functions/fputws.html
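
For reference, standard C already exposes the per-character upper bound in
question: MB_CUR_MAX is the maximum number of bytes in a multibyte character
for the current locale, and MB_LEN_MAX is the compile-time bound across all
supported locales. A minimal sketch, where `max_char_bytes' is a hypothetical
helper:

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: the largest number of bytes a single character can
   need in the current locale.  The value depends on the locale selected
   with setlocale and on the C library.  */
size_t
max_char_bytes (void)
{
  return MB_CUR_MAX;
}
```

A conversion buffer of max_char_bytes () bytes, or simply MB_LEN_MAX bytes, is
enough to hold any one encoded character.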


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
  2014-10-29 18:03 ` [Bug stdio/17522] " carlos at redhat dot com
  2014-10-29 18:43 ` arjun.is at lostca dot se
@ 2014-10-29 18:49 ` carlos at redhat dot com
  2014-10-29 19:12 ` arjun.is at lostca dot se
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: carlos at redhat dot com @ 2014-10-29 18:49 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

--- Comment #4 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Arjun Shankar from comment #3)
> (In reply to Carlos O'Donell from comment #2)
> > This is a bug in test-skeleton.c IMO, it should allocate a buffer large
> > enough for the test to succeed, but small enough that you still get output
> > as quickly as possible in the event of a crash.
> > 
> > Thus test-skeleton.c needs to be enhanced to allow the test to define the
> > size of the stdout buffer it needs and then that can be allocated and passed
> > to setvbuf?
> 
> Reading the definitions of setvbuf [1] and fputws [2] didn't make it clear
> to me that fputws is going to error out when writing a multi-byte character
> to an unbuffered stream.

It's a QoI issue.

> Andreas notes that there is a single byte buffer associated with unbuffered
> streams. Is this single byte buffer present in accordance with some contract
> offered by unbuffered streams? If not, then can I call the one byte buffer
> an implementation detail? If it is an implementation detail, would it make
> sense to associate each unbuffered stream with a buffer just wide enough to
> represent one wide character in the chosen encoding scheme, instead of just
> one byte? i.e. in the case of UTF-8, I guess this would mean a 4 byte buffer.
> 
> [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/setvbuf.html
> [2] http://pubs.opengroup.org/onlinepubs/009695399/functions/fputws.html

That's right, it is an implementation detail.

Because UTF-8 is a variable-length encoding, you would need to print each
character as soon as it is complete, not because of the buffer size but
because the stream is unbuffered.

I don't know how much more work it would be to enhance the file stream support
to do this when unbuffered.

For example, printing ASCII, should just print right away, it's unbuffered, and
that's valid UTF-8. It should not be a naive implementation where you might
have 4 ASCII characters waiting in a buffer before being printed.
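
This behavior can be sketched in user code, as an illustration only:
`write_wide_unbuffered' is a hypothetical helper, not how libio is
implemented. It converts one wide character at a time with wcrtomb into a
scratch array of MB_LEN_MAX bytes and writes the result immediately, so ASCII
never waits behind a partially filled buffer and the conversion step never
splits a multibyte character.

```c
#include <limits.h>
#include <string.h>
#include <unistd.h>
#include <wchar.h>

/* Hypothetical helper, not glibc code: convert WS one wide character at a
   time and write each encoded character to FD as soon as it is complete.
   Returns the total number of bytes written, or -1 on error.  */
long
write_wide_unbuffered (int fd, const wchar_t *ws)
{
  char out[MB_LEN_MAX];   /* scratch space for one encoded character */
  mbstate_t state;
  long total = 0;

  memset (&state, 0, sizeof state);
  for (; *ws != L'\0'; ws++)
    {
      size_t n = wcrtomb (out, *ws, &state);
      size_t done = 0;

      if (n == (size_t) -1)
        return -1;            /* not representable in this locale */

      while (done < n)        /* write may accept fewer bytes than asked */
        {
          ssize_t w = write (fd, out + done, n - done);
          if (w < 0)
            return -1;
          done += (size_t) w;
        }
      total += (long) n;
    }
  return total;
}
```

The sketch does not retry on EINTR and ignores stream locking; it only shows
that per-character conversion needs scratch space for one character, not a
stream buffer.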

Does that answer your question?


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
                   ` (2 preceding siblings ...)
  2014-10-29 18:49 ` carlos at redhat dot com
@ 2014-10-29 19:12 ` arjun.is at lostca dot se
  2014-10-29 19:16 ` carlos at redhat dot com
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: arjun.is at lostca dot se @ 2014-10-29 19:12 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

--- Comment #5 from Arjun Shankar <arjun.is at lostca dot se> ---
(In reply to Carlos O'Donell from comment #4)
> It's a QoI issue.

Yes.

> For example, printing ASCII, should just print right away, it's unbuffered,
> and that's valid UTF-8. It should not be a naive implementation where you
> might have 4 ASCII characters waiting in a buffer before being printed.

Agreed. I incorrectly used the word 'buffer' in my comment. What I was trying
to say is that ideally, the encoder should have enough "internal memory" to
convert the internal representation of a wide character into the one used by
the current encoding scheme.

Which brings us back to the question of:

* we do this?:
> Thus test-skeleton.c needs to be enhanced to allow the test to define the
> size of the stdout buffer it needs and then that can be allocated and passed
> to setvbuf

* or this?:
> enhance the file stream support to do this when unbuffered.


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
                   ` (3 preceding siblings ...)
  2014-10-29 19:12 ` arjun.is at lostca dot se
@ 2014-10-29 19:16 ` carlos at redhat dot com
  2014-11-03  9:00 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: carlos at redhat dot com @ 2014-10-29 19:16 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

--- Comment #6 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Arjun Shankar from comment #5)
> * we do this?:
> > Thus test-skeleton.c needs to be enhanced to allow the test to define the
> > size of the stdout buffer it needs and then that can be allocated and passed
> > to setvbuf

This.

The quality of the implementation should not stop you from attaining your goal
of adding test-skeleton support to tests. That will enhance testing across the
board. However, now you are armed with enough information to justify *why* you
want this buffer size hack in test-skeleton.


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
                   ` (4 preceding siblings ...)
  2014-10-29 19:16 ` carlos at redhat dot com
@ 2014-11-03  9:00 ` cvs-commit at gcc dot gnu.org
  2014-11-03  9:03 ` schwab@linux-m68k.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2014-11-03  9:00 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

--- Comment #7 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  04b76b5aa8b2d1d19066e42dd1a56a38f34e274c (commit)
      from  4c6da7da9fb1f0f94e668e6d2966a4f50a7f0d85 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=04b76b5aa8b2d1d19066e42dd1a56a38f34e274c

commit 04b76b5aa8b2d1d19066e42dd1a56a38f34e274c
Author: Andreas Schwab <schwab@suse.de>
Date:   Thu Oct 30 12:18:48 2014 +0100

    Don't error out writing a multibyte character to an unbuffered stream
    (bug 17522)

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                                  |    8 ++++++++
 NEWS                                       |    2 +-
 libio/Makefile                             |    2 +-
 posix/tst-fnmatch3.c => libio/tst-fputws.c |   21 +++++++++++++++------
 libio/wfileops.c                           |   25 ++++++++++++++++++++-----
 5 files changed, 45 insertions(+), 13 deletions(-)
 copy posix/tst-fnmatch3.c => libio/tst-fputws.c (71%)


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
                   ` (5 preceding siblings ...)
  2014-11-03  9:00 ` cvs-commit at gcc dot gnu.org
@ 2014-11-03  9:03 ` schwab@linux-m68k.org
  2014-12-16 11:26 ` cvs-commit at gcc dot gnu.org
  2023-12-20 15:43 ` fweimer at redhat dot com
  8 siblings, 0 replies; 10+ messages in thread
From: schwab@linux-m68k.org @ 2014-11-03  9:03 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |2.21

--- Comment #8 from Andreas Schwab <schwab@linux-m68k.org> ---
Fixed in 2.21.


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
                   ` (6 preceding siblings ...)
  2014-11-03  9:03 ` schwab@linux-m68k.org
@ 2014-12-16 11:26 ` cvs-commit at gcc dot gnu.org
  2023-12-20 15:43 ` fweimer at redhat dot com
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2014-12-16 11:26 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

--- Comment #9 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  a0d424ef9d7fc34f7d1a516f38c8efb1e8692a03 (commit)
       via  8b460906cdb8ef1501fa5dcff54206b201e527d5 (commit)
       via  fa13e15b9a5cc49c9c6dee33084c3ff54d48e50e (commit)
       via  0e426475a70800b6a17daa7a8ebbafeabfcbc022 (commit)
      from  4f646bce1cae4031bfe7517e4793f1edc1a15220 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a0d424ef9d7fc34f7d1a516f38c8efb1e8692a03

commit a0d424ef9d7fc34f7d1a516f38c8efb1e8692a03
Author: Siddhesh Poyarekar <siddhesh@redhat.com>
Date:   Tue Dec 16 16:53:05 2014 +0530

    Fix 'array subscript is above array bounds' warning in res_send.c

    I see this warning in my build on F21 x86_64, which seems to be due to
    a weak check for array bounds.  Fixed by making the bounds check
    stronger.

    This is not an actual bug since nscount is never set to anything
    greater than MAXNS.  The compiler however does not know this, so we
    need the stronger bounds check to quieten the compiler.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8b460906cdb8ef1501fa5dcff54206b201e527d5

commit 8b460906cdb8ef1501fa5dcff54206b201e527d5
Author: Arjun Shankar <arjun.is@lostca.se>
Date:   Tue Dec 16 15:21:01 2014 +0530

    Modify libio/tst-fopenloc.c to use test-skeleton.c

    This test would earlier fail when run under test-skeleton.c due to
    bug #17522 in 'fputws'. That bug is now fixed and so this test may
    be modified.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fa13e15b9a5cc49c9c6dee33084c3ff54d48e50e

commit fa13e15b9a5cc49c9c6dee33084c3ff54d48e50e
Author: Arjun Shankar <arjun.is@lostca.se>
Date:   Tue Dec 16 15:19:51 2014 +0530

    Modify stdlib/tst-bsearch.c to use test-skeleton.c

    This test used to define a 'struct entry' that conflicts with the
    definition in search.h included in test-skeleton. The struct is
    now renamed 'item'.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0e426475a70800b6a17daa7a8ebbafeabfcbc022

commit 0e426475a70800b6a17daa7a8ebbafeabfcbc022
Author: Arjun Shankar <arjun.is@lostca.se>
Date:   Tue Dec 16 15:18:46 2014 +0530

    Modify stdio-common/tst-fseek.c to use test-skeleton.c

    This test needs a TIMEOUT longer than the default 2 seconds since it
    sleeps twice for a second each.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                |   15 +++++++++++++++
 libio/tst-fopenloc.c     |    7 +++++--
 resolv/res_send.c        |    2 +-
 stdio-common/tst-fseek.c |    8 ++++++--
 stdlib/tst-bsearch.c     |   27 +++++++++++++++------------
 5 files changed, 42 insertions(+), 17 deletions(-)


* [Bug stdio/17522] `fputws' errors out when writing wide characters to unbuffered stream
  2014-10-29 15:57 [Bug stdio/17522] New: `fputws' errors out when writing wide characters to unbuffered stream arjun.is at lostca dot se
                   ` (7 preceding siblings ...)
  2014-12-16 11:26 ` cvs-commit at gcc dot gnu.org
@ 2023-12-20 15:43 ` fweimer at redhat dot com
  8 siblings, 0 replies; 10+ messages in thread
From: fweimer at redhat dot com @ 2023-12-20 15:43 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17522

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://sourceware.org/bugz
                   |                            |illa/show_bug.cgi?id=31183

