From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 10349 invoked by alias); 15 Mar 2003 14:56:01 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 10288 invoked by uid 71); 15 Mar 2003 14:56:01 -0000
Date: Sat, 15 Mar 2003 14:56:00 -0000
Message-ID: <20030315145601.10286.qmail@sources.redhat.com>
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org,
From: Pétur Runólfsson
Subject: RE: libstdc++/10094: Regression: wfilebuf::showmanyc return value too high.
Reply-To: Pétur Runólfsson
X-SW-Source: 2003-03/txt/msg00980.txt.bz2
List-Id:

The following reply was made to PR libstdc++/10094; it has been noted by GNATS.

From: Pétur Runólfsson
To: , , , Pétur Runólfsson ,
Cc:
Subject: RE: libstdc++/10094: Regression: wfilebuf::showmanyc return value too high.
Date: Sat, 15 Mar 2003 14:51:58 -0000

> Synopsis: Regression: wfilebuf::showmanyc return value too high.
>
> State-Changed-From-To: open->analyzed
> State-Changed-By: paolo
> State-Changed-When: Sat Mar 15 13:32:25 2003
> State-Changed-Why:
>     Ok, however the (interesting) analysis is science fiction
>     ;-) : the new showmanyc it's still in review!!

Indeed :-) seems there is a bug in the old implementation...

>     Anyway, what to do here? Which kind of estimate is not
>     affected by the problem?!?

Obviously, if codecvt::always_noconv() == true, this isn't a problem.
However, it seems unreasonable to require showmanyc to read and convert
the characters before returning.

I think the real problem here is that underflow() shouldn't return eof()
when codecvt::in() fails. Consider this example:

    wifstream stream;
    stream.imbue(locale("se_NO.UTF-8"));
    // foo begins with the bytes \xc0\xff followed by lots of
    // stuff.
    stream.open("foo");
    int n;
    stream >> n;
    wifstream::iostate s = stream.rdstate();

What is the value of s?

wistream::operator>>(int&) calls rdbuf()->sbumpc() or sgetc(), which
call underflow(). underflow() reads bytes from the file and attempts to
convert them with codecvt::in(), which fails. Currently, this causes
underflow() to return eof(), which causes sgetc() to return eof(), which
means that s is set to eofbit | failbit.

I see nothing in the standard that requires this behaviour (*). However,
eofbit is described as follows:

    eofbit: indicates that an input operation reached the end of an
    input sequence;

This appears to mean that eofbit should *not* be set when underflow()
hits a conversion error. badbit, however, seems just right:

    badbit: indicates a loss of integrity in an input or output
    sequence (such as an irrecoverable read error from a file);

So it seems that

    !(s & eofbit) && (s & badbit)

should hold. The only reasonable way to achieve this is for underflow()
to throw an exception (probably of type ios_base::failure).

Back to showmanyc: If underflow() throws an exception on conversion
errors, the proposed implementation of showmanyc is OK, since showmanyc
only guarantees that underflow() doesn't return eof() until the
specified number of characters has been read. There is a footnote
indicating that it is OK for underflow() to throw an exception at any
time.

Petur

(*) Indeed, I don't see anything about what underflow() should do when
codecvt::in() fails. If you know of anything please let me know :-)
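
To make the proposed state concrete, here is a minimal sketch (not the libstdc++ implementation) of what happens when underflow() throws instead of returning eof(). The class failing_wbuf and the function state_after_extraction are hypothetical names used only for illustration. The buffer stands in for a wfilebuf whose conversion fails; as an assumption made to keep the sketch portable, it hands out one valid digit first, so that the exception is raised inside the extraction itself rather than while the sentry is being constructed.

```cpp
#include <ios>
#include <istream>
#include <streambuf>

// Hypothetical stand-in for a wfilebuf whose codecvt::in() fails.
// It delivers a single L'1' and then reports the "conversion error"
// by throwing, instead of returning traits_type::eof().
class failing_wbuf : public std::wstreambuf {
    wchar_t first_ = L'1';
    bool delivered_ = false;

protected:
    int_type underflow() override {
        if (!delivered_) {
            delivered_ = true;
            // Expose the one good character through the get area.
            setg(&first_, &first_, &first_ + 1);
            return traits_type::to_int_type(first_);
        }
        // Assumed policy from the message: signal the failure by throwing.
        throw std::ios_base::failure("conversion error");
    }
};

// Runs a formatted extraction on the failing buffer and returns the
// resulting stream state.
std::ios_base::iostate state_after_extraction() {
    failing_wbuf buf;
    std::wistream in(&buf);
    int n = 0;
    in >> n;  // the formatted input function catches the exception
    return in.rdstate();
}
```

With the default exception mask, the formatted input function swallows the exception and records it as badbit, so the returned state should satisfy !(s & eofbit) && (s & badbit). Whether failbit is also set is left open here.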