From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 28964 invoked by alias); 18 Apr 2002 02:26:01 -0000 Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-prs-owner@gcc.gnu.org Received: (qmail 28947 invoked by uid 71); 18 Apr 2002 02:26:01 -0000 Date: Wed, 17 Apr 2002 19:26:00 -0000 Message-ID: <20020418022601.28946.qmail@sources.redhat.com> To: jason@gcc.gnu.org Cc: gcc-prs@gcc.gnu.org, From: Jason Merrill Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code Reply-To: Jason Merrill X-SW-Source: 2002-04/txt/msg00909.txt.bz2 List-Id: The following reply was made to PR libstdc++/4150; it has been noted by GNATS. From: Jason Merrill To: libstdc++@gcc.gnu.org Cc: gcc-gnats@gcc.gnu.org Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code Date: Thu, 18 Apr 2002 03:18:17 +0100 --=-=-= >>>>> "Jason" == Jason Merrill writes: > Which is what libio does, and which would seem to be the only way to avoid > this sort of problem. Can anyone enlighten me as to the design > goals/decisions of the current implementation? The current implementation seems to be trying to allow interleaved reads and writes on the same buffer, whereas the libio implementation requires a flush when switching modes. Was this difference a deliberate design decision? I'm a bit skeptical of it's usefulness; how often do people really read a few characters, write a few characters, and then read a few more characters without an intervening seek? A problem with the current implementation of this is that if we do a read on an input/output filebuf, we end up writing the contents of the buffer back out to the file, even if we've never requested a write. Oops. It would be possible to avoid this by changing how we manage the various pointers into the buffer; when reading, if _M_out_beg == _M_out_cur we can bump them together. If then we only write out the characters between them, there won't be anything to write unless we actually requested a write. I was thinking that we couldn't change this, because a lot of the adjustment happens in non-virtual functions like sbumpc. But then of course the semantics specified in the standard (i.e. sbumpc just advances the read pointer) don't support proper filebuf semantics on the existing scheme either; if the read pointer is bumped, and the write pointer is also active, it needs to be bumped as well. In libio, this problem is avoided by the mode switch; when it switches from read to write, the write pointer is copied from the read pointer. In v3, this problem is handled, basically, by contaminating streambuf with information about filebuf semantics. We need the streambuf member functions to adjust _M_out_cur with _M_in_cur, so we add a flag (_M_buf_unified) that says so, and handle the logic in the _M_*_cur_move functions. Similarly, for the benefit of stringbuf, we pretend that we can just bump _M_out_end if we run up against it and we happen to know that there's still room in the buffer. It seems unfortunate to me that we need special hacks in streambuf to support both of the standard derived streambufs. Both seem to be for optimization; the filebuf hack to allow reading and writing on the same buffer, and the stringbuf hack to avoid having to call overflow() through the vtable for each character we want to add to the end of the string. Am I right? Anyway, here's a patch to clean up filebuf somewhat. It involves these design changes: 1) The external file pointer always corresponds to _M_in_end. Previously, it corresponded to _M_in_beg (thus the seeks). 2) _M_out_end always points to the end of the buffer. Previously, it would point to the end of the valid contents, and we would rely on _M_out_cur_move to push it forward if there was still room in the buffer, as it still does for stringbufs. We don't need to do this for filebufs, as _M_out_end doesn't have any special meaning like it does for stringbufs. 3) _M_out_beg always points to the beginning of the write area. If we haven't written anything to the buffer, _M_out_cur == _M_out_beg. Previously the area between the two could consist entirely of characters that were read. 4) We now override uflow() in filebuf, so we don't have to unget the character read by underflow() in the sync_with_stdio case. logauswerter.C times: synced not synced 2.96 0:19 0:19 pre-patch 1:09 0:14 post-patch 0:26 0:13 Interesting that v3 is faster than v2 when not synced with stdio... I feel like I know my way around streambufs a lot better now. Tested i686-pc-linux-gnu, no regressions. Any objections? Jason 2002-04-18 Jason Merrill PR libstdc++/4150 * include/std/std_streambuf.h (basic_streambuf::_M_in_cur_move): Also bump _M_out_beg if _M_buf_unified. (basic_streambuf::_M_out_cur_move): Don't bump _M_in_cur past _M_in_end. (basic_streambuf::_M_set_indeterminate): Move to filebuf. (basic_streambuf::_M_set_determinate): Likewise. (basic_streambuf::_M_is_indeterminate): Likewise. * include/bits/std_fstream.h (basic_filebuf::_M_really_underflow): New non-static member function. (basic_filebuf::_M_underflow, _M_uflow): Call it. (basic_filebuf::sync): seekpos will handle all the positioning. (basic_filebuf::_M_set_indeterminate): Move here from streambuf. Always set _M_out_end to the end of the buffer. (basic_filebuf::_M_set_determinate): Likewise. (basic_filebuf::_M_is_indeterminate): Likewise. * include/bits/fstream.tcc (basic_filebuf::close): Stuff to write is from _M_out_beg to _M_out_cur. (basic_filebuf::overflow): Likewise. (basic_filebuf::_M_really_overflow): Likewise. Seek back to _M_out_beg if necessary. (basic_filebuf::seekoff): Likewise. (basic_filebuf::_M_really_underflow): Generalization of old underflow(). Don't seek back to _M_in_beg. * testsuite/27_io/filebuf_virtuals.cc (test05): Don't overspecify ungetc test. --=-=-= Content-Type: text/x-patch Content-Disposition: inline *** ./include/bits/fstream.tcc.~1~ Fri Apr 12 11:28:30 2002 --- ./include/bits/fstream.tcc Thu Apr 18 02:16:12 2002 *************** namespace std *** 183,189 **** if (this->is_open()) { const int_type __eof = traits_type::eof(); ! bool __testput = _M_out_cur && _M_out_beg < _M_out_end; if (__testput && _M_really_overflow(__eof) == __eof) return __ret; --- 183,189 ---- if (this->is_open()) { const int_type __eof = traits_type::eof(); ! bool __testput = _M_out_cur && _M_out_beg < _M_out_cur; if (__testput && _M_really_overflow(__eof) == __eof) return __ret; *************** namespace std *** 242,248 **** template typename basic_filebuf<_CharT, _Traits>::int_type basic_filebuf<_CharT, _Traits>:: ! underflow() { int_type __ret = traits_type::eof(); bool __testin = _M_mode & ios_base::in; --- 242,248 ---- template typename basic_filebuf<_CharT, _Traits>::int_type basic_filebuf<_CharT, _Traits>:: ! _M_really_underflow(bool __bump) { int_type __ret = traits_type::eof(); bool __testin = _M_mode & ios_base::in; *************** namespace std *** 261,283 **** } // Sync internal and external buffers. ! // NB: __testget -> __testput as _M_buf_unified here. ! bool __testget = _M_in_cur && _M_in_beg < _M_in_cur; ! bool __testinit = _M_is_indeterminate(); if (__testget) { ! if (__testout) ! _M_really_overflow(); ! #if _GLIBCPP_AVOID_FSEEK ! else if ((_M_in_cur - _M_in_beg) == 1) ! _M_file->sys_getc(); ! #endif ! else ! _M_file->seekoff(_M_in_cur - _M_in_beg, ! ios_base::cur, ios_base::in); } ! ! if (__testinit || __testget) { const locale __loc = this->getloc(); const __codecvt_type& __cvt = use_facet<__codecvt_type>(__loc); --- 261,278 ---- } // Sync internal and external buffers. ! bool __testput = _M_out_cur && _M_out_beg < _M_out_cur; ! if (__testput && __testout) ! _M_really_overflow(); ! ! bool __testget = _M_in_cur && _M_in_cur < _M_in_end; if (__testget) { ! __ret = traits_type::to_int_type(*_M_in_cur); ! if (__bump) ! _M_in_cur_move(1); } ! else { const locale __loc = this->getloc(); const __codecvt_type& __cvt = use_facet<__codecvt_type>(__loc); *************** namespace std *** 313,332 **** if (0 < __ilen) { _M_set_determinate(__ilen); - if (__testout) - _M_out_cur = _M_in_cur; __ret = traits_type::to_int_type(*_M_in_cur); ! #if _GLIBCPP_AVOID_FSEEK ! if (__elen == 1) ! _M_file->sys_ungetc(*_M_in_cur); ! else { ! #endif ! _M_file->seekoff(-__elen, ios_base::cur, ios_base::in); ! #if _GLIBCPP_AVOID_FSEEK } ! #endif ! } } } _M_last_overflowed = false; --- 308,325 ---- if (0 < __ilen) { _M_set_determinate(__ilen); __ret = traits_type::to_int_type(*_M_in_cur); ! if (__bump) ! _M_in_cur_move(1); ! else if (_M_buf_size == 1) { ! /* If we are synced with stdio, we have to unget the ! character we just read so that the file pointer ! stays the same. */ ! _M_file->sys_ungetc(*_M_in_cur); ! _M_set_indeterminate(); } ! } } } _M_last_overflowed = false; *************** namespace std *** 407,413 **** overflow(int_type __c) { int_type __ret = traits_type::eof(); ! bool __testput = _M_out_cur && _M_out_cur < _M_buf + _M_buf_size; bool __testout = _M_mode & ios_base::out; if (__testout) --- 400,406 ---- overflow(int_type __c) { int_type __ret = traits_type::eof(); ! bool __testput = _M_out_cur && _M_out_cur < _M_out_end; bool __testout = _M_mode & ios_base::out; if (__testout) *************** namespace std *** 418,424 **** _M_out_cur_move(1); __ret = traits_type::not_eof(__c); } ! else __ret = this->_M_really_overflow(__c); } --- 411,417 ---- _M_out_cur_move(1); __ret = traits_type::not_eof(__c); } ! else __ret = this->_M_really_overflow(__c); } *************** namespace std *** 491,497 **** _M_really_overflow(int_type __c) { int_type __ret = traits_type::eof(); ! bool __testput = _M_out_cur && _M_out_beg < _M_out_end; bool __testunbuffered = _M_file && !_M_buf_size; if (__testput || __testunbuffered) --- 484,490 ---- _M_really_overflow(int_type __c) { int_type __ret = traits_type::eof(); ! bool __testput = _M_out_cur && _M_out_beg < _M_out_cur; bool __testunbuffered = _M_file && !_M_buf_size; if (__testput || __testunbuffered) *************** namespace std *** 500,509 **** streamsize __elen = 0; streamsize __plen = 0; // Convert internal buffer to external representation, output. // NB: In the unbuffered case, no internal buffer exists. if (!__testunbuffered) ! _M_convert_to_external(_M_out_beg, _M_out_end - _M_out_beg, __elen, __plen); // Convert pending sequence to external representation, output. --- 493,511 ---- streamsize __elen = 0; streamsize __plen = 0; + // Need to restore current position. The position of the external + // byte sequence (_M_file) corresponds to _M_in_end, and we need + // to move it to _M_out_beg for the write. + if (_M_in_end && _M_in_end != _M_out_beg) + { + off_type __off = _M_out_beg - _M_in_end; + _M_file->seekoff(__off, ios_base::cur); + } + // Convert internal buffer to external representation, output. // NB: In the unbuffered case, no internal buffer exists. if (!__testunbuffered) ! _M_convert_to_external(_M_out_beg, _M_out_cur - _M_out_beg, __elen, __plen); // Convert pending sequence to external representation, output. *************** namespace std *** 547,553 **** _M_buf_size_opt = _M_buf_size = __n; _M_set_indeterminate(); ! // Step 3: Make sure a pback buffer is allocated. _M_allocate_pback_buffer(); } _M_last_overflowed = false; --- 549,555 ---- _M_buf_size_opt = _M_buf_size = __n; _M_set_indeterminate(); ! // Step 3: Make sure a pback buffer is allocated. _M_allocate_pback_buffer(); } _M_last_overflowed = false; *************** namespace std *** 579,585 **** off_type __computed_off = __width * __off; bool __testget = _M_in_cur && _M_in_beg < _M_in_end; ! bool __testput = _M_out_cur && _M_out_beg < _M_out_end; // Sync the internal and external streams. // out if (__testput || _M_last_overflowed) --- 581,587 ---- off_type __computed_off = __width * __off; bool __testget = _M_in_cur && _M_in_beg < _M_in_end; ! bool __testput = _M_out_cur && _M_out_beg < _M_out_cur; // Sync the internal and external streams. // out if (__testput || _M_last_overflowed) *************** namespace std *** 592,598 **** //in // NB: underflow() rewinds the external buffer. else if (__testget && __way == ios_base::cur) ! __computed_off += _M_in_cur - _M_in_beg; __ret = _M_file->seekoff(__computed_off, __way, __mode); _M_set_indeterminate(); --- 594,600 ---- //in // NB: underflow() rewinds the external buffer. else if (__testget && __way == ios_base::cur) ! __computed_off -= _M_in_end - _M_in_cur; __ret = _M_file->seekoff(__computed_off, __way, __mode); _M_set_indeterminate(); *** ./include/std/std_fstream.h.~1~ Thu Apr 11 13:19:05 2002 --- ./include/std/std_fstream.h Thu Apr 18 01:15:27 2002 *************** namespace std *** 141,148 **** // underflow() and uflow() functions are called to get the next // charater from the real input source when the buffer is empty. // Buffered input uses underflow() virtual int_type ! underflow(); virtual int_type pbackfail(int_type __c = _Traits::eof()); --- 141,161 ---- // underflow() and uflow() functions are called to get the next // charater from the real input source when the buffer is empty. // Buffered input uses underflow() + + // The only difference between underflow() and uflow() is that the + // latter bumps _M_in_cur after the read. In the sync_with_stdio + // case, this is important, as we need to unget the read character in + // the underflow() case in order to maintain synchronization. So + // instead of calling underflow() from uflow(), we create a common + // subroutine to do the real work. + int_type + _M_really_underflow(bool __bump); + virtual int_type ! underflow() { return _M_really_underflow(false); } ! ! virtual int_type ! uflow() { return _M_really_underflow(true); } virtual int_type pbackfail(int_type __c = _Traits::eof()); *************** namespace std *** 187,207 **** virtual int sync() { ! bool __testput = _M_out_cur && _M_out_beg < _M_out_end; // Make sure that the internal buffer resyncs its idea of // the file position with the external file. if (__testput && !_M_file->sync()) ! { ! // Need to restore current position. This interpreted as ! // the position of the external byte sequence (_M_file) ! // plus the offset in the current internal buffer ! // (_M_out_beg - _M_out_cur) ! streamoff __cur = _M_file->seekoff(0, ios_base::cur); ! off_type __off = _M_out_cur - _M_out_beg; ! _M_really_overflow(); ! _M_file->seekpos(__cur + __off); ! } _M_last_overflowed = false; return 0; } --- 200,212 ---- virtual int sync() { ! bool __testput = _M_out_cur && _M_out_beg < _M_out_cur; // Make sure that the internal buffer resyncs its idea of // the file position with the external file. if (__testput && !_M_file->sync()) ! _M_really_overflow(); ! _M_last_overflowed = false; return 0; } *************** namespace std *** 239,244 **** --- 244,292 ---- void _M_output_unshift(); + + // These three functions are used to clarify internal buffer + // maintenance. After an overflow, or after a seekoff call that + // started at beg or end, or possibly when the stream becomes + // unbuffered, and a myriad other obscure corner cases, the + // internal buffer does not truly reflect the contents of the + // external buffer. At this point, for whatever reason, it is in + // an indeterminate state. + void + _M_set_indeterminate(void) + { + if (_M_mode & ios_base::in) + this->setg(_M_buf, _M_buf, _M_buf); + if (_M_mode & ios_base::out) + this->setp(_M_buf, _M_buf + _M_buf_size); + } + + void + _M_set_determinate(off_type __off) + { + bool __testin = _M_mode & ios_base::in; + bool __testout = _M_mode & ios_base::out; + if (__testin) + this->setg(_M_buf, _M_buf, _M_buf + __off); + if (__testout) + this->setp(_M_buf, _M_buf + _M_buf_size); + } + + bool + _M_is_indeterminate(void) + { + bool __ret = false; + // Don't return true if unbuffered. + if (_M_buf) + { + __ret = true; + if (_M_mode & ios_base::in) + __ret &= _M_in_beg == _M_in_cur && _M_in_cur == _M_in_end; + if (_M_mode & ios_base::out) + __ret &= _M_out_beg == _M_out_cur; + } + return __ret; + } }; *** ./include/std/std_streambuf.h.~1~ Mon Feb 18 00:38:23 2002 --- ./include/std/std_streambuf.h Thu Apr 18 01:13:03 2002 *************** namespace std *** 184,190 **** bool __testout = _M_out_cur; _M_in_cur += __n; if (__testout && _M_buf_unified) ! _M_out_cur += __n; } // Correctly sets the _M_out_cur pointer, and bumps the --- 184,194 ---- bool __testout = _M_out_cur; _M_in_cur += __n; if (__testout && _M_buf_unified) ! { ! if (_M_out_cur == _M_out_beg) ! _M_out_beg += __n; ! _M_out_cur += __n; ! } } // Correctly sets the _M_out_cur pointer, and bumps the *************** namespace std *** 202,208 **** _M_out_cur += __n; if (__testin && _M_buf_unified) ! _M_in_cur += __n; if (_M_out_cur > _M_out_end) { _M_out_end = _M_out_cur; --- 206,216 ---- _M_out_cur += __n; if (__testin && _M_buf_unified) ! { ! _M_in_cur += __n; ! if (_M_in_cur > _M_in_end) ! _M_in_cur = _M_in_end; ! } if (_M_out_cur > _M_out_end) { _M_out_end = _M_out_cur; *************** namespace std *** 230,277 **** } return __ret; } - - // These three functions are used to clarify internal buffer - // maintenance. After an overflow, or after a seekoff call that - // started at beg or end, or possibly when the stream becomes - // unbuffered, and a myrid other obscure corner cases, the - // internal buffer does not truly reflect the contents of the - // external buffer. At this point, for whatever reason, it is in - // an indeterminate state. - void - _M_set_indeterminate(void) - { - if (_M_mode & ios_base::in) - this->setg(_M_buf, _M_buf, _M_buf); - if (_M_mode & ios_base::out) - this->setp(_M_buf, _M_buf); - } - - void - _M_set_determinate(off_type __off) - { - bool __testin = _M_mode & ios_base::in; - bool __testout = _M_mode & ios_base::out; - if (__testin) - this->setg(_M_buf, _M_buf, _M_buf + __off); - if (__testout) - this->setp(_M_buf, _M_buf + __off); - } - - bool - _M_is_indeterminate(void) - { - bool __ret = false; - // Don't return true if unbuffered. - if (_M_buf) - { - if (_M_mode & ios_base::in) - __ret = _M_in_beg == _M_in_cur && _M_in_cur == _M_in_end; - if (_M_mode & ios_base::out) - __ret = _M_out_beg == _M_out_cur && _M_out_cur == _M_out_end; - } - return __ret; - } public: virtual --- 238,243 ---- *** ./src/ios.cc.~1~ Wed Apr 3 21:19:20 2002 --- ./src/ios.cc Thu Apr 18 02:50:33 2002 *************** namespace std *** 150,163 **** int __out_bufsize = __sync ? 0 : static_cast(BUFSIZ); int __in_bufsize = __sync ? 1 : static_cast(BUFSIZ); - #if _GLIBCPP_AVOID_FSEEK - // Platforms that prefer to avoid fseek() calls on streams only - // get their desire when the C++-layer input buffer size is 1. - // This hack hurts performance but keeps correctness across - // all types of streams that might be attached to (e.g.) cin. - __in_bufsize = 1; - #endif - // NB: The file globals.cc creates the four standard files // with NULL buffers. At this point, we swap out the dummy NULL // [io]stream objects and buffers with the real deal. --- 150,155 ---- *** ./testsuite/27_io/filebuf_virtuals.cc.~1~ Mon Jan 28 16:51:59 2002 --- ./testsuite/27_io/filebuf_virtuals.cc Thu Apr 18 02:29:12 2002 *************** void test05() *** 443,449 **** --- 443,451 ---- c3 = fb_03.sputc('\n'); strmsz_1 = fb_03.sputn("because because because. . .", 28); VERIFY( strmsz_1 == 28 ); + // Defect? retval of sungetc is not necessarily the character c1 = fb_03.sungetc(); + c1 = fb_03.sgetc(); fb_03.pubsync(); c3 = fb_03.sgetc(); VERIFY( c1 == c3 ); --=-=-=--