public inbox for gcc-prs@sourceware.org
help / color / mirror / Atom feed
* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-16  5:46 Jason Merrill
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Merrill @ 2002-04-16  5:46 UTC (permalink / raw)
  To: jason; +Cc: gcc-prs

The following reply was made to PR libstdc++/4150; it has been noted by GNATS.

From: Jason Merrill <jason@redhat.com>
To: rittle@labs.mot.com
Cc: libstdc++@gcc.gnu.org, rth@redhat.com, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Date: Tue, 16 Apr 2002 11:02:46 +0100

 >>>>> "Loren" == Loren James Rittle <rittle@latour.rsch.comm.mot.com> writes:
 
 > Under the current architecture (which I have only ever tweaked for
 > performance, compliance and QoS of interactive cases not dictated by
 > standard), the whole reason for the backup to the point before the
 > read is that until a character is actually consumed by the
 > higher-layer of libstdc++-v3 IO, the lower-layer C stdio file-pointer
 > must not appear to move forward w.r.t. other C stdio.  Granted it
 > seems less-than-ideal to always use that algorithm even when not
 > sync'd to stdio.
 
 Then I suppose we should use a buffer size of 0.
 
 > It is why I told RTH the other day in an e-mail that I thought some
 > basic re-architecture would be required to solve all performance
 > issues related to outstanding libstdc++-v3 PRs.  When the higher-layer
 > knows it will consume more than X characters in sync'd IO cases, it
 > should be able to pull >1&<X characters from the lower-layer (current
 > architecture limits us to pulls of 1 character).  Or, if the
 > higher-layer knows it is looking for a newline character (another very
 > common case), it should be able to use the C stdio optimized routine
 > to pull >1 character from the lower layer (bounded only by newline or
 > the provided buffer size, aka the fgets function call).  Under a
 > re-architecture, it seems to me that only when the higher-layer of
 > libstdc++-v3 is in a scanning mode not directly supported by libc that
 > it must conditionally pull 1 character at a time through the layer
 > when sync'd to stdio
 
 Makes sense to me.
 
 > Now, I actually have no idea if the abstraction layer dictated by the
 > standard even allows these optimizations.  I looked at this situation >6
 > months ago and I actually think not.
 
 Why not?  It seems to me that the optimizations you suggest would conform
 fine to the spec for xs{put,get}n.  The spec for basic_streambuf::xsgetn
 talks about implementation "as if" by repeated calls to sbumpc, but then
 also says that derived classes can provide more efficient implementations.
 
 To optimize getline, we'd need to introduce a virtual helper function in
 streambuf, but I don't see any reason why that would violate the standard.
 
 > With your patch (plus the removal of the related _GLIBCPP_AVOID_FSEEK
 > region in src/ios.cc), I see one automatic regression here:
 
 > assertion "(off_2 == (off_1 + 2 + 1 + 1))" failed: file
 > "[...]/27_io/filebuf_virtuals.cc", line 428
 > FAIL: 27_io/filebuf_virtuals.cc execution test
 
 Yep, I'm aware of that.  I knew that the patch I posted was incomplete; it
 was meant more as a concrete illustration of my proposal.  I'm still
 working on it.
 
 > [1] I don't know if this is widely known information thus I want to
 >     make sure you tested my patch to enable _GLIBCPP_AVOID_FSEEK on
 >     Linux properly.  If you bootstrap all of gcc, then when
 >     libstdc++-v3 is built, it will be built with flags set by
 >     top-level Makefile (nominally, `-O2 -g').  If you later run make
 >     in libstdc++-v3, it will rebuild (some/all?) files with `-O0 -g'
 >     (except stuff built in libmath which appears to get top-level
 >     flags)...  IMHO, the only way to test performance patches in
 >     libstdc++-v3, is to `rm -rf <target>/libstdc++-v3' and rerun make
 >     at top-level.  This way libstdc++-v3 is built exactly as when it
 >     is bootstrapped.
 
 I'm aware of the difference; in all cases, I was building without
 optimization, on the assumption that the calls to the C layer would be
 where we were spending our time.  So I was comparing apples to apples, but
 perhaps not the most useful apples.  :)
 
 Jason


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-23  5:23 jason
  0 siblings, 0 replies; 12+ messages in thread
From: jason @ 2002-04-23  5:23 UTC (permalink / raw)
  To: gcc-bugs, gcc-prs, jason, leitner

Synopsis: catastrophic performance decrease in C++ code

State-Changed-From-To: analyzed->closed
State-Changed-By: jason
State-Changed-When: Tue Apr 23 05:23:24 2002
State-Changed-Why:
    Fixed for 3.1.

http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=4150


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-18  4:06 Jason Merrill
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Merrill @ 2002-04-18  4:06 UTC (permalink / raw)
  To: jason; +Cc: gcc-prs

The following reply was made to PR libstdc++/4150; it has been noted by GNATS.

From: Jason Merrill <jason@redhat.com>
To: bkoz@redhat.com
Cc: libstdc++@gcc.gnu.org, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Date: Thu, 18 Apr 2002 12:00:19 +0100

 >>>>> "Jason" == Jason Merrill <jason@redhat.com> writes:
 
 >>> Tested i686-pc-linux-gnu, no regressions.  Any objections?
 
 >> Mainline and branch have diverged a bit right now. Did you test with
 >> branch or mainline?
 
 > trunk.  I'll test on the branch, too.
 
 No regressions.
 
 Jason


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-18  3:06 Jason Merrill
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Merrill @ 2002-04-18  3:06 UTC (permalink / raw)
  To: jason; +Cc: gcc-prs

The following reply was made to PR libstdc++/4150; it has been noted by GNATS.

From: Jason Merrill <jason@redhat.com>
To: bkoz@redhat.com
Cc: libstdc++@gcc.gnu.org, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Date: Thu, 18 Apr 2002 11:01:54 +0100

 --=-=-=
 
 >>>>> "Benjamin" == Benjamin Kosnik <bkoz@redhat.com> writes:
 
 >> A problem with the current implementation of this is that if we do a
 >> read on an input/output filebuf, we end up writing the contents of the
 >> buffer back out to the file, even if we've never requested a write. 
 >> Oops.
 
 > Hmmm.
 
 > Please write a testcase that demonstrates this, and add it to
 > 27_io/filebuf_members.cc
 
 I'm not sure how to check for this without strace...thoughts?
 
 >> I feel like I know my way around streambufs a lot better now.
 
 > Great. So, how do you like debugging C++ with the current tools?
 > Painful, huh? Does it make you feel psychic when you fix things?
 
 > Please let me know if you have any special kung-fu to pass on.
 
 1) When I started adding -static to my links, the random gdb crashes went
    away.
 2) Sometimes 'step' fails to step into a function, but 'stepi' always seems
    to work; once I'm into the function, I can start using 'step' again.
 3) The patch below fixes inspection of class contents.
 
 >> Tested i686-pc-linux-gnu, no regressions.  Any objections?
 
 > Mainline and branch have diverged a bit right now. Did you test with
 > branch or mainline?
 
 trunk.  I'll test on the branch, too.
 
 > I need to get solaris back in shape on mainline: bsd's, hpux, aix,
 > cygwin are all back in shape now, but solaris is still kind of dicy. I'm
 > going to ask you to hold off, at least on the branch, till I have the
 > libstdc++/4164 patch integrated. Also, you'll need to do the testsuite
 > entry before you can check in. Okay?
 
 I can hold off a bit; let me know.
 
 > Does this mean that the FSEEK hacks in config/os/*/bits/os_defines.h can
 > be removed, since this define is no longer used?
 
 Yep.
 
 Jason
 
 
 --=-=-=
 Content-Type: text/x-patch
 Content-Disposition: inline
 
 *** gnu-v3-abi.c.~1~	Sun Mar 17 17:10:01 2002
 --- gnu-v3-abi.c	Sun Apr 14 22:59:39 2002
 *************** gnuv3_rtti_type (struct value *value,
 *** 241,262 ****
     vtable_symbol_name = SYMBOL_DEMANGLED_NAME (vtable_symbol);
     if (vtable_symbol_name == NULL
         || strncmp (vtable_symbol_name, "vtable for ", 11))
 !     error ("can't find linker symbol for virtual table for `%s' value",
 !            TYPE_NAME (value_type));
     class_name = vtable_symbol_name + 11;
   
     /* Try to look up the class name as a type name.  */
     class_symbol = lookup_symbol (class_name, 0, STRUCT_NAMESPACE, 0, 0);
     if (! class_symbol)
 !     error ("can't find class named `%s', as given by C++ RTTI", class_name);
   
     /* Make sure the type symbol is sane.  (An earlier version of this
        code would find constructor functions, who have the same name as
        the class.)  */
     if (SYMBOL_CLASS (class_symbol) != LOC_TYPEDEF
         || TYPE_CODE (SYMBOL_TYPE (class_symbol)) != TYPE_CODE_CLASS)
 !     error ("C++ RTTI gives a class name of `%s', but that isn't a type name",
 !            class_name);
   
     /* This is the object's run-time type!  */
     run_time_type = SYMBOL_TYPE (class_symbol);
 --- 241,273 ----
     vtable_symbol_name = SYMBOL_DEMANGLED_NAME (vtable_symbol);
     if (vtable_symbol_name == NULL
         || strncmp (vtable_symbol_name, "vtable for ", 11))
 !     {
 !       warning ("can't find linker symbol for virtual table for `%s' value",
 ! 	       TYPE_NAME (value_type));
 !       if (vtable_symbol_name)
 ! 	warning ("  found `%s' instead", vtable_symbol_name);
 !       return NULL;
 !     }
     class_name = vtable_symbol_name + 11;
   
     /* Try to look up the class name as a type name.  */
     class_symbol = lookup_symbol (class_name, 0, STRUCT_NAMESPACE, 0, 0);
     if (! class_symbol)
 !     {
 !       warning ("can't find class named `%s', as given by C++ RTTI", class_name);
 !       return NULL;
 !     }
   
     /* Make sure the type symbol is sane.  (An earlier version of this
        code would find constructor functions, who have the same name as
        the class.)  */
     if (SYMBOL_CLASS (class_symbol) != LOC_TYPEDEF
         || TYPE_CODE (SYMBOL_TYPE (class_symbol)) != TYPE_CODE_CLASS)
 !     {
 !       warning ("C++ RTTI gives a class name of `%s', but that isn't a type name",
 ! 	       class_name);
 !       return NULL;
 !     }
   
     /* This is the object's run-time type!  */
     run_time_type = SYMBOL_TYPE (class_symbol);
 
 --=-=-=--


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-17 21:56 Benjamin Kosnik
  0 siblings, 0 replies; 12+ messages in thread
From: Benjamin Kosnik @ 2002-04-17 21:56 UTC (permalink / raw)
  To: jason; +Cc: gcc-prs

The following reply was made to PR libstdc++/4150; it has been noted by GNATS.

From: Benjamin Kosnik <bkoz@redhat.com>
To: Jason Merrill <jason@redhat.com>
Cc: libstdc++@gcc.gnu.org, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Date: Wed, 17 Apr 2002 21:46:14 -0700

 First of all, thanks. I was hoping somebody else would touch this code.
 
 > A problem with the current implementation of this is that if we do a
 > read on an input/output filebuf, we end up writing the contents of the
 > buffer back out to the file, even if we've never requested a write. 
 > Oops.
 
 Hmmm.
 
 Please write a testcase that demonstrates this, and add it to
 27_io/filebuf_members.cc
 
 I know that file is kind of long and unwieldy at this point.
 
 I'm trying to have the testsuite actually work and prevent lossage.... I
 know it's more work for you, but it'll be less work in the long run for
 all concerned if all regressions are in the testsuite.
 
 > In v3, this problem is handled, basically, by contaminating streambuf
 > with information about filebuf semantics.  We need the streambuf
 > member functions to adjust _M_out_cur with _M_in_cur, so we add a flag
 > (_M_buf_unified) that says so, and handle the logic in the
 > _M_*_cur_move functions.  Similarly, for the benefit of stringbuf, we
 > pretend that we can just bump _M_out_end if we run up against it and
 > we happen to know that there's still room in the buffer.
 > 
 > It seems unfortunate to me that we need special hacks in streambuf to
 > support both of the standard derived streambufs.  Both seem to be for
 > optimization; the filebuf hack to allow reading and writing on the
 > same buffer, and the stringbuf hack to avoid having to call overflow()
 > through the vtable for each character we want to add to the end of the
 > string.  Am I right?
 
 Right. The design is actually for stringbufs.
 
 > logauswerter.C times:
 >           synced   not synced
 > 2.96       0:19       0:19
 > pre-patch  1:09       0:14
 > post-patch 0:26       0:13
 > 
 > Interesting that v3 is faster than v2 when not synced with stdio...
 
 Yeah. I noticed this with bench++ as well. There was some commentary on
 this a bit ago, but I cannot place it.
 
 The thing that sucks are the narrow/wide stream objects and
 sync_with_stdio. 
 
 > I feel like I know my way around streambufs a lot better now.
 
 Great. So, how do you like debugging C++ with the current tools?
 Painful, huh? Does it make you feel psychic when you fix things?
 
 ;)
 
 Please let me know if you have any special kung-fu to pass on.
 
 > Tested i686-pc-linux-gnu, no regressions.  Any objections?
 
 Mainline and branch have diverged a bit right now. Did you test with
 branch or mainline?
 
 I need to get solaris back in shape on mainline: bsd's, hpux, aix,
 cygwin are all back in shape now, but solaris is still kind of dicy. I'm
 going to ask you to hold off, at least on the branch, till I have the
 libstdc++/4164 patch integrated. Also, you'll need to do the testsuite
 entry before you can check in. Okay?
 
 Does this mean that the FSEEK hacks in config/os/*/bits/os_defines.h can
 be removed, since this define is no longer used?
 
 -benjamin


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-17 19:26 Jason Merrill
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Merrill @ 2002-04-17 19:26 UTC (permalink / raw)
  To: jason; +Cc: gcc-prs

The following reply was made to PR libstdc++/4150; it has been noted by GNATS.

From: Jason Merrill <jason@redhat.com>
To: libstdc++@gcc.gnu.org
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Date: Thu, 18 Apr 2002 03:18:17 +0100

 --=-=-=
 
 >>>>> "Jason" == Jason Merrill <jason@redhat.com> writes:
 
 > Which is what libio does, and which would seem to be the only way to avoid
 > this sort of problem.  Can anyone enlighten me as to the design
 > goals/decisions of the current implementation?
 
 The current implementation seems to be trying to allow interleaved reads
 and writes on the same buffer, whereas the libio implementation requires a
 flush when switching modes.  Was this difference a deliberate design
 decision?  I'm a bit skeptical of it's usefulness; how often do people
 really read a few characters, write a few characters, and then read a few
 more characters without an intervening seek?
 
 A problem with the current implementation of this is that if we do a read
 on an input/output filebuf, we end up writing the contents of the buffer
 back out to the file, even if we've never requested a write.  Oops.
 
 It would be possible to avoid this by changing how we manage the various
 pointers into the buffer; when reading, if _M_out_beg == _M_out_cur we can
 bump them together.  If then we only write out the characters between them,
 there won't be anything to write unless we actually requested a write.
 
 I was thinking that we couldn't change this, because a lot of the
 adjustment happens in non-virtual functions like sbumpc.  But then of
 course the semantics specified in the standard (i.e. sbumpc just advances
 the read pointer) don't support proper filebuf semantics on the existing
 scheme either; if the read pointer is bumped, and the write pointer is also
 active, it needs to be bumped as well.  In libio, this problem is avoided
 by the mode switch; when it switches from read to write, the write pointer
 is copied from the read pointer.
 
 In v3, this problem is handled, basically, by contaminating streambuf with
 information about filebuf semantics.  We need the streambuf member
 functions to adjust _M_out_cur with _M_in_cur, so we add a flag
 (_M_buf_unified) that says so, and handle the logic in the _M_*_cur_move
 functions.  Similarly, for the benefit of stringbuf, we pretend that we can
 just bump _M_out_end if we run up against it and we happen to know that
 there's still room in the buffer.
 
 It seems unfortunate to me that we need special hacks in streambuf to
 support both of the standard derived streambufs.  Both seem to be for
 optimization; the filebuf hack to allow reading and writing on the same
 buffer, and the stringbuf hack to avoid having to call overflow() through
 the vtable for each character we want to add to the end of the string.  Am
 I right?
 
 Anyway, here's a patch to clean up filebuf somewhat.  It involves these
 design changes:
 
   1) The external file pointer always corresponds to _M_in_end.
      Previously, it corresponded to _M_in_beg (thus the seeks).
   2) _M_out_end always points to the end of the buffer.  Previously, it
      would point to the end of the valid contents, and we would rely on
      _M_out_cur_move to push it forward if there was still room in the
      buffer, as it still does for stringbufs.  We don't need to do this
      for filebufs, as _M_out_end doesn't have any special meaning like it
      does for stringbufs.
   3) _M_out_beg always points to the beginning of the write area.  If we
      haven't written anything to the buffer, _M_out_cur == _M_out_beg.
      Previously the area between the two could consist entirely of
      characters that were read.
   4) We now override uflow() in filebuf, so we don't have to unget the
      character read by underflow() in the sync_with_stdio case.
 
 logauswerter.C times:
           synced   not synced
 2.96       0:19       0:19
 pre-patch  1:09       0:14
 post-patch 0:26       0:13
 
 Interesting that v3 is faster than v2 when not synced with stdio...
 I feel like I know my way around streambufs a lot better now.
 
 Tested i686-pc-linux-gnu, no regressions.  Any objections?
 
 Jason
 
 2002-04-18  Jason Merrill  <jason@redhat.com>
 
 	PR libstdc++/4150
 	* include/std/std_streambuf.h (basic_streambuf::_M_in_cur_move):
 	Also bump _M_out_beg if _M_buf_unified.
 	(basic_streambuf::_M_out_cur_move): Don't bump _M_in_cur past 
 	_M_in_end.
 	(basic_streambuf::_M_set_indeterminate): Move to filebuf.
 	(basic_streambuf::_M_set_determinate): Likewise.
 	(basic_streambuf::_M_is_indeterminate): Likewise.
 	* include/bits/std_fstream.h (basic_filebuf::_M_really_underflow):
 	New non-static member function.
 	(basic_filebuf::_M_underflow, _M_uflow): Call it.
 	(basic_filebuf::sync): seekpos will handle all the positioning.
 	(basic_filebuf::_M_set_indeterminate): Move here from streambuf.
 	Always set _M_out_end to the end of the buffer.
 	(basic_filebuf::_M_set_determinate): Likewise.
 	(basic_filebuf::_M_is_indeterminate): Likewise.
 	* include/bits/fstream.tcc (basic_filebuf::close): Stuff to
 	write is from _M_out_beg to _M_out_cur.
 	(basic_filebuf::overflow): Likewise.
 	(basic_filebuf::_M_really_overflow): Likewise.
 	Seek back to _M_out_beg if necessary.
 	(basic_filebuf::seekoff): Likewise.
 	(basic_filebuf::_M_really_underflow): Generalization of old 
 	underflow().  Don't seek back to _M_in_beg.
 	* testsuite/27_io/filebuf_virtuals.cc (test05): Don't overspecify 
 	ungetc test.
 
 
 --=-=-=
 Content-Type: text/x-patch
 Content-Disposition: inline
 
 *** ./include/bits/fstream.tcc.~1~	Fri Apr 12 11:28:30 2002
 --- ./include/bits/fstream.tcc	Thu Apr 18 02:16:12 2002
 *************** namespace std
 *** 183,189 ****
         if (this->is_open())
   	{
   	  const int_type __eof = traits_type::eof();
 ! 	  bool __testput = _M_out_cur && _M_out_beg < _M_out_end;
   	  if (__testput && _M_really_overflow(__eof) == __eof)
   	    return __ret;
   
 --- 183,189 ----
         if (this->is_open())
   	{
   	  const int_type __eof = traits_type::eof();
 ! 	  bool __testput = _M_out_cur && _M_out_beg < _M_out_cur;
   	  if (__testput && _M_really_overflow(__eof) == __eof)
   	    return __ret;
   
 *************** namespace std
 *** 242,248 ****
     template<typename _CharT, typename _Traits>
       typename basic_filebuf<_CharT, _Traits>::int_type 
       basic_filebuf<_CharT, _Traits>::
 !     underflow()
       {
         int_type __ret = traits_type::eof();
         bool __testin = _M_mode & ios_base::in;
 --- 242,248 ----
     template<typename _CharT, typename _Traits>
       typename basic_filebuf<_CharT, _Traits>::int_type 
       basic_filebuf<_CharT, _Traits>::
 !     _M_really_underflow(bool __bump)
       {
         int_type __ret = traits_type::eof();
         bool __testin = _M_mode & ios_base::in;
 *************** namespace std
 *** 261,283 ****
   	    }
   
   	  // Sync internal and external buffers.
 ! 	  // NB: __testget -> __testput as _M_buf_unified here.
 ! 	  bool __testget = _M_in_cur && _M_in_beg < _M_in_cur;
 ! 	  bool __testinit = _M_is_indeterminate();
   	  if (__testget)
   	    {
 ! 	      if (__testout)
 ! 		_M_really_overflow();
 ! #if _GLIBCPP_AVOID_FSEEK
 ! 	      else if ((_M_in_cur - _M_in_beg) == 1)
 ! 		_M_file->sys_getc();
 ! #endif
 ! 	      else 
 ! 		_M_file->seekoff(_M_in_cur - _M_in_beg, 
 ! 				 ios_base::cur, ios_base::in);
   	    }
 ! 
 ! 	  if (__testinit || __testget)
   	    {
   	      const locale __loc = this->getloc();
   	      const __codecvt_type& __cvt = use_facet<__codecvt_type>(__loc); 
 --- 261,278 ----
   	    }
   
   	  // Sync internal and external buffers.
 ! 	  bool __testput = _M_out_cur && _M_out_beg < _M_out_cur;
 ! 	  if (__testput && __testout)
 ! 	    _M_really_overflow();
 ! 
 ! 	  bool __testget = _M_in_cur && _M_in_cur < _M_in_end;
   	  if (__testget)
   	    {
 ! 	      __ret = traits_type::to_int_type(*_M_in_cur);
 ! 	      if (__bump)
 ! 		_M_in_cur_move(1);
   	    }
 ! 	  else
   	    {
   	      const locale __loc = this->getloc();
   	      const __codecvt_type& __cvt = use_facet<__codecvt_type>(__loc); 
 *************** namespace std
 *** 313,332 ****
   	      if (0 < __ilen)
   		{
   		  _M_set_determinate(__ilen);
 - 		  if (__testout)
 - 		    _M_out_cur = _M_in_cur;
   		  __ret = traits_type::to_int_type(*_M_in_cur);
 ! #if _GLIBCPP_AVOID_FSEEK
 ! 		  if (__elen == 1)
 ! 		    _M_file->sys_ungetc(*_M_in_cur);
 ! 		  else
   		    {
 ! #endif
 ! 		      _M_file->seekoff(-__elen, ios_base::cur, ios_base::in);
 ! #if _GLIBCPP_AVOID_FSEEK
   		    }
 ! #endif
 ! 		}	   
   	    }
   	}
         _M_last_overflowed = false;	
 --- 308,325 ----
   	      if (0 < __ilen)
   		{
   		  _M_set_determinate(__ilen);
   		  __ret = traits_type::to_int_type(*_M_in_cur);
 ! 		  if (__bump)
 ! 		    _M_in_cur_move(1);
 ! 		  else if (_M_buf_size == 1)
   		    {
 ! 		      /* If we are synced with stdio, we have to unget the
 ! 		         character we just read so that the file pointer
 ! 		         stays the same.  */
 ! 		      _M_file->sys_ungetc(*_M_in_cur);
 ! 		      _M_set_indeterminate();
   		    }
 ! 		}
   	    }
   	}
         _M_last_overflowed = false;	
 *************** namespace std
 *** 407,413 ****
       overflow(int_type __c)
       {
         int_type __ret = traits_type::eof();
 !       bool __testput = _M_out_cur && _M_out_cur < _M_buf + _M_buf_size;
         bool __testout = _M_mode & ios_base::out;
         
         if (__testout)
 --- 400,406 ----
       overflow(int_type __c)
       {
         int_type __ret = traits_type::eof();
 !       bool __testput = _M_out_cur && _M_out_cur < _M_out_end;
         bool __testout = _M_mode & ios_base::out;
         
         if (__testout)
 *************** namespace std
 *** 418,424 ****
   	      _M_out_cur_move(1);
   	      __ret = traits_type::not_eof(__c);
   	    }
 ! 	  else 
   	    __ret = this->_M_really_overflow(__c);
   	}
   
 --- 411,417 ----
   	      _M_out_cur_move(1);
   	      __ret = traits_type::not_eof(__c);
   	    }
 ! 	  else
   	    __ret = this->_M_really_overflow(__c);
   	}
   
 *************** namespace std
 *** 491,497 ****
       _M_really_overflow(int_type __c)
       {
         int_type __ret = traits_type::eof();
 !       bool __testput = _M_out_cur && _M_out_beg < _M_out_end;
         bool __testunbuffered = _M_file && !_M_buf_size;
   
         if (__testput || __testunbuffered)
 --- 484,490 ----
       _M_really_overflow(int_type __c)
       {
         int_type __ret = traits_type::eof();
 !       bool __testput = _M_out_cur && _M_out_beg < _M_out_cur;
         bool __testunbuffered = _M_file && !_M_buf_size;
   
         if (__testput || __testunbuffered)
 *************** namespace std
 *** 500,509 ****
   	  streamsize __elen = 0;
   	  streamsize __plen = 0;
   
   	  // Convert internal buffer to external representation, output.
   	  // NB: In the unbuffered case, no internal buffer exists. 
   	  if (!__testunbuffered)
 ! 	    _M_convert_to_external(_M_out_beg,  _M_out_end - _M_out_beg, 
   				   __elen, __plen);
   
   	  // Convert pending sequence to external representation, output.
 --- 493,511 ----
   	  streamsize __elen = 0;
   	  streamsize __plen = 0;
   
 + 	  // Need to restore current position. The position of the external
 + 	  // byte sequence (_M_file) corresponds to _M_in_end, and we need
 + 	  // to move it to _M_out_beg for the write.
 + 	  if (_M_in_end && _M_in_end != _M_out_beg)
 + 	    {
 + 	      off_type __off = _M_out_beg - _M_in_end;
 + 	      _M_file->seekoff(__off, ios_base::cur);
 + 	    }
 + 
   	  // Convert internal buffer to external representation, output.
   	  // NB: In the unbuffered case, no internal buffer exists. 
   	  if (!__testunbuffered)
 ! 	    _M_convert_to_external(_M_out_beg,  _M_out_cur - _M_out_beg, 
   				   __elen, __plen);
   
   	  // Convert pending sequence to external representation, output.
 *************** namespace std
 *** 547,553 ****
   	  _M_buf_size_opt = _M_buf_size = __n;
   	  _M_set_indeterminate();
   	  
 ! 	// Step 3: Make sure a pback buffer is allocated.
   	  _M_allocate_pback_buffer();
   	}
         _M_last_overflowed = false;	
 --- 549,555 ----
   	  _M_buf_size_opt = _M_buf_size = __n;
   	  _M_set_indeterminate();
   	  
 ! 	  // Step 3: Make sure a pback buffer is allocated.
   	  _M_allocate_pback_buffer();
   	}
         _M_last_overflowed = false;	
 *************** namespace std
 *** 579,585 ****
   	      off_type __computed_off = __width * __off;
   	      
   	      bool __testget = _M_in_cur && _M_in_beg < _M_in_end;
 ! 	      bool __testput = _M_out_cur && _M_out_beg < _M_out_end;
   	      // Sync the internal and external streams.
   	      // out
   	      if (__testput || _M_last_overflowed)
 --- 581,587 ----
   	      off_type __computed_off = __width * __off;
   	      
   	      bool __testget = _M_in_cur && _M_in_beg < _M_in_end;
 ! 	      bool __testput = _M_out_cur && _M_out_beg < _M_out_cur;
   	      // Sync the internal and external streams.
   	      // out
   	      if (__testput || _M_last_overflowed)
 *************** namespace std
 *** 592,598 ****
   	      //in
   	      // NB: underflow() rewinds the external buffer.
   	      else if (__testget && __way == ios_base::cur)
 ! 		__computed_off += _M_in_cur - _M_in_beg;
   	  
   	      __ret = _M_file->seekoff(__computed_off, __way, __mode);
   	      _M_set_indeterminate();
 --- 594,600 ----
   	      //in
   	      // NB: underflow() rewinds the external buffer.
   	      else if (__testget && __way == ios_base::cur)
 ! 		__computed_off -= _M_in_end - _M_in_cur;
   	  
   	      __ret = _M_file->seekoff(__computed_off, __way, __mode);
   	      _M_set_indeterminate();
 *** ./include/std/std_fstream.h.~1~	Thu Apr 11 13:19:05 2002
 --- ./include/std/std_fstream.h	Thu Apr 18 01:15:27 2002
 *************** namespace std
 *** 141,148 ****
         // underflow() and uflow() functions are called to get the next
         // charater from the real input source when the buffer is empty.
         // Buffered input uses underflow()
         virtual int_type
 !       underflow();
   
         virtual int_type
         pbackfail(int_type __c = _Traits::eof());
 --- 141,161 ----
         // underflow() and uflow() functions are called to get the next
         // charater from the real input source when the buffer is empty.
         // Buffered input uses underflow()
 + 
 +       // The only difference between underflow() and uflow() is that the
 +       // latter bumps _M_in_cur after the read.  In the sync_with_stdio
 +       // case, this is important, as we need to unget the read character in
 +       // the underflow() case in order to maintain synchronization.  So
 +       // instead of calling underflow() from uflow(), we create a common
 +       // subroutine to do the real work.
 +       int_type
 +       _M_really_underflow(bool __bump);
 + 
         virtual int_type
 !       underflow() { return _M_really_underflow(false); }
 ! 
 !       virtual int_type
 !       uflow() { return _M_really_underflow(true); }
   
         virtual int_type
         pbackfail(int_type __c = _Traits::eof());
 *************** namespace std
 *** 187,207 ****
         virtual int
         sync()
         {
 ! 	bool __testput = _M_out_cur && _M_out_beg < _M_out_end;
   
   	// Make sure that the internal buffer resyncs its idea of
   	// the file position with the external file.
   	if (__testput && !_M_file->sync())
 ! 	  {
 ! 	    // Need to restore current position. This interpreted as
 ! 	    // the position of the external byte sequence (_M_file)
 ! 	    // plus the offset in the current internal buffer
 ! 	    // (_M_out_beg - _M_out_cur)
 ! 	    streamoff __cur = _M_file->seekoff(0, ios_base::cur);
 ! 	    off_type __off = _M_out_cur - _M_out_beg;
 ! 	    _M_really_overflow();
 ! 	    _M_file->seekpos(__cur + __off);
 ! 	  }
   	_M_last_overflowed = false;
   	return 0;
         }
 --- 200,212 ----
         virtual int
         sync()
         {
 ! 	bool __testput = _M_out_cur && _M_out_beg < _M_out_cur;
   
   	// Make sure that the internal buffer resyncs its idea of
   	// the file position with the external file.
   	if (__testput && !_M_file->sync())
 ! 	  _M_really_overflow();
 ! 
   	_M_last_overflowed = false;
   	return 0;
         }
 *************** namespace std
 *** 239,244 ****
 --- 244,292 ----
   
         void
         _M_output_unshift();
 + 
 +       // These three functions are used to clarify internal buffer
 +       // maintenance. After an overflow, or after a seekoff call that
 +       // started at beg or end, or possibly when the stream becomes
 +       // unbuffered, and a myriad other obscure corner cases, the
 +       // internal buffer does not truly reflect the contents of the
 +       // external buffer. At this point, for whatever reason, it is in
 +       // an indeterminate state.
 +       void
 +       _M_set_indeterminate(void)
 +       {
 + 	if (_M_mode & ios_base::in)
 + 	  this->setg(_M_buf, _M_buf, _M_buf);
 + 	if (_M_mode & ios_base::out)
 + 	  this->setp(_M_buf, _M_buf + _M_buf_size);
 +       }
 + 
 +       void
 +       _M_set_determinate(off_type __off)
 +       {
 + 	bool __testin = _M_mode & ios_base::in;
 + 	bool __testout = _M_mode & ios_base::out;
 + 	if (__testin)
 + 	  this->setg(_M_buf, _M_buf, _M_buf + __off);
 + 	if (__testout)
 + 	  this->setp(_M_buf, _M_buf + _M_buf_size);
 +       }
 + 
 +       bool
 +       _M_is_indeterminate(void)
 +       {
 + 	bool __ret = false;
 + 	// Don't return true if unbuffered.
 + 	if (_M_buf)
 + 	  {
 + 	    __ret = true;
 + 	    if (_M_mode & ios_base::in)
 + 	      __ret &= _M_in_beg == _M_in_cur && _M_in_cur == _M_in_end;
 + 	    if (_M_mode & ios_base::out)
 + 	      __ret &= _M_out_beg == _M_out_cur;
 + 	  }
 + 	return __ret;
 +       }
       };
   
   
 *** ./include/std/std_streambuf.h.~1~	Mon Feb 18 00:38:23 2002
 --- ./include/std/std_streambuf.h	Thu Apr 18 01:13:03 2002
 *************** namespace std
 *** 184,190 ****
   	bool __testout = _M_out_cur;
   	_M_in_cur += __n;
   	if (__testout && _M_buf_unified)
 ! 	  _M_out_cur += __n;
         }
   
         // Correctly sets the _M_out_cur pointer, and bumps the
 --- 184,194 ----
   	bool __testout = _M_out_cur;
   	_M_in_cur += __n;
   	if (__testout && _M_buf_unified)
 ! 	  {
 ! 	    if (_M_out_cur == _M_out_beg)
 ! 	      _M_out_beg += __n;
 ! 	    _M_out_cur += __n;
 ! 	  }
         }
   
         // Correctly sets the _M_out_cur pointer, and bumps the
 *************** namespace std
 *** 202,208 ****
   
   	_M_out_cur += __n;
   	if (__testin && _M_buf_unified)
 ! 	  _M_in_cur += __n;
   	if (_M_out_cur > _M_out_end)
   	  {
   	    _M_out_end = _M_out_cur;
 --- 206,216 ----
   
   	_M_out_cur += __n;
   	if (__testin && _M_buf_unified)
 ! 	  {
 ! 	    _M_in_cur += __n;
 ! 	    if (_M_in_cur > _M_in_end)
 ! 	      _M_in_cur = _M_in_end;
 ! 	  }
   	if (_M_out_cur > _M_out_end)
   	  {
   	    _M_out_end = _M_out_cur;
 *************** namespace std
 *** 230,277 ****
   	  }
   	return __ret;
         }
 - 
 -       // These three functions are used to clarify internal buffer
 -       // maintenance. After an overflow, or after a seekoff call that
 -       // started at beg or end, or possibly when the stream becomes
 -       // unbuffered, and a myrid other obscure corner cases, the
 -       // internal buffer does not truly reflect the contents of the
 -       // external buffer. At this point, for whatever reason, it is in
 -       // an indeterminate state.
 -       void
 -       _M_set_indeterminate(void)
 -       {
 - 	if (_M_mode & ios_base::in)
 - 	  this->setg(_M_buf, _M_buf, _M_buf);
 - 	if (_M_mode & ios_base::out)
 - 	  this->setp(_M_buf, _M_buf);
 -       }
 - 
 -       void
 -       _M_set_determinate(off_type __off)
 -       {
 - 	bool __testin = _M_mode & ios_base::in;
 - 	bool __testout = _M_mode & ios_base::out;
 - 	if (__testin)
 - 	  this->setg(_M_buf, _M_buf, _M_buf + __off);
 - 	if (__testout)
 - 	  this->setp(_M_buf, _M_buf + __off);
 -       }
 - 
 -       bool
 -       _M_is_indeterminate(void)
 -       { 
 - 	bool __ret = false;
 - 	// Don't return true if unbuffered.
 - 	if (_M_buf)
 - 	  {
 - 	    if (_M_mode & ios_base::in)
 - 	      __ret = _M_in_beg == _M_in_cur && _M_in_cur == _M_in_end;
 - 	    if (_M_mode & ios_base::out)
 - 	      __ret = _M_out_beg == _M_out_cur && _M_out_cur == _M_out_end;
 - 	  }
 - 	return __ret;
 -       }
   
     public:
         virtual 
 --- 238,243 ----
 *** ./src/ios.cc.~1~	Wed Apr  3 21:19:20 2002
 --- ./src/ios.cc	Thu Apr 18 02:50:33 2002
 *************** namespace std 
 *** 150,163 ****
       int __out_bufsize = __sync ? 0 : static_cast<int>(BUFSIZ);
       int __in_bufsize = __sync ? 1 : static_cast<int>(BUFSIZ);
   
 - #if _GLIBCPP_AVOID_FSEEK
 -     // Platforms that prefer to avoid fseek() calls on streams only
 -     // get their desire when the C++-layer input buffer size is 1.
 -     // This hack hurts performance but keeps correctness across
 -     // all types of streams that might be attached to (e.g.) cin.
 -     __in_bufsize = 1;
 - #endif
 - 
       // NB: The file globals.cc creates the four standard files
       // with NULL buffers. At this point, we swap out the dummy NULL
       // [io]stream objects and buffers with the real deal.
 --- 150,155 ----
 *** ./testsuite/27_io/filebuf_virtuals.cc.~1~	Mon Jan 28 16:51:59 2002
 --- ./testsuite/27_io/filebuf_virtuals.cc	Thu Apr 18 02:29:12 2002
 *************** void test05() 
 *** 443,449 ****
 --- 443,451 ----
     c3 = fb_03.sputc('\n');
     strmsz_1 = fb_03.sputn("because because because. . .", 28);  
     VERIFY( strmsz_1 == 28 );
 +   // Defect?  retval of sungetc is not necessarily the character
     c1 = fb_03.sungetc();
 +   c1 = fb_03.sgetc();
     fb_03.pubsync(); 
     c3 = fb_03.sgetc();
     VERIFY( c1 == c3 );
 
 --=-=-=--


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-15 16:16 Jason Merrill
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Merrill @ 2002-04-15 16:16 UTC (permalink / raw)
  To: jason; +Cc: gcc-prs

The following reply was made to PR libstdc++/4150; it has been noted by GNATS.

From: Jason Merrill <jason@redhat.com>
To: libstdc++@gcc.gnu.org
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Date: Tue, 16 Apr 2002 00:15:10 +0100

 >>>>> "Jason" == Jason Merrill <jason@redhat.com> writes:
 
 > is there any reason why we can't leave the file pointer at the end of the
 > last read block, rather than at the beginning as we do now?
 
 Which is what libio does, and which would seem to be the only way to avoid
 this sort of problem.  Can anyone enlighten me as to the design
 goals/decisions of the current implementation?
 
 Continuing to hack...
 
 Jason


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-15  7:42 jason
  0 siblings, 0 replies; 12+ messages in thread
From: jason @ 2002-04-15  7:42 UTC (permalink / raw)
  To: gcc-bugs, gcc-prs, jason, leitner, nobody

Synopsis: catastrophic performance decrease in C++ code

Responsible-Changed-From-To: unassigned->jason
Responsible-Changed-By: jason
Responsible-Changed-When: Mon Apr 15 07:42:48 2002
Responsible-Changed-Why:
    Working on it

http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=4150


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-14 19:36 Jason Merrill
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Merrill @ 2002-04-14 19:36 UTC (permalink / raw)
  To: nobody; +Cc: gcc-prs

The following reply was made to PR libstdc++/4150; it has been noted by GNATS.

From: Jason Merrill <jason@redhat.com>
To: libstdc++@gcc.gnu.org
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Date: Mon, 15 Apr 2002 03:25:59 +0100

 --=-=-=
 
 Loren suggested defining _GLIBCPP_AVOID_FSEEK as a fix for 4150; I've tried
 it.  It avoids the useless seeks for the stdin case, but it's a rather
 Pyrrhic improvement; it also slows down execution from 0:42 to 1:50 (vs
 0:20 in 2.95).  The comment in ios.cc says that this "hurts" performance; I
 would suggest that verb isn't strong enough.  Perhaps "eviscerates".
 
 But going back to the problem without that define, the basic bug seems to
 be that after basic_filebuf::underflow() reads in a chunk of data, it then
 always rewinds the file to the beginning of that chunk.  The next time it
 gets called, it seeks ahead to the end of that chunk to read the next one.
 And so on.  So we get two useless seeks between each read.
 
 There may be a point to these seeks, but there are no comments in the code
 to justify them.  Simply tearing them out brings us up to 2.95 performance
 levels.  At least seekoff would need to be adjusted accordingly, but is
 there any reason why we can't leave the file pointer at the end of the last
 read block, rather than at the beginning as we do now?
 
 Jason
 
 
 --=-=-=
 Content-Type: text/x-patch
 Content-Disposition: inline
 
 *** include/bits/fstream.tcc.~1~	Fri Apr 12 11:28:30 2002
 --- include/bits/fstream.tcc	Mon Apr 15 02:49:14 2002
 *************** namespace std
 *** 268,280 ****
   	    {
   	      if (__testout)
   		_M_really_overflow();
 ! #if _GLIBCPP_AVOID_FSEEK
 ! 	      else if ((_M_in_cur - _M_in_beg) == 1)
 ! 		_M_file->sys_getc();
 ! #endif
 ! 	      else 
 ! 		_M_file->seekoff(_M_in_cur - _M_in_beg, 
 ! 				 ios_base::cur, ios_base::in);
   	    }
   
   	  if (__testinit || __testget)
 --- 268,275 ----
   	    {
   	      if (__testout)
   		_M_really_overflow();
 ! 	      if (_M_in_cur < _M_in_end)
 ! 		abort ();
   	    }
   
   	  if (__testinit || __testget)
 *************** namespace std
 *** 316,331 ****
   		  if (__testout)
   		    _M_out_cur = _M_in_cur;
   		  __ret = traits_type::to_int_type(*_M_in_cur);
 - #if _GLIBCPP_AVOID_FSEEK
 - 		  if (__elen == 1)
 - 		    _M_file->sys_ungetc(*_M_in_cur);
 - 		  else
 - 		    {
 - #endif
 - 		      _M_file->seekoff(-__elen, ios_base::cur, ios_base::in);
 - #if _GLIBCPP_AVOID_FSEEK
 - 		    }
 - #endif
   		}	   
   	    }
   	}
 --- 311,316 ----
 
 --=-=-=--


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-14 11:13 jason
  0 siblings, 0 replies; 12+ messages in thread
From: jason @ 2002-04-14 11:13 UTC (permalink / raw)
  To: gcc-bugs, gcc-prs, leitner, nobody

Synopsis: catastrophic performance decrease in C++ code

State-Changed-From-To: feedback->analyzed
State-Changed-By: jason
State-Changed-When: Sun Apr 14 11:13:53 2002
State-Changed-Why:
    should not be in feedback

http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=4150


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-11 18:20 ljrittle
  0 siblings, 0 replies; 12+ messages in thread
From: ljrittle @ 2002-04-11 18:20 UTC (permalink / raw)
  To: gcc-bugs, gcc-prs, leitner, nobody

Synopsis: catastrophic performance decrease in C++ code

State-Changed-From-To: analyzed->feedback
State-Changed-By: ljrittle
State-Changed-When: Thu Apr 11 18:20:06 2002
State-Changed-Why:
    The root issue here is related to PR/5820.  If someone
    on a Linux platform could *please* just try my suggested
    fix (which is already used on platforms where it is not
    legal to seek on interactive streams), then this whole
    seeking performance problem will go away entirely for
    you...  See concurrent post to libstdc++ mailing list.

http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=4150


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: libstdc++/4150: catastrophic performance decrease in C++ code
@ 2002-04-11 17:19 rth
  0 siblings, 0 replies; 12+ messages in thread
From: rth @ 2002-04-11 17:19 UTC (permalink / raw)
  To: gcc-bugs, gcc-prs, leitner, nobody

Synopsis: catastrophic performance decrease in C++ code

State-Changed-From-To: open->analyzed
State-Changed-By: rth
State-Changed-When: Thu Apr 11 17:19:06 2002
State-Changed-Why:
    Still present in 3.1.  We're seeking unnecessarily, even after
    adding a sync_with_stdio(false) to the source (which I now attach
    to the PR).  Strace shows
    
    _llseek(4, 8192, {8192}, SEEK_SET)      = 0
    read(4, "AOL (Traffic-Server/1.1.6 [1])\""..., 8192) = 8192
    _llseek(4, 8192, {8192}, SEEK_SET)      = 0
    _llseek(4, 16384, {16384}, SEEK_SET)    = 0
    read(4, "cur=0&skip=10\" \"Mozilla/4.0 (c"..., 8192) = 8192
    _llseek(4, 16384, {16384}, SEEK_SET)    = 0
    
    which indeed agrees with what basic_filebuf<>::underflow is
    doing.  To my mind this is appalling waste.  Surely this is
    not necessary when sync is off?

http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=4150


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2002-04-23 12:23 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-04-16  5:46 libstdc++/4150: catastrophic performance decrease in C++ code Jason Merrill
  -- strict thread matches above, loose matches on Subject: below --
2002-04-23  5:23 jason
2002-04-18  4:06 Jason Merrill
2002-04-18  3:06 Jason Merrill
2002-04-17 21:56 Benjamin Kosnik
2002-04-17 19:26 Jason Merrill
2002-04-15 16:16 Jason Merrill
2002-04-15  7:42 jason
2002-04-14 19:36 Jason Merrill
2002-04-14 11:13 jason
2002-04-11 18:20 ljrittle
2002-04-11 17:19 rth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).