From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 11139 invoked by alias); 16 Apr 2002 12:46:09 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 11079 invoked by uid 71); 16 Apr 2002 12:46:06 -0000
Date: Tue, 16 Apr 2002 05:46:00 -0000
Message-ID: <20020416124606.11077.qmail@sources.redhat.com>
To: jason@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org,
From: Jason Merrill
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Reply-To: Jason Merrill
X-SW-Source: 2002-04/txt/msg00830.txt.bz2
List-Id:

The following reply was made to PR libstdc++/4150; it has been noted by GNATS.

From: Jason Merrill
To: rittle@labs.mot.com
Cc: libstdc++@gcc.gnu.org, rth@redhat.com, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
Date: Tue, 16 Apr 2002 11:02:46 +0100

>>>>> "Loren" == Loren James Rittle writes:

> Under the current architecture (which I have only ever tweaked for
> performance, compliance and QoS of interactive cases not dictated by
> standard), the whole reason for the backup to the point before the
> read is that until a character is actually consumed by the
> higher-layer of libstdc++-v3 IO, the lower-layer C stdio file-pointer
> must not appear to move forward w.r.t. other C stdio. Granted it
> seems less-than-ideal to always use that algorithm even when not
> sync'd to stdio.

Then I suppose we should use a buffer size of 0.

> It is why I told RTH the other day in an e-mail that I thought some
> basic re-architecture would be required to solve all performance
> issues related to outstanding libstdc++-v3 PRs. When the higher-layer
> knows it will consume more than X characters in sync'd IO cases, it
> should be able to pull >1 character ([...] architecture limits us to
> pulls of 1 character).
> Or, if the higher-layer knows it is looking for a newline character
> (another very common case), it should be able to use the C stdio
> optimized routine to pull >1 character from the lower layer (bounded
> only by newline or the provided buffer size, aka the fgets function
> call). Under a re-architecture, it seems to me that only when the
> higher-layer of libstdc++-v3 is in a scanning mode not directly
> supported by libc that it must conditionally pull 1 character at a
> time through the layer when sync'd to stdio.

Makes sense to me.

> Now, I actually have no idea if the abstraction layer dictated by the
> standard even allows these optimizations. I looked at this situation >6
> months ago and I actually think not.

Why not? It seems to me that the optimizations you suggest would conform
fine to the spec for xs{put,get}n. The spec for basic_streambuf::xsgetn
talks about implementation "as if" by repeated calls to sbumpc, but then
also says that derived classes can provide more efficient implementations.
To optimize getline, we'd need to introduce a virtual helper function in
streambuf, but I don't see any reason why that would violate the standard.

> With your patch (plus the removal of the related _GLIBCPP_AVOID_FSEEK
> region in src/ios.cc), I see one automatic regression here:

> assertion "(off_2 == (off_1 + 2 + 1 + 1))" failed: file
> "[...]/27_io/filebuf_virtuals.cc", line 428
> FAIL: 27_io/filebuf_virtuals.cc execution test

Yep, I'm aware of that. I knew that the patch I posted was incomplete; it
was meant more as a concrete illustration of my proposal. I'm still
working on it.

> [1] I don't know if this is widely known information thus I want to
>     make sure you tested my patch to enable _GLIBCPP_AVOID_FSEEK on
>     Linux properly. If you bootstrap all of gcc, then when
>     libstdc++-v3 is built, it will be built with flags set by
>     top-level Makefile (nominally, `-O2 -g'). If you later run make
>     in libstdc++-v3, it will rebuild (some/all?)
>     files with `-O0 -g' (except stuff built in libmath which appears
>     to get top-level flags)... IMHO, the only way to test performance
>     patches in libstdc++-v3, is to `rm -rf /libstdc++-v3' and rerun
>     make at top-level. This way libstdc++-v3 is built exactly as when
>     it is bootstrapped.

I'm aware of the difference; in all cases, I was building without
optimization, on the assumption that the calls to the C layer would be
where we were spending our time. So I was comparing apples to apples, but
perhaps not the most useful apples. :)

Jason
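
The xsgetn point in the message above can be sketched roughly as follows.
This is a minimal illustration only, not code from libstdc++-v3 or from the
patch under discussion; the class name stdio_streambuf and its member file_
are hypothetical. It assumes an unbuffered streambuf over a C FILE* (the
"buffer size of 0" idea), peeks with getc/ungetc so the FILE* position does
not appear to move until a character is consumed, and overrides xsgetn so
that a bulk read becomes one fread call instead of repeated sbumpc calls.

```cpp
#include <cstddef>
#include <cstdio>
#include <streambuf>

// Hypothetical unbuffered streambuf over a C FILE*; all names here are
// illustrative assumptions, not the libstdc++-v3 implementation.
class stdio_streambuf : public std::streambuf {
public:
    explicit stdio_streambuf(std::FILE* f) : file_(f) {}

protected:
    // Peek: read one character and push it straight back, so other C
    // stdio users do not see the file position move.
    int_type underflow() override {
        int c = std::getc(file_);
        if (c == EOF) return traits_type::eof();
        std::ungetc(c, file_);
        return traits_type::to_int_type(static_cast<char>(c));
    }

    // Consume exactly one character from the C layer.
    int_type uflow() override {
        int c = std::getc(file_);
        if (c == EOF) return traits_type::eof();
        return traits_type::to_int_type(static_cast<char>(c));
    }

    // Bulk read: same observable result as n calls to sbumpc, but done
    // with a single call into the C layer.
    std::streamsize xsgetn(char* s, std::streamsize n) override {
        if (n <= 0) return 0;
        return static_cast<std::streamsize>(
            std::fread(s, 1, static_cast<std::size_t>(n), file_));
    }

private:
    std::FILE* file_;  // not owned; the caller closes it
};
```

With this sketch, sgetn(buf, n) on the buffer forwards to xsgetn and so to
one fread, while sgetc and sbumpc still move one character at a time. A
getline fast path built on fgets would need an extra virtual hook, as the
message notes, and is not shown here.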