From: Jonathan Wakely
Date: Tue, 8 Nov 2022 17:45:50 +0000
Subject: Re: [PATCH v2] libstdc++: basic_filebuf: don't flush more often than necessary.
To: Charles-Francois Natali
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org

On Mon, 7 Nov 2022 at 17:00, Jonathan Wakely wrote:
>
> On Thu, 6 Oct 2022 at 20:03, Charles-Francois Natali via Libstdc++ wrote:
> >
> > `basic_filebuf::xsputn` would bypass the buffer when passed a chunk of
> > size 1024 and above, seemingly as an optimisation.
> >
> > This can have a significant performance impact if the overhead of a
> > `write` syscall is non-negligible, e.g. on a slow disk, on network
> > filesystems, or simply during IO contention, because instead of
> > flushing every `BUFSIZ` chars (by default), we can flush every 1024
> > chars. The impact is even greater with custom larger buffers, e.g.
> > for network filesystems, because the code could issue `write`, for
> > example, 1000x more often than necessary relative to the buffer size.
> > It also introduces a significant discontinuity in performance when
> > writing chunks of size 1024 and above.
> >
> > See this reproducer, which writes a fixed number of chunks to a file
> > opened with `O_SYNC` - to replicate a high-latency `write` - for
> > varying chunk sizes:
> >
> > ```
> > $ cat test_fstream_flush.cpp
> > #include <cassert>
> > #include <string>
> > #include <vector>
> > #include <ostream>
> > #include <ext/stdio_filebuf.h>
> > #include <fcntl.h>
> >
> > int
> > main(int argc, char* argv[])
> > {
> >   assert(argc == 3);
> >
> >   const auto* path = argv[1];
> >   const auto chunk_size = std::stoul(argv[2]);
> >
> >   // Open with O_SYNC so every write() hits the disk before returning.
> >   const auto fd =
> >     open(path, O_CREAT | O_TRUNC | O_WRONLY | O_SYNC | O_CLOEXEC, 0666);
> >   assert(fd >= 0);
> >
> >   auto filebuf = __gnu_cxx::stdio_filebuf<char>(fd, std::ios_base::out);
> >   auto stream = std::ostream(&filebuf);
> >
> >   const auto chunk = std::vector<char>(chunk_size);
> >
> >   for (auto i = 0; i < 1'000; ++i) {
> >     stream.write(chunk.data(), chunk.size());
> >   }
> >
> >   return 0;
> > }
> > ```
> >
> > ```
> > $ g++ -o /tmp/test_fstream_flush test_fstream_flush.cpp -std=c++17
> > $ for i in $(seq 1021 1025); do echo -e "\n$i"; time /tmp/test_fstream_flush /tmp/foo $i; done
> >
> > 1021
> >
> > real    0m0.997s
> > user    0m0.000s
> > sys     0m0.038s
> >
> > 1022
> >
> > real    0m0.939s
> > user    0m0.005s
> > sys     0m0.032s
> >
> > 1023
> >
> > real    0m0.954s
> > user    0m0.005s
> > sys     0m0.034s
> >
> > 1024
> >
> > real    0m7.102s
> > user    0m0.040s
> > sys     0m0.192s
> >
> > 1025
> >
> > real    0m7.204s
> > user    0m0.025s
> > sys     0m0.209s
> > ```
> >
> > See the huge drop in performance at the 1024-byte boundary.
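To make the 1024-byte cliff concrete, the pre-patch heuristic can be
modelled in a few lines. The following is a simplified sketch with
illustrative names and constants (`bypass_threshold`, `buf_capacity`,
the syscall counter), not the actual libstdc++
`basic_filebuf::xsputn` implementation:

```
#include <cstddef>
#include <cstring>
#include <iostream>

// Toy model of the pre-patch behaviour: chunks of at least
// `bypass_threshold` bytes skip the buffer, so each one costs a
// separate "syscall"; smaller chunks are batched in the buffer.
constexpr std::size_t bypass_threshold = 1024;
constexpr std::size_t buf_capacity = 8192; // BUFSIZ-like default

std::size_t syscalls = 0;
void write_syscall(const char*, std::size_t) { ++syscalls; }

void put(char* buf, std::size_t& used, const char* s, std::size_t n)
{
    if (n >= bypass_threshold) {
        // Flush pending buffered data, then write `s` directly.
        if (used) { write_syscall(buf, used); used = 0; }
        write_syscall(s, n);
    } else {
        // Small chunk: append to the buffer, flushing when full.
        if (used + n > buf_capacity) { write_syscall(buf, used); used = 0; }
        std::memcpy(buf + used, s, n);
        used += n;
    }
}

int main()
{
    char buf[buf_capacity];
    std::size_t used = 0;
    const char chunk[1024] = {};
    for (int i = 0; i < 1000; ++i)
        put(buf, used, chunk, sizeof chunk);
    // Prints 1000: every 1024-byte chunk bypasses the buffer. With
    // 1023-byte chunks the count drops to ~125 (1000 * 1023 / 8192).
    std::cout << syscalls << " syscalls\n";
}
```

With `O_SYNC`, each of those syscalls is a full round trip to the
device, which is where the roughly sevenfold jump in the timings
above comes from.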
> I've finally found time to properly look at this, sorry for the delay.
>
> I thought I was unable to reproduce these numbers, then I realised I'd
> already installed a build with the patch, so I was measuring the
> patched performance for both my "before" and "after" tests. Oops!
>
> My concern is that the patch doesn't only affect files on remote
> filesystems. I assume the original 1024-byte chunking behaviour is
> there for a reason, because for large writes the performance might be
> better if we just write directly instead of buffering and then writing
> again. Assuming we have a fast disk, writing straight to disk avoids
> copying in and out of the buffer. But if we have a slow disk, it's
> better to buffer and reduce the total number of disk writes. I'm
> concerned that the patch optimizes the slow-disk case, potentially at
> a cost to the fast-disk case.
>
> I wonder whether it would make sense to check whether the buffer size
> has been manually changed, i.e. epptr() - pbase() != _M_buf_size. If
> the buffer has been explicitly set by the user, then we should assume
> they really want it to be used, and so not bypass it for writes >=
> 1024.
>
> In the absence of a better idea, I think I'm going to commit the patch
> as-is. I don't see it causing any measurable slowdown for very large
> writes on fast disks, and it's certainly a huge improvement for slow
> disks.

The patch has been pushed to trunk now, thanks for the contribution.

I removed the testcase and results from the commit message as they
don't need to be in the git log. I added a link to your email in
Bugzilla, though, so we can still find it easily.
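For reference, the check floated in the quoted mail could look
roughly like the following. This is a sketch of the idea only, not
the committed patch: `user_buffer` stands in for the real
`epptr() - pbase() != _M_buf_size` test (those are internals of
`basic_filebuf`, not accessible from outside), and the function
mirrors the toy model above rather than libstdc++'s actual code:

```
#include <cstddef>
#include <cstring>

// Sketch of the suggested policy: keep the large-write bypass for
// the default buffer, but always honour a user-supplied one.
// `user_buffer` models `epptr() - pbase() != _M_buf_size`.
void put_checked(char* buf, std::size_t cap, std::size_t& used,
                 bool user_buffer,
                 void (*write_syscall)(const char*, std::size_t),
                 const char* s, std::size_t n)
{
    if (!user_buffer && n >= 1024) {
        // Default buffer, large chunk: flush, then write directly.
        if (used) { write_syscall(buf, used); used = 0; }
        write_syscall(s, n);
        return;
    }
    // User buffer (or a small chunk): go through the buffer and
    // flush only when it is actually full.
    if (used + n > cap) { write_syscall(buf, used); used = 0; }
    if (n > cap) { write_syscall(s, n); return; } // larger than the buffer
    std::memcpy(buf + used, s, n);
    used += n;
}
```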