From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 92B423858D3C for ; Tue, 7 Mar 2023 13:13:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 92B423858D3C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678194787; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding:resent-to: resent-from:resent-message-id; bh=tOK23P0VuclmVMhd7GBdKUluPRfMJjWmG+TDIFrbuTQ=; b=DHDcMzSvoNWKo/o13XlRBQzY5J+dMEE6AdRVBq9yWXBAtKz8WeVrCM86Q1AiGalCrB+RSC 8L/SrAgEc4zsPmQJ9M1uuD5UMfpzFEi2TklgztXIPisEhSRgHRtBmjFKWQQrbiqYcQ754i 4eUqopR9o8/4oh7d6bAnU+FhcqDbubY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-448-CP3a5B1jNWeoHouMyMyERA-1; Tue, 07 Mar 2023 08:13:04 -0500 X-MC-Unique: CP3a5B1jNWeoHouMyMyERA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CCA29100F90B; Tue, 7 Mar 2023 13:13:03 +0000 (UTC) Received: from localhost (unknown [10.33.36.174]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9104AC15BAD; Tue, 7 Mar 2023 13:13:03 +0000 (UTC) Resent-From: Jonathan Wakely Resent-Date: Tue, 7 Mar 2023 13:13:02 +0000 Resent-Message-ID: Resent-To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org, Jannik =?iso-8859-1?Q?Gl=FCckert?= Date: Tue, 7 Mar 2023 11:01:47 +0000 From: Jonathan Wakely To: Jannik =?iso-8859-1?Q?Gl=FCckert?= Subject: Re: [PATCH 2/2] libstdc++: use copy_file_range Message-ID: MIME-Version: 1.0 X-Clacks-Overhead: GNU Terry Pratchett X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Mimecast-Spam-Score: 1 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=iso-8859-1; format=flowed Content-Disposition: inline Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 06/03/23 23:11 +0100, Jannik Glückert wrote: >copy_file_range is a recent-ish syscall for copying files. It is similar >to sendfile but allows filesystem-specific optimizations. Common are: >Reflinks: BTRFS, XFS, ZFS (does not implement the syscall yet) >Server-side copy: NFS, SMB > >If copy_file_range is not available for the given files, fall back to >sendfile / userspace copy. Thanks for the patch! Please note the legal prerequisites for GCC contributions: https://gcc.gnu.org/contribute.html#legal >libstdc++-v3/ChangeLog: > > * acinclude.m4 (_GLIBCXX_USE_COPY_FILE_RANGE): define > * config.h.in: Regenerate. > * configure: Regenerate. > * src/filesystem/ops-common.h: use copy_file_range in > std::filesystem::copy_file >--- > libstdc++-v3/acinclude.m4 | 20 ++++++++ > libstdc++-v3/config.h.in | 3 ++ > libstdc++-v3/configure | 62 ++++++++++++++++++++++++ > libstdc++-v3/src/filesystem/ops-common.h | 34 +++++++++++++ > 4 files changed, 119 insertions(+) > >diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 >index 5136c0571e8..ca09e1d22db 100644 >--- a/libstdc++-v3/acinclude.m4 >+++ b/libstdc++-v3/acinclude.m4 >@@ -4581,6 +4581,7 @@ dnl _GLIBCXX_USE_UTIMENSAT > dnl _GLIBCXX_USE_ST_MTIM > dnl _GLIBCXX_USE_FCHMOD > dnl _GLIBCXX_USE_FCHMODAT >+dnl _GLIBCXX_USE_COPY_FILE_RANGE > dnl _GLIBCXX_USE_SENDFILE > dnl HAVE_LINK > dnl HAVE_READLINK >@@ -4718,6 +4719,25 @@ dnl > if test $glibcxx_cv_fchmodat = yes; then > AC_DEFINE(_GLIBCXX_USE_FCHMODAT, 1, [Define if fchmodat is available in .]) > fi >+dnl >+ AC_CACHE_CHECK([for copy_file_range that can copy files], >+ glibcxx_cv_copy_file_range, [dnl >+ case "${target_os}" in >+ linux*) >+ GCC_TRY_COMPILE_OR_LINK( >+ [#include ], >+ [copy_file_range(1, NULL, 2, NULL, 1, 0);], This should either include for NULL, or use nullptr. >+ [glibcxx_cv_copy_file_range=yes], >+ [glibcxx_cv_copy_file_range=no]) >+ ;; >+ *) >+ glibcxx_cv_copy_file_range=no >+ ;; >+ esac >+ ]) >+ if test $glibcxx_cv_copy_file_range = yes; then >+ AC_DEFINE(_GLIBCXX_USE_COPY_FILE_RANGE, 1, [Define if copy_file_range is available in .]) >+ fi > dnl > AC_CACHE_CHECK([for sendfile that can copy files], > glibcxx_cv_sendfile, [dnl >diff --git a/libstdc++-v3/src/filesystem/ops-common.h b/libstdc++-v3/src/filesystem/ops-common.h >index d8afc6a4d64..0491dc8d811 100644 >--- a/libstdc++-v3/src/filesystem/ops-common.h >+++ b/libstdc++-v3/src/filesystem/ops-common.h >@@ -49,6 +49,9 @@ > #ifdef NEED_DO_COPY_FILE > # include > # include >+# ifdef _GLIBCXX_USE_COPY_FILE_RANGE >+# include // copy_file_range >+# endif > # ifdef _GLIBCXX_USE_SENDFILE > # include // sendfile > # endif >@@ -358,6 +361,24 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM > } > > #ifdef NEED_DO_COPY_FILE >+#ifdef _GLIBCXX_USE_COPY_FILE_RANGE >+ bool >+ copy_file_copy_file_range(int fd_in, int fd_out, size_t length) noexcept >+ { >+ size_t bytes_left = length; >+ off_t offset = 0; >+ ssize_t bytes_copied; >+ do { >+ bytes_copied = ::copy_file_range(fd_in, &offset, fd_out, NULL, bytes_left, 0); If there's a good reason to use &offset here instead of nullptr then I think we need a comment explaining it. I initially thought the point was to not adjust the file offset for fd_in, so that if copy_file_range fails after the first call, we retry using sendfile or filebufs and restart copying from the beginning again. But copy_file_range will have updated the file offset of fd_out in that case, and so sendfile/filebuf would start copying from the beginning of fd_in but write to the end of fd_out, producing an incorrect copy. Either we should pass &off_in and &off_out, so that neither file offset is updated, or we should rewind the output file offset before retrying with sendfile/filebuf. Your first patch removes the existing code that seeks the filebuf positions to account for any partial sendfile copying. I'm not sure the code is correct after removing that. >+ if (bytes_copied < 0) >+ { >+ return false; >+ } >+ bytes_left -= bytes_copied; >+ } while (bytes_left > 0 && bytes_copied > 0); This will break out of the loop of copy_file_range returns zero, even if bytes_left != 0. It's not clear from the man page whether that can happen (it says it will return zero if the file offset of fd_in is at EOF, but it doesn't say it *won't* return zero otherwise). This does match the loop condition in the man page's example though. >+ return true; >+ } >+#endif > #if defined _GLIBCXX_USE_SENDFILE && ! defined _GLIBCXX_FILESYSTEM_IS_WINDOWS > bool > copy_file_sendfile(int fd_in, int fd_out, size_t length) noexcept >@@ -518,6 +539,19 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM > > bool has_copied = false; > >+#ifdef _GLIBCXX_USE_COPY_FILE_RANGE >+ if (!has_copied) >+ has_copied = copy_file_copy_file_range(in.fd, out.fd, from_st->st_size); >+ if (!has_copied) >+ { >+ if (errno != EFBIG && errno != EOPNOTSUPP && errno != EOVERFLOW && errno != EXDEV) Shouldn't this be checking ENOSYS not EOPNOTSUPP? If copy_file_range fails with EFBIG of EOVERFLOW, won't sendfile and the filebuf copy fail in the same way? Are there actually error conditions where copy_file_range will succeed on the first call, then fail on a subsequent call (e.g. with EFBIG), but then sendfile or filebuf would succeed? If not, we can just report an error for EFBIG and EOVERFLOW. >+ { >+ ec.assign(errno, std::generic_category()); >+ return false; >+ } >+ } >+#endif >+ > #if defined _GLIBCXX_USE_SENDFILE && ! defined _GLIBCXX_FILESYSTEM_IS_WINDOWS > if (!has_copied) > has_copied = copy_file_sendfile(in.fd, out.fd, from_st->st_size); >-- >2.39.2