public inbox for gcc-bugs@sourceware.org
* [Bug c++/99805] New: filesystem::path::parent_path got a wrong path
From: drfeng08 at gmail dot com @ 2021-03-29  3:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

            Bug ID: 99805
           Summary: filesystem::path::parent_path got a wrong path
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: drfeng08 at gmail dot com
  Target Milestone: ---

[work@centos7 ~]$ uname -a
Linux centos7 3.10.0-1160.11.1.el7.x86_64 #1 SMP Fri Dec 18 16:34:56 UTC 2020
x86_64 x86_64 x86_64 GNU/Linux
[work@centos7 ~]$ g++ --version
g++ (GCC) 10.2.1 20200804 (Red Hat 10.2.1-2)
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[work@centos7 ~]$ which g++
/opt/rh/devtoolset-10/root/usr/bin/g++
[work@centos7 ~]$ cat ans8.cpp
#include <iostream>
#include <filesystem>
#include <string>

std::string get_path() {
    static std::string path = "/ssd1/opt/stdpain/workspace";
    return path;
}

int main() {
    std::filesystem::path path(get_path());
    std::filesystem::path path2 = path.parent_path();
    std::cout << path << std::endl;
    std::cout << path2 << std::endl;
}

output:
/ssd1/opt/stdpain/workspace
/ssd1/opt/stdpain/workspace

expected:
/ssd1/opt/stdpain/workspace
/ssd1/opt/stdpain

* [Bug c++/99805] filesystem::path::parent_path got a wrong path
From: redi at gcc dot gnu.org @ 2021-03-29 15:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I cannot reproduce this with upstream GCC. This looks like a bug in
devtoolset-10, so I'll report it to Red Hat's bugzilla instead.

* [Bug c++/99805] filesystem::path::parent_path got a wrong path
From: drfeng08 at gmail dot com @ 2021-04-01  6:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

--- Comment #2 from hao an <drfeng08 at gmail dot com> ---
(In reply to Jonathan Wakely from comment #1)
> I cannot reproduce this with upstream GCC. This looks like a bug in
> devtoolset-10, so I'll report it to Red Hat's bugzilla instead.

Hi, I can reproduce it on Ubuntu; you need to add the flag "-D_GLIBCXX_USE_CXX11_ABI=0".

```
root@D4152DC50BFA208:~# g++ ans8.cpp -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0
root@D4152DC50BFA208:~# ./a.out
"/ssd1/opt/stdpain/workspace"
"/ssd1/opt/stdpain/workspace"

root@D4152DC50BFA208:~# g++ ans8.cpp -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=1
root@D4152DC50BFA208:~# ./a.out
"/ssd1/opt/stdpain/workspace"
"/ssd1/opt/stdpain"

```
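
To check which std::string implementation a given binary was built against, a small helper along these lines may help. It is only a sketch: `string_abi()` is a name made up for this example, and it assumes that libstdc++ defines the `_GLIBCXX_USE_CXX11_ABI` macro once any standard header has been included.

```cpp
#include <string>

// Hypothetical helper (not part of libstdc++): report which std::string
// ABI this translation unit was compiled against.
const char* string_abi()
{
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI == 0
    return "cow";      // old reference-counted (copy-on-write) std::string
#elif defined(_GLIBCXX_USE_CXX11_ABI)
    return "cxx11";    // new std::string ABI
#else
    return "unknown";  // macro not defined, probably not libstdc++
#endif
}
```

Printing the result next to the two paths in ans8.cpp makes it easier to tell which of the two runs above a given toolchain corresponds to.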

* [Bug c++/99805] filesystem::path::parent_path got a wrong path
From: redi at gcc dot gnu.org @ 2021-04-01 18:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org      |redi at gcc dot gnu.org
   Last reconfirmed|                            |2021-04-01
             Status|UNCONFIRMED                 |ASSIGNED

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Ah yes, that explains why devtoolset reproduces it. Thanks.

* [Bug c++/99805] [9/10/11 Regression] filesystem::path::parent_path got a wrong path
From: redi at gcc dot gnu.org @ 2021-04-01 18:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |10.2.0, 11.0, 9.3.0
   Target Milestone|---                         |9.4
            Summary|filesystem::path::parent_pa |[9/10/11 Regression]
                   |th got a wrong path         |filesystem::path::parent_pa
                   |                            |th got a wrong path
      Known to work|                            |8.4.1

* [Bug c++/99805] [9/10/11 Regression] filesystem::path::parent_path got a wrong path
From: redi at gcc dot gnu.org @ 2021-04-01 20:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Wow, this is tricksy.

The bug happens when parsing the string into a path. The path is split into
components and the offset of each component from the beginning of the string is
stored, so that parent_path() can efficiently erase the part of the native()
string that refers to the last component. The offset is calculated like this:

          for (auto& c : buf)
            {
              auto pos = c.str.data() - _M_pathname.data();
              ::new(output++) _Cmpt(c.str, c.type, pos);
              ++_M_cmpts._M_impl->_M_size;
            }

The bug is that, for the COW std::string, calling the non-const
_M_pathname.data() causes the string to be reallocated, which changes the
address at which the string is stored. The string_view c.str then no longer
refers to the same data as _M_pathname. We get a bogus offset for the
components, and when we try to remove the last component to get the
parent_path(), we don't remove anything from the string.
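
The offset calculation can be sketched outside libstdc++ roughly as follows. This is an illustration, not the code in fs_path.cc: `component_offsets` is a hypothetical helper, and it takes the base pointer through a const reference so that the string views and the base pointer always refer to the same buffer, whichever std::string ABI is in use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper, not libstdc++ code: split `pathname` on '/' and
// return the offset of each non-empty component from the start of the string.
std::vector<std::size_t> component_offsets(std::string& pathname)
{
    // Const access: this must not unshare or reallocate the string.
    const char* const base = std::as_const(pathname).data();
    const std::string_view whole(base, pathname.size());

    std::vector<std::size_t> offsets;
    std::size_t pos = 0;
    while (pos < whole.size()) {
        if (whole[pos] == '/') {   // skip separators
            ++pos;
            continue;
        }
        std::size_t end = whole.find('/', pos);
        if (end == std::string_view::npos)
            end = whole.size();
        const std::string_view cmpt = whole.substr(pos, end - pos);
        // Both pointers come from the same buffer, so the difference is valid.
        assert(cmpt.data() >= base);
        offsets.push_back(static_cast<std::size_t>(cmpt.data() - base));
        pos = end;
    }
    return offsets;
}
```

If `base` were instead obtained from a non-const `pathname.data()` after the views had been created, a shared COW string could move and the subtraction would compare pointers into two different buffers.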

* [Bug c++/99805] [9/10/11 Regression] filesystem::path::parent_path got a wrong path
From: redi at gcc dot gnu.org @ 2021-04-01 21:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
It would have worked if it used std::as_const(_M_pathname).data() or
_M_pathname.c_str() instead of _M_pathname.data(). It's only the non-const
data() added in C++17 which reallocates (since r261642 anyway).
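
A rough way to see the difference between the accessors is sketched below. The function names are invented for this example. The second function's result is expected to depend on the std::string ABI in use, so no particular value is claimed for it here.

```cpp
#include <string>
#include <utility>

// The const accessors should always return the same address, and calling
// them is not expected to unshare a COW string.
bool const_accessors_agree(std::string& s)
{
    return std::as_const(s).data() == s.c_str();
}

// Reports whether calling the non-const data() on a copy moved its buffer.
// With a COW string the copy may share storage with `original`, so the
// non-const call may have to reallocate; with the new ABI it should not.
bool nonconst_data_moves_buffer()
{
    std::string original = "/ssd1/opt/stdpain/workspace";
    std::string copy = original;                      // may share storage under COW
    const char* before = std::as_const(copy).data();  // const overload
    const char* after = copy.data();                  // non-const overload (C++17)
    return before != after;
}
```

Building this once with -D_GLIBCXX_USE_CXX11_ABI=0 and once with -D_GLIBCXX_USE_CXX11_ABI=1 should show whether the installed libstdc++ behaves as described above.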

* [Bug c++/99805] [9/10/11 Regression] filesystem::path::parent_path got a wrong path
From: cvs-commit at gcc dot gnu.org @ 2021-04-07 15:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:e06d3f5dd7d0c6b4a20fe813e6ee5addd097f560

commit r11-8031-ge06d3f5dd7d0c6b4a20fe813e6ee5addd097f560
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Apr 7 16:05:42 2021 +0100

    libstdc++: Fix filesystem::path construction from COW string [PR 99805]

    Calling the non-const data() member on a COW string makes it "leaked",
    possibly resulting in reallocating the string to ensure a unique owner.

    The path::_M_split_cmpts() member parses its _M_pathname string using
    string_view objects and then calls _M_pathname.data() to find the offset
    of each string_view from the start of the string. However because
    _M_pathname is non-const that will cause a COW string to reallocate if
    it happens to be shared with another string object. This results in the
    offsets calculated for each component being wrong (i.e. undefined)
    because the string views no longer refer to substrings of the
    _M_pathname member. The fix is to use the parse.offset(c) member which
    gets the offset safely.

    The bug only happens for the path(string_type&&) constructor and only
    for COW strings. When constructed from an lvalue string the string's
    contents are copied rather than just incrementing the refcount, so
    there's no reallocation when calling the non-const data() member. The
    testsuite changes check the lvalue case anyway, because we should
    probably change the deep copying to just be a refcount increment (by
    adding a path(const string_type&) constructor or an overload for
    __effective_range(const string_type&), for COW strings only).

    libstdc++-v3/ChangeLog:

            PR libstdc++/99805
            * src/c++17/fs_path.cc (path::_M_split_cmpts): Do not call
            non-const member on _M_pathname, to avoid copy-on-write.
            * testsuite/27_io/filesystem/path/decompose/parent_path.cc:
            Check construction from strings that might be shared.
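
The two construction paths described in the commit message can be exercised with a check along these lines. This is a sketch written for this thread, not the test added to parent_path.cc, and the function names are hypothetical.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Returns a copy of a static string. With a COW std::string the copy
// shares its buffer with the static object.
std::string shared_string()
{
    static const std::string s = "/ssd1/opt/stdpain/workspace";
    return s;
}

// Path built from an rvalue string: the path(string_type&&) case.
fs::path parent_from_rvalue()
{
    return fs::path(shared_string()).parent_path();
}

// Path built from an lvalue string that might be shared.
fs::path parent_from_lvalue()
{
    std::string s = shared_string();
    return fs::path(s).parent_path();
}
```

With a fixed library both functions should return /ssd1/opt/stdpain. On an affected release built with -D_GLIBCXX_USE_CXX11_ABI=0, the rvalue case is the one expected to return the unmodified path.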

* [Bug c++/99805] [9/10 Regression] filesystem::path::parent_path got a wrong path
From: redi at gcc dot gnu.org @ 2021-04-07 15:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[9/10/11 Regression]        |[9/10 Regression]
                   |filesystem::path::parent_pa |filesystem::path::parent_pa
                   |th got a wrong path         |th got a wrong path

--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Fixed on trunk only for now.

* [Bug c++/99805] [9/10 Regression] filesystem::path::parent_path got a wrong path
From: cvs-commit at gcc dot gnu.org @ 2021-04-08 17:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:beb485ddeb066d03b2568bb1bebaa2902b4dbf97

commit r10-9671-gbeb485ddeb066d03b2568bb1bebaa2902b4dbf97
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Apr 7 16:05:42 2021 +0100

    libstdc++: Fix filesystem::path construction from COW string [PR 99805]

    (cherry picked from commit e06d3f5dd7d0c6b4a20fe813e6ee5addd097f560)

* [Bug c++/99805] [9 Regression] filesystem::path::parent_path got a wrong path
From: redi at gcc dot gnu.org @ 2021-04-08 17:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |10.3.1, 11.0
            Summary|[9/10 Regression]           |[9 Regression]
                   |filesystem::path::parent_pa |filesystem::path::parent_pa
                   |th got a wrong path         |th got a wrong path

--- Comment #9 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Fixed for 10.4 too.

* [Bug c++/99805] [9 Regression] filesystem::path::parent_path got a wrong path
From: cvs-commit at gcc dot gnu.org @ 2021-04-19 11:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:056563557e5e6e00d467760744c5b8e8525a06ac

commit r9-9362-g056563557e5e6e00d467760744c5b8e8525a06ac
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Apr 7 16:05:42 2021 +0100

    libstdc++: Fix filesystem::path construction from COW string [PR 99805]

    (cherry picked from commit e06d3f5dd7d0c6b4a20fe813e6ee5addd097f560)

* [Bug c++/99805] [9 Regression] filesystem::path::parent_path got a wrong path
From: redi at gcc dot gnu.org @ 2021-04-19 11:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99805

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #11 from Jonathan Wakely <redi at gcc dot gnu.org> ---
And 9.4 too.
