public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/108645] New: Change in behavior, std::accumulate doesn't always work as expected in C++20 builds
@ 2023-02-02 21:00 eteran at alum dot rit.edu
  2023-02-02 21:05 ` [Bug libstdc++/108645] " eteran at alum dot rit.edu
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: eteran at alum dot rit.edu @ 2023-02-02 21:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108645

            Bug ID: 108645
           Summary: Change in behavior, std::accumulate doesn't always
                    work as expected in C++20 builds
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eteran at alum dot rit.edu
  Target Milestone: ---

I encountered an interesting change in behavior today involving playing with
std::move_iterator types using the following code:

```
#include <vector>
#include <iostream>
#include <string>
#include <numeric>
#include <iterator>

void print_v(const char *rem, const std::vector<std::string> &v) {
        std::cout << rem;
        for (const std::string &s : v)
                std::cout << '"' << s << '"' << ' ';
        std::cout << '\n';
}

int main() {

        std::vector<std::string> v = {"this", "_", "is", "_", "an", "_",
"example"};

        print_v("Old contents of the vector: ", v);

        std::string concat =
std::accumulate(std::make_move_iterator(v.begin()),
                                                                               
 std::make_move_iterator(v.end()),
                                                                               
 std::string());


        print_v("New contents of the vector: ", v);
        std::cout << "Concatenated as string: " << '"' << concat << '"' <<
'\n';
}
```

The expected output is:

```
Old contents of the vector: "this" "_" "is" "_" "an" "_" "example" 
New contents of the vector: "" "" "" "" "" "" "" 
Concatenated as string: "this_is_an_example"
```

And that is the output I get when compiling with `-std=c++17`. So the
expectation is that the moved from strings become empty.

However, when using C++20, where `std::accumulate` is now defined to use
`std::move`, the output is this instead:

```
Old contents of the vector: "this" "_" "is" "_" "an" "_" "example" 
New contents of the vector: "this" "_" "is" "_" "an" "_" "example" 
Concatenated as string: "this_is_an_example"
```

A few thoughts:

1. I'm not sure this is a "bug" since moved from objects are in an unspecified,
valid state. Being "unchanged" seems to fit that description just fine.

2. I'm guessing that this is a weird situation where the `operator+` overload
that has a `basic_string &&` and the lhs isn't stealing the guts as expected
for some reason.

Tested with both clang and gcc trunk, so it seems to be a library level issue.

C++20 Example: https://godbolt.org/z/7MMqWbxEr
C++17 Example: https://godbolt.org/z/6YnrrE5zf

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libstdc++/108645] Change in behavior, std::accumulate doesn't always work as expected in C++20 builds
  2023-02-02 21:00 [Bug libstdc++/108645] New: Change in behavior, std::accumulate doesn't always work as expected in C++20 builds eteran at alum dot rit.edu
@ 2023-02-02 21:05 ` eteran at alum dot rit.edu
  2023-02-03  1:40 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: eteran at alum dot rit.edu @ 2023-02-02 21:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108645

--- Comment #1 from Evan Teran <eteran at alum dot rit.edu> ---
To further experiment, i factored out `std::accumulate`:

```
#include <vector>
#include <iostream>
#include <string>
#include <iterator>

void print_v(const char *rem, const std::vector<std::string> &v) {
        std::cout << rem;
        for (const std::string &s : v)
                std::cout << '"' << s << '"' << ' ';
        std::cout << '\n';
}

int main() {

        std::vector<std::string> v = {"this", "_", "is", "_", "an", "_",
"example"};

        print_v("Old contents of the vector: ", v);

        std::string concat;
        auto first = std::make_move_iterator(v.begin());
        auto last  = std::make_move_iterator(v.end());
        for (; first != last; ++first) {
#if __cplusplus >= 202002L
                concat = std::move(concat) + *first;
#else
                concat = concat + *first;
#endif
        }


        print_v("New contents of the vector: ", v);
        std::cout << "Concatenated as string: " << '"' << concat << '"' <<
'\n';
}
```

Which results in the same behavior, so it appears to be that the:

```
basic_string operator+(basic_string &&, basic_string &&)
```

Overload doesn't steal the guts of the rhs at all?

But the 

```
basic_string operator+(const basic_string &, basic_string &&)
```

overload does?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libstdc++/108645] Change in behavior, std::accumulate doesn't always work as expected in C++20 builds
  2023-02-02 21:00 [Bug libstdc++/108645] New: Change in behavior, std::accumulate doesn't always work as expected in C++20 builds eteran at alum dot rit.edu
  2023-02-02 21:05 ` [Bug libstdc++/108645] " eteran at alum dot rit.edu
@ 2023-02-03  1:40 ` pinskia at gcc dot gnu.org
  2023-02-03  9:32 ` redi at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-02-03  1:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108645

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>Tested with both clang and gcc trunk, so it seems to be a library level issue.

I tested clang with -stdlib=libc++ and it produces the same results as without
that option (which is it uses gcc's libstdc++v3) so it might not be a library
level issue.
https://godbolt.org/z/1qhjMKWhG

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libstdc++/108645] Change in behavior, std::accumulate doesn't always work as expected in C++20 builds
  2023-02-02 21:00 [Bug libstdc++/108645] New: Change in behavior, std::accumulate doesn't always work as expected in C++20 builds eteran at alum dot rit.edu
  2023-02-02 21:05 ` [Bug libstdc++/108645] " eteran at alum dot rit.edu
  2023-02-03  1:40 ` pinskia at gcc dot gnu.org
@ 2023-02-03  9:32 ` redi at gcc dot gnu.org
  2023-02-03 10:01 ` redi at gcc dot gnu.org
  2023-02-03 10:08 ` redi at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: redi at gcc dot gnu.org @ 2023-02-03  9:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108645

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Evan Teran from comment #1)
> Which results in the same behavior, so it appears to be that the:
> 
> ```
> basic_string operator+(basic_string &&, basic_string &&)
> ```
> 
> Overload doesn't steal the guts of the rhs at all?

It does if:
- the allocators are equal,
- the LHS does not have sufficient capacity for the result, and
- the RHS does have sufficient capacity for the result.

In your example, all your strings fit in the SSO buffer inside the std::string
object, so the LHS has sufficient capacity for the result. And there's nothing
to steal from the RHS anyway, as it doesn't own any allocated memory.

The observed behaviour is not a bug, because as you say, leaving the RHS
non-empty is a valid state. The implementation is pretty clear:

    {
#if _GLIBCXX_USE_CXX11_ABI
      using _Alloc_traits = allocator_traits<_Alloc>;
      bool __use_rhs = false;
      if _GLIBCXX17_CONSTEXPR (typename _Alloc_traits::is_always_equal{})
        __use_rhs = true;
      else if (__lhs.get_allocator() == __rhs.get_allocator())
        __use_rhs = true;
      if (__use_rhs)
#endif
        {
          const auto __size = __lhs.size() + __rhs.size();
          if (__size > __lhs.capacity() && __size <= __rhs.capacity())
            return std::move(__rhs.insert(0, __lhs));
        }
      return std::move(__lhs.append(__rhs));
    }

If the RHS object is used (because all the conditions above are true) then it
will be moved into the return value and so left empty.

If the LHS is used then the RHS is unchanged.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libstdc++/108645] Change in behavior, std::accumulate doesn't always work as expected in C++20 builds
  2023-02-02 21:00 [Bug libstdc++/108645] New: Change in behavior, std::accumulate doesn't always work as expected in C++20 builds eteran at alum dot rit.edu
                   ` (2 preceding siblings ...)
  2023-02-03  9:32 ` redi at gcc dot gnu.org
@ 2023-02-03 10:01 ` redi at gcc dot gnu.org
  2023-02-03 10:08 ` redi at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: redi at gcc dot gnu.org @ 2023-02-03 10:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108645

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jonathan Wakely from comment #3)
> 	  const auto __size = __lhs.size() + __rhs.size();
> 	  if (__size > __lhs.capacity() && __size <= __rhs.capacity())
> 	    return std::move(__rhs.insert(0, __lhs));

It would be possible to change that logic to:

          const auto __size = __lhs.size() + __rhs.size();
          if (__size <= __rhs.capacity() && __rhs.capacity() >
__lhs.capacity())
            return std::move(__rhs.insert(0, __lhs));

That way if both strings have sufficient capacity, we would return the larger
of the two capacities. In general, the returned string is more likely to be the
one that gets reused, and this way it would be more likely to have additional
spare capacity.

In some usage patterns this would be a pessimization, because it would tend to
coalesce all the larger capacities into one string, leaving smaller, less
useful strings behind. Also, inserting into the beginning of the RHS takes more
work than simply appending to the end of the LHS. So I don't think it's clear
that this would be a definite improvement.

It wouldn't change anything in your example anyway, because there is no
dynamically allocated memory anywhere except the accumulator value, which is
always the LHS.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug libstdc++/108645] Change in behavior, std::accumulate doesn't always work as expected in C++20 builds
  2023-02-02 21:00 [Bug libstdc++/108645] New: Change in behavior, std::accumulate doesn't always work as expected in C++20 builds eteran at alum dot rit.edu
                   ` (3 preceding siblings ...)
  2023-02-03 10:01 ` redi at gcc dot gnu.org
@ 2023-02-03 10:08 ` redi at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: redi at gcc dot gnu.org @ 2023-02-03 10:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108645

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jonathan Wakely from comment #3)
> In your example, all your strings fit in the SSO buffer inside the
> std::string object, so the LHS has sufficient capacity for the result.

Actually that's only true until the last step of the algorithm. On the last
step (when appending "example" to "this_is_an_") neither the LHS nor the RHS
has capacity for the result, so a reallocation is needed. But it's true that
the RHS has nothing to steal at any step in the algorithm, so we append to the
LHS and let that reallocate itself if needed.

> And
> there's nothing to steal from the RHS anyway, as it doesn't own any
> allocated memory.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-02-03 10:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-02 21:00 [Bug libstdc++/108645] New: Change in behavior, std::accumulate doesn't always work as expected in C++20 builds eteran at alum dot rit.edu
2023-02-02 21:05 ` [Bug libstdc++/108645] " eteran at alum dot rit.edu
2023-02-03  1:40 ` pinskia at gcc dot gnu.org
2023-02-03  9:32 ` redi at gcc dot gnu.org
2023-02-03 10:01 ` redi at gcc dot gnu.org
2023-02-03 10:08 ` redi at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).