public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/107376] New: regex executor requires allocator to be default constructible
@ 2022-10-24 12:46 listcrawler at gmail dot com
  2022-10-24 23:33 ` [Bug libstdc++/107376] " redi at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: listcrawler at gmail dot com @ 2022-10-24 12:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107376

            Bug ID: 107376
           Summary: regex executor requires allocator to be default
                    constructible
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: listcrawler at gmail dot com
  Target Milestone: ---

To recreate problem:

#include <regex>
#include <memory>

template <typename T>
struct Alloc
{
    using value_type = T;

    T* allocate(std::size_t n)
    {
        return std::allocator<T>{}.allocate(n);
    }

    void deallocate(T* p, std::size_t n)
    {
        std::allocator<T>{}.deallocate(p, n);
    }

    ~Alloc() = default;
  //explicit Alloc() = default;
    explicit Alloc(int) {}

    Alloc(Alloc const&) = default;
    Alloc(Alloc     && rhs) = default;
    Alloc& operator=(Alloc const&) = default;
    Alloc& operator=(Alloc     && rhs) = default;

    template<typename U> Alloc(Alloc<U> const& rhs) {}
    template<typename U> Alloc(Alloc<U>     && rhs) {}
    template<typename U> Alloc& operator=(Alloc<U> const& rhs) {}
    template<typename U> Alloc& operator=(Alloc<U>     && rhs) {}
};

template<typename T, typename U>
bool operator==(Alloc<T> const&, Alloc<U> const&)
{
    return true;
}

template<typename T, typename U>
bool operator!=(Alloc<T> const&, Alloc<U> const&)
{
    return false;
}

// ===========================================================================

template<typename T> using A = Alloc<T>;

using S        = std::string;
using SIt      = typename S::const_iterator;
using SubMatch = std::sub_match<SIt>;
using Match    = std::match_results<SIt, A<SubMatch>>;

int main()
{
    S          s {"foo"};
    Match      m {A<SubMatch>{0}};
    std::regex r {"foo"};
    std::regex_match(s, m, r);
}

Suggested patch:

diff --git a/libstdc++-v3/include/bits/regex_executor.h
b/libstdc++-v3/include/bits/regex_executor.h
index dc0878ce6..080a1134e 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -71,7 +71,8 @@ namespace __detail
                _ResultsVec&    __results,
                const _RegexT&  __re,
                _FlagT          __flags)
-      : _M_begin(__begin),
+      : _M_cur_results(__results.get_allocator())
+      , _M_begin(__begin),
       _M_end(__end),
       _M_re(__re),
       _M_nfa(*__re._M_automaton),

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/107376] regex executor requires allocator to be default constructible
  2022-10-24 12:46 [Bug libstdc++/107376] New: regex executor requires allocator to be default constructible listcrawler at gmail dot com
@ 2022-10-24 23:33 ` redi at gcc dot gnu.org
  2022-10-25 10:18 ` redi at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2022-10-24 23:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107376

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-10-24

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/107376] regex executor requires allocator to be default constructible
  2022-10-24 12:46 [Bug libstdc++/107376] New: regex executor requires allocator to be default constructible listcrawler at gmail dot com
  2022-10-24 23:33 ` [Bug libstdc++/107376] " redi at gcc dot gnu.org
@ 2022-10-25 10:18 ` redi at gcc dot gnu.org
  2022-10-28 14:27 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2022-10-25 10:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107376

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |redi at gcc dot gnu.org

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Thanks for the report. The patch is necessary, but not sufficient.

We should probably have used the same allocator for _M_rep_count but doing that
now would be an ABI break.

We should definitely have used the same allocator for _M_states, and for its
_ResultsVec members, which would also be an ABI break, but we can mitigate that
with:

--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -124,9 +124,10 @@ namespace __detail
            break;
          std::fill_n(_M_states._M_visited_states, _M_nfa.size(), false);
          auto __old_queue = std::move(_M_states._M_match_queue);
+         auto __alloc = _M_cur_results.get_allocator();
          for (auto& __task : __old_queue)
            {
-             _M_cur_results = std::move(__task.second);
+             _M_cur_results = _ResultsVec(std::move(__task.second), __alloc);
              _M_dfs(__match_mode, __task.first);
            }
          if (__match_mode == _Match_mode::_Prefix)

With this change and the original patch suggestion, _M_cur_results always has
the same allocator as _M_results and so _Executor::_M_handle_accept won't
replace the allocator in _M_results when assigning the new results to it.

We could also use the same allocator for the local vector here:

    bool _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
    _M_lookahead(_StateIdT __next)
    {
      // Backreferences may refer to captured content.
      // We may want to make this faster by not copying,
      // but let's not be clever prematurely.
      _ResultsVec __what(_M_cur_results);

Or we can rely on select_on_container_copy_construction() to do "the right
thing". Users of non-propagating allocators would probably prefer that, but it
means that matching with std::pmr::match_results will always use new/delete for
allocating intermediate results. But that's just one of the costs of the PMR
design.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/107376] regex executor requires allocator to be default constructible
  2022-10-24 12:46 [Bug libstdc++/107376] New: regex executor requires allocator to be default constructible listcrawler at gmail dot com
  2022-10-24 23:33 ` [Bug libstdc++/107376] " redi at gcc dot gnu.org
  2022-10-25 10:18 ` redi at gcc dot gnu.org
@ 2022-10-28 14:27 ` cvs-commit at gcc dot gnu.org
  2022-10-28 14:48 ` redi at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-10-28 14:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107376

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:988dd22ec6665117e8587389ac85389f1c321c45

commit r13-3548-g988dd22ec6665117e8587389ac85389f1c321c45
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Oct 25 13:03:12 2022 +0100

    libstdc++: Fix allocator propagation in regex algorithms [PR107376]

    The PR points out that we assume the match_results allocator is default
    constuctible, which might not be true. We also have a related issue with
    unwanted propagation from an object that might have an unequal
    allocator.

    Ideally we use the same allocator type for _State_info::_M_match_queue
    but that would be an ABI change now. We should investigate if that can
    be done without breaking anything, which might be possible because the
    _Executor object is short-lived and never leaks out of the regex_match,
    regex_search, and regex_replace algorithms. If we change the mangled
    name for _Executor then there would be no ODR violations when mixing old
    and new definitions. This commit does not attempt that.

    libstdc++-v3/ChangeLog:

            PR libstdc++/107376
            * include/bits/regex_executor.h (_Executor::_Executor): Use same
            allocator for _M_cur_results and _M_results.
            * include/bits/regex_executor.tcc (_Executor::_M_main_dispatch):
            Prevent possibly incorrect allocator propagating to
            _M_cur_results.
            * testsuite/28_regex/algorithms/regex_match/107376.cc: New test.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/107376] regex executor requires allocator to be default constructible
  2022-10-24 12:46 [Bug libstdc++/107376] New: regex executor requires allocator to be default constructible listcrawler at gmail dot com
                   ` (2 preceding siblings ...)
  2022-10-28 14:27 ` cvs-commit at gcc dot gnu.org
@ 2022-10-28 14:48 ` redi at gcc dot gnu.org
  2024-03-18 14:06 ` cvs-commit at gcc dot gnu.org
  2024-03-18 14:15 ` redi at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2022-10-28 14:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107376

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |13.0

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Fixed for GCC 13. Might be worth backporting too.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/107376] regex executor requires allocator to be default constructible
  2022-10-24 12:46 [Bug libstdc++/107376] New: regex executor requires allocator to be default constructible listcrawler at gmail dot com
                   ` (3 preceding siblings ...)
  2022-10-28 14:48 ` redi at gcc dot gnu.org
@ 2024-03-18 14:06 ` cvs-commit at gcc dot gnu.org
  2024-03-18 14:15 ` redi at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-03-18 14:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107376

--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:5c156f5be49007a5434452335f29844eb17868a6

commit r12-10258-g5c156f5be49007a5434452335f29844eb17868a6
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Oct 25 13:03:12 2022 +0100

    libstdc++: Fix allocator propagation in regex algorithms [PR107376]

    The PR points out that we assume the match_results allocator is default
    constuctible, which might not be true. We also have a related issue with
    unwanted propagation from an object that might have an unequal
    allocator.

    Ideally we use the same allocator type for _State_info::_M_match_queue
    but that would be an ABI change now. We should investigate if that can
    be done without breaking anything, which might be possible because the
    _Executor object is short-lived and never leaks out of the regex_match,
    regex_search, and regex_replace algorithms. If we change the mangled
    name for _Executor then there would be no ODR violations when mixing old
    and new definitions. This commit does not attempt that.

    libstdc++-v3/ChangeLog:

            PR libstdc++/107376
            * include/bits/regex_executor.h (_Executor::_Executor): Use same
            allocator for _M_cur_results and _M_results.
            * include/bits/regex_executor.tcc (_Executor::_M_main_dispatch):
            Prevent possibly incorrect allocator propagating to
            _M_cur_results.
            * testsuite/28_regex/algorithms/regex_match/107376.cc: New test.

    (cherry picked from commit 988dd22ec6665117e8587389ac85389f1c321c45)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/107376] regex executor requires allocator to be default constructible
  2022-10-24 12:46 [Bug libstdc++/107376] New: regex executor requires allocator to be default constructible listcrawler at gmail dot com
                   ` (4 preceding siblings ...)
  2024-03-18 14:06 ` cvs-commit at gcc dot gnu.org
@ 2024-03-18 14:15 ` redi at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2024-03-18 14:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107376

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|13.0                        |12.4

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Backported for 12.4

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-03-18 14:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-24 12:46 [Bug libstdc++/107376] New: regex executor requires allocator to be default constructible listcrawler at gmail dot com
2022-10-24 23:33 ` [Bug libstdc++/107376] " redi at gcc dot gnu.org
2022-10-25 10:18 ` redi at gcc dot gnu.org
2022-10-28 14:27 ` cvs-commit at gcc dot gnu.org
2022-10-28 14:48 ` redi at gcc dot gnu.org
2024-03-18 14:06 ` cvs-commit at gcc dot gnu.org
2024-03-18 14:15 ` redi at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).