public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [libstdc++] Improve M_check_len
       [not found]       ` <CACb0b4nt=mNKYS7+Fwnrf0Kq4umLu0M7Z-SotzCbx7=oEs021g@mail.gmail.com>
@ 2023-06-19 15:14         ` Jonathan Wakely
  2023-06-19 15:35         ` Jonathan Wakely
  1 sibling, 0 replies; 8+ messages in thread
From: Jonathan Wakely @ 2023-06-19 15:14 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jan Hubicka, gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 2664 bytes --]

P.S. please CC libstdc++@gcc.gnu.org for all libstdc++ patches.

On Mon, 19 Jun 2023 at 16:13, Jonathan Wakely <jwakely@redhat.com> wrote:

> On Mon, 19 Jun 2023 at 12:20, Jakub Jelinek wrote:
>
>> On Mon, Jun 19, 2023 at 01:05:36PM +0200, Jan Hubicka via Gcc-patches
>> wrote:
>> > -     if (max_size() - size() < __n)
>> > -       __throw_length_error(__N(__s));
>> > +     const size_type __max_size = max_size();
>> > +     // On 64bit systems vectors can not reach overflow by growing
>> > +     // by small sizes; before this happens, we will run out of memory.
>> > +     if (__builtin_constant_p(__n)
>> > +         && __builtin_constant_p(__max_size)
>> > +         && sizeof(ptrdiff_t) >= 8
>> > +         && __max_size * sizeof(_Tp) >= ((ptrdiff_t)1 << 60)
>>
>> Isn't there a risk of overlow in the __max_size * sizeof(_Tp) computation?
>>
>
> For std::allocator, no, because max_size() is size_t(-1) / sizeof(_Tp).
> But for a user-defined allocator that has a silly max_size(), yes, that's
> possible.
>
> I still don't really understand why any change is needed here. The PR says
> that the current _M_check_len brings in the EH code, but how/why does that
> happen? The __throw_length_error function is not inline, it's defined in
> libstdc++.so, so why isn't it just an extern call? Is the problem that it
> makes _M_check_len potentially-throwing? Because that's basically the
> entire point of _M_check_len: to throw the exception that is required by
> the C++ standard. We need to be very careful about removing that required
> throw! And after we call _M_check_len we call allocate unconditionally, so
> _M_realloc_insert can always throw (we only call _M_realloc_insert in the
> case where we've already decided a reallocation is definitely needed).
>
> Would this version of _M_check_len help?
>
>       size_type
>       _M_check_len(size_type __n, const char* __s) const
>       {
>         const size_type __size = size();
>         const size_type __max_size = max_size();
>
>         if (__is_same(allocator_type, allocator<_Tp>)
>               && __size > __max_size / 2)
>           __builtin_unreachable(); // Assume std::allocator can't fill
> memory.
>         else if (__size > __max_size)
>           __builtin_unreachable();
>
>         if (__max_size - __size < __n)
>           __throw_length_error(__N(__s));
>
>         const size_type __len = __size + (std::max)(__size, __n);
>         return (__len < __size || __len > __max_size) ? __max_size : __len;
>       }
>
> This only applies to std::allocator, not user-defined allocators (because
> we don't know their semantics). It also seems like less of a big hack!
>
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [libstdc++] Improve M_check_len
       [not found]       ` <CACb0b4nt=mNKYS7+Fwnrf0Kq4umLu0M7Z-SotzCbx7=oEs021g@mail.gmail.com>
  2023-06-19 15:14         ` [libstdc++] Improve M_check_len Jonathan Wakely
@ 2023-06-19 15:35         ` Jonathan Wakely
  2023-06-20  7:50           ` Jan Hubicka
  1 sibling, 1 reply; 8+ messages in thread
From: Jonathan Wakely @ 2023-06-19 15:35 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jan Hubicka, gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 2367 bytes --]

On Mon, 19 Jun 2023 at 16:13, Jonathan Wakely <jwakely@redhat.com> wrote:

> On Mon, 19 Jun 2023 at 12:20, Jakub Jelinek wrote:
>
>> On Mon, Jun 19, 2023 at 01:05:36PM +0200, Jan Hubicka via Gcc-patches
>> wrote:
>> > -     if (max_size() - size() < __n)
>> > -       __throw_length_error(__N(__s));
>> > +     const size_type __max_size = max_size();
>> > +     // On 64bit systems vectors can not reach overflow by growing
>> > +     // by small sizes; before this happens, we will run out of memory.
>> > +     if (__builtin_constant_p(__n)
>> > +         && __builtin_constant_p(__max_size)
>> > +         && sizeof(ptrdiff_t) >= 8
>> > +         && __max_size * sizeof(_Tp) >= ((ptrdiff_t)1 << 60)
>>
>> Isn't there a risk of overlow in the __max_size * sizeof(_Tp) computation?
>>
>
> For std::allocator, no, because max_size() is size_t(-1) / sizeof(_Tp).
> But for a user-defined allocator that has a silly max_size(), yes, that's
> possible.
>
> I still don't really understand why any change is needed here. The PR says
> that the current _M_check_len brings in the EH code, but how/why does that
> happen? The __throw_length_error function is not inline, it's defined in
> libstdc++.so, so why isn't it just an extern call? Is the problem that it
> makes _M_check_len potentially-throwing? Because that's basically the
> entire point of _M_check_len: to throw the exception that is required by
> the C++ standard. We need to be very careful about removing that required
> throw! And after we call _M_check_len we call allocate unconditionally, so
> _M_realloc_insert can always throw (we only call _M_realloc_insert in the
> case where we've already decided a reallocation is definitely needed).
>
> Would this version of _M_check_len help?
>
>       size_type
>       _M_check_len(size_type __n, const char* __s) const
>       {
>         const size_type __size = size();
>         const size_type __max_size = max_size();
>
>         if (__is_same(allocator_type, allocator<_Tp>)
>               && __size > __max_size / 2)
>

This check is wrong for C++17 and older standards, because max_size()
changed value in C++20.

In C++17 it was PTRDIFF_MAX / sizeof(T) but in C++20 it's SIZE_MAX /
sizeof(T). So on 32-bit targets using C++17, it's possible a std::vector
could use PTRDIFF_MAX/2 bytes, and then the size <= max_size/2 assumption
would not hold.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [libstdc++] Improve M_check_len
  2023-06-19 15:35         ` Jonathan Wakely
@ 2023-06-20  7:50           ` Jan Hubicka
  2023-06-20  8:05             ` Jan Hubicka
  2023-06-20  8:07             ` Jakub Jelinek
  0 siblings, 2 replies; 8+ messages in thread
From: Jan Hubicka @ 2023-06-20  7:50 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, gcc-patches, libstdc++

> >
> >       size_type
> >       _M_check_len(size_type __n, const char* __s) const
> >       {
> >         const size_type __size = size();
> >         const size_type __max_size = max_size();
> >
> >         if (__is_same(allocator_type, allocator<_Tp>)
> >               && __size > __max_size / 2)
> >
> 
> This check is wrong for C++17 and older standards, because max_size()
> changed value in C++20.
> 
> In C++17 it was PTRDIFF_MAX / sizeof(T) but in C++20 it's SIZE_MAX /
> sizeof(T). So on 32-bit targets using C++17, it's possible a std::vector
> could use PTRDIFF_MAX/2 bytes, and then the size <= max_size/2 assumption
> would not hold.

Can we go with this perhaps only for 64bit targets?
I am not sure how completely safe this idea is in 32bit world: I guess
one can have OS that lets you to allocate half of address space as one
allocation.

Thanks!
Honza

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [libstdc++] Improve M_check_len
  2023-06-20  7:50           ` Jan Hubicka
@ 2023-06-20  8:05             ` Jan Hubicka
  2023-06-20  8:07             ` Jakub Jelinek
  1 sibling, 0 replies; 8+ messages in thread
From: Jan Hubicka @ 2023-06-20  8:05 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, gcc-patches, libstdc++

> > >
> > >       size_type
> > >       _M_check_len(size_type __n, const char* __s) const
> > >       {
> > >         const size_type __size = size();
> > >         const size_type __max_size = max_size();
> > >
> > >         if (__is_same(allocator_type, allocator<_Tp>)
> > >               && __size > __max_size / 2)
> > >
> > 
> > This check is wrong for C++17 and older standards, because max_size()
> > changed value in C++20.
> > 
> > In C++17 it was PTRDIFF_MAX / sizeof(T) but in C++20 it's SIZE_MAX /
> > sizeof(T). So on 32-bit targets using C++17, it's possible a std::vector
> > could use PTRDIFF_MAX/2 bytes, and then the size <= max_size/2 assumption
> > would not hold.
> 
> Can we go with this perhaps only for 64bit targets?
> I am not sure how completely safe this idea is in 32bit world: I guess
> one can have OS that lets you to allocate half of address space as one
> allocation.

Perhaps something like:
  size > std::min ((uint64_t)__max_size, ((uint64_t)1 << 62) / sizeof (_Tp))
is safe for all allocators and 32bit, so we won't need __is_same test
and test for 64bit?

Honza
> 
> Thanks!
> Honza

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [libstdc++] Improve M_check_len
  2023-06-20  7:50           ` Jan Hubicka
  2023-06-20  8:05             ` Jan Hubicka
@ 2023-06-20  8:07             ` Jakub Jelinek
  2023-06-20  8:21               ` Andreas Schwab
  1 sibling, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2023-06-20  8:07 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jonathan Wakely, gcc-patches, libstdc++

On Tue, Jun 20, 2023 at 09:50:25AM +0200, Jan Hubicka wrote:
> > >
> > >       size_type
> > >       _M_check_len(size_type __n, const char* __s) const
> > >       {
> > >         const size_type __size = size();
> > >         const size_type __max_size = max_size();
> > >
> > >         if (__is_same(allocator_type, allocator<_Tp>)
> > >               && __size > __max_size / 2)
> > >
> > 
> > This check is wrong for C++17 and older standards, because max_size()
> > changed value in C++20.
> > 
> > In C++17 it was PTRDIFF_MAX / sizeof(T) but in C++20 it's SIZE_MAX /
> > sizeof(T). So on 32-bit targets using C++17, it's possible a std::vector
> > could use PTRDIFF_MAX/2 bytes, and then the size <= max_size/2 assumption
> > would not hold.
> 
> Can we go with this perhaps only for 64bit targets?
> I am not sure how completely safe this idea is in 32bit world: I guess
> one can have OS that lets you to allocate half of address space as one
> allocation.

Is it safe even on 64bit targets?  I mean, doesn't say PowerPC already allow
full 64-bit virtual address space?  The assumption that one can't have
more than half of virtual address space allocations is true right now at
least on x86-64, aarch64 and others, but isn't that something that can
change with newer versions of CPUs without the need to recompile
applications (add another level or two of page tables)?
By being hardcoded in libstdc++ headers those assumptions will be hardcoded
in lots of applications.

	Jakub


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [libstdc++] Improve M_check_len
  2023-06-20  8:07             ` Jakub Jelinek
@ 2023-06-20  8:21               ` Andreas Schwab
  2023-06-20 10:45                 ` Jonathan Wakely
  0 siblings, 1 reply; 8+ messages in thread
From: Andreas Schwab @ 2023-06-20  8:21 UTC (permalink / raw)
  To: Jakub Jelinek via Gcc-patches
  Cc: Jan Hubicka, Jakub Jelinek, Jonathan Wakely, libstdc++

On Jun 20 2023, Jakub Jelinek via Gcc-patches wrote:

> Is it safe even on 64bit targets?  I mean, doesn't say PowerPC already allow
> full 64-bit virtual address space?  The assumption that one can't have
> more than half of virtual address space allocations is true right now at
> least on x86-64, aarch64 and others, but isn't that something that can
> change with newer versions of CPUs without the need to recompile
> applications (add another level or two of page tables)?

At least s390 can allocate more than half the address space.  That
triggered a failure in gawk.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [libstdc++] Improve M_check_len
  2023-06-20  8:21               ` Andreas Schwab
@ 2023-06-20 10:45                 ` Jonathan Wakely
  2023-06-20 10:50                   ` Jonathan Wakely
  0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Wakely @ 2023-06-20 10:45 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Jakub Jelinek via Gcc-patches, Jan Hubicka, Jakub Jelinek, libstdc++

[-- Attachment #1: Type: text/plain, Size: 1035 bytes --]

On Tue, 20 Jun 2023 at 09:21, Andreas Schwab wrote:

> On Jun 20 2023, Jakub Jelinek via Gcc-patches wrote:
>
> > Is it safe even on 64bit targets?  I mean, doesn't say PowerPC already
> allow
> > full 64-bit virtual address space?  The assumption that one can't have
> > more than half of virtual address space allocations is true right now at
> > least on x86-64, aarch64 and others, but isn't that something that can
> > change with newer versions of CPUs without the need to recompile
> > applications (add another level or two of page tables)?
>
> At least s390 can allocate more than half the address space.  That
> triggered a failure in gawk.
>

Is PTRDIFF_MAX large enough to represent the difference between any two
pointers?

What we're considering for libstdc++ is treating PTRDIFF_MAX as an upper
limit on allocation size. If there are targets that can really allocate a
2^63 byte array, they won't be able to represent the difference between the
first element and the last element unless ptrdiff_t is wider than 64 bits.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [libstdc++] Improve M_check_len
  2023-06-20 10:45                 ` Jonathan Wakely
@ 2023-06-20 10:50                   ` Jonathan Wakely
  0 siblings, 0 replies; 8+ messages in thread
From: Jonathan Wakely @ 2023-06-20 10:50 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Jakub Jelinek via Gcc-patches, Jan Hubicka, Jakub Jelinek, libstdc++

[-- Attachment #1: Type: text/plain, Size: 1456 bytes --]

On Tue, 20 Jun 2023 at 11:45, Jonathan Wakely <jwakely@redhat.com> wrote:

> On Tue, 20 Jun 2023 at 09:21, Andreas Schwab wrote:
>
>> On Jun 20 2023, Jakub Jelinek via Gcc-patches wrote:
>>
>> > Is it safe even on 64bit targets?  I mean, doesn't say PowerPC already
>> allow
>> > full 64-bit virtual address space?  The assumption that one can't have
>> > more than half of virtual address space allocations is true right now at
>> > least on x86-64, aarch64 and others, but isn't that something that can
>> > change with newer versions of CPUs without the need to recompile
>> > applications (add another level or two of page tables)?
>>
>> At least s390 can allocate more than half the address space.  That
>> triggered a failure in gawk.
>>
>
> Is PTRDIFF_MAX large enough to represent the difference between any two
> pointers?
>
> What we're considering for libstdc++ is treating PTRDIFF_MAX as an upper
> limit on allocation size. If there are targets that can really allocate a
> 2^63 byte array, they won't be able to represent the difference between the
> first element and the last element unless ptrdiff_t is wider than 64 bits.
>

Of course if we're talking about 32-bit targets then you only need a 64-bit
ptrdiff_t to allow arrays larger than 2^31 bytes. In any case, PTRDIFF_MAX
is still an upper limit (although for a 64-bit ptrdiff_t and a 32-bit
address space, it's not a useful limit, because it's much much larger than
the real limit).

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-06-20 10:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <ZI9MmdQ+OMehcdeg@kam.mff.cuni.cz>
     [not found] ` <CACb0b4kgHajswPwumWtjLeXOMt75tt0jJhmb1-JZ4wzrfXvB4A@mail.gmail.com>
     [not found]   ` <ZJA2gJzMkigiwqkZ@kam.mff.cuni.cz>
     [not found]     ` <ZJA5/fuVUNpkc3ms@tucnak>
     [not found]       ` <CACb0b4nt=mNKYS7+Fwnrf0Kq4umLu0M7Z-SotzCbx7=oEs021g@mail.gmail.com>
2023-06-19 15:14         ` [libstdc++] Improve M_check_len Jonathan Wakely
2023-06-19 15:35         ` Jonathan Wakely
2023-06-20  7:50           ` Jan Hubicka
2023-06-20  8:05             ` Jan Hubicka
2023-06-20  8:07             ` Jakub Jelinek
2023-06-20  8:21               ` Andreas Schwab
2023-06-20 10:45                 ` Jonathan Wakely
2023-06-20 10:50                   ` Jonathan Wakely

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).