public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/108886] New: Add basic_string throw logic_error when assigned a nullptr
@ 2023-02-22 14:31 jg at jguk dot org
  2023-02-22 15:57 ` [Bug libstdc++/108886] " redi at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: jg at jguk dot org @ 2023-02-22 14:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108886

            Bug ID: 108886
           Summary: Add basic_string throw logic_error when assigned a
                    nullptr
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jg at jguk dot org
  Target Milestone: ---

Checked this on godbolt trunk today.
https://godbolt.org/z/6xxEc85c9

basic_string.h will throw a logic_error at runtime if a nullptr gets through to
the basic_string() constructor. But assignment doesn't throw a logic_error, it
gives SEGV.

Could I suggest two improvements please:

1) Add throw logic_error to basic_string.h:815 if pointer is nullptr.
      _GLIBCXX20_CONSTEXPR
      basic_string&
      operator=(const _CharT* __s)
      { return this->assign(__s); }

2) Add throw logic_error to basic_string:1647

_GLIBCXX20_CONSTEXPR
      basic_string&
      assign(const _CharT* __s)
      {
        __glibcxx_requires_string(__s);
        return _M_replace(size_type(0), this->size(), __s,
                          traits_type::length(__s));
      }


This is what basic_string.h has for normal construction
std::__throw_logic_error(__N("basic_string: "
                                       "construction from null is not valid"));



The basic_string assignment = uses char_traits to check the length using
__builtin_strlen and then SEGV.

I believe it is the actual __builtin_strlen that does the nullptr dereference.

GDB output
Core was generated by `./str2'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/strlen-sse2.S:142
142     ../sysdeps/x86_64/multiarch/strlen-sse2.S: No such file or directory.
(gdb) bt
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/strlen-sse2.S:142
#1  0x000055ca94a8327e in std::char_traits<char>::length (__s=0x0) at
/usr/include/c++/12/bits/char_traits.h:395
#2  std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::assign (__s=0x0, this=0x7fff26f1e370) at
/usr/include/c++/12/bits/basic_string.h:1647
#3  std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::operator= (__s=0x0, this=0x7fff26f1e370) at
/usr/include/c++/12/bits/basic_string.h:815
#4  make_string (str=str@entry=0x0, out_string="") at str2.cpp:8
#5  0x000055ca94a832d9 in main () at str2.cpp:15






Therefore I propose

1) Add throw logic_error to basic_string.h:815 if pointer is nullptr.
      _GLIBCXX20_CONSTEXPR
      basic_string&
      operator=(const _CharT* __s)
      { return this->assign(__s); }

2) Add throw logic_error to basic_string:1647

_GLIBCXX20_CONSTEXPR
      basic_string&
      assign(const _CharT* __s)
      {
        __glibcxx_requires_string(__s);
        return _M_replace(size_type(0), this->size(), __s,
                          traits_type::length(__s));
      }

3) I don't think char_traits allows exceptions, so I can't suggest a
logic_error
Is there anything else that could be added here? Maybe just a
_GLIBCXX_DEBUG_PEDASSERT ? strlen() doesn't have a way to even set errno and
return -1


This is what char_traits.h has


/usr/include/c++/12/bits/char_traits.h:395:25: runtime error: null pointer
passed as argument 1, which is declared to never be null
AddressSanitizer:DEADLYSIGNAL

/usr/include/c++/12/bits/char_traits.h

      static _GLIBCXX17_CONSTEXPR size_t
      length(const char_type* __s)
      {
#if __cplusplus >= 201703L
        if (std::__is_constant_evaluated())
          return __gnu_cxx::char_traits<char_type>::length(__s);
#endif
        return __builtin_strlen(__s);
      }






Example:

#include <string>
#include <cstdio>

void make_string(const char * const str, std::string & out_string)
{
    out_string = str;
}

int main()
{
    const char * a = NULL;
    std::string str;
    make_string(a, str);
    printf("%s\n", str.c_str());
}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/108886] Add basic_string throw logic_error when assigned a nullptr
  2023-02-22 14:31 [Bug libstdc++/108886] New: Add basic_string throw logic_error when assigned a nullptr jg at jguk dot org
@ 2023-02-22 15:57 ` redi at gcc dot gnu.org
  2023-02-26 23:02 ` jg at jguk dot org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2023-02-22 15:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108886

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Why are you suggesting adding a check in two places when the first one just
calls the second one?

What would be the point of _GLIBCXX_DEBUG_PEDASSERT when there's already a
debug assertion there? Compiling with _GLIBCXX_DEBUG will already abort.

I don't see the point of adding anything to char_traits at all. It's a very
low-level interface and you probably shouldn't be (and aren't) using it
directly anyway.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/108886] Add basic_string throw logic_error when assigned a nullptr
  2023-02-22 14:31 [Bug libstdc++/108886] New: Add basic_string throw logic_error when assigned a nullptr jg at jguk dot org
  2023-02-22 15:57 ` [Bug libstdc++/108886] " redi at gcc dot gnu.org
@ 2023-02-26 23:02 ` jg at jguk dot org
  2023-02-27  9:23 ` redi at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jg at jguk dot org @ 2023-02-26 23:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108886

--- Comment #2 from Jonny Grant <jg at jguk dot org> ---
(In reply to Jonathan Wakely from comment #1)
> Why are you suggesting adding a check in two places when the first one just
> calls the second one?

Feels clearer in the callstack if operator= throws when nullptr is passed.
Likewise if assign() is called with nullptr it looks clearer if the throw is
from there. Of course it may be enough just to put it in assign() if that is
your preference.

I was taught to validate parameters at University. Personally I always follow
defensive programming approaches to avoid crashes. So I would check parameters
on all interface methods and operators. I would rely on the compiler to remove
any unnecessary duplicate sanity checks.

> What would be the point of _GLIBCXX_DEBUG_PEDASSERT when there's already a
> debug assertion there? Compiling with _GLIBCXX_DEBUG will already abort.

I don't see a debug assertion for _GLIBCXX_DEBUG_PEDASSERT  could you point out
the file and line number to me please.

Just compiled with -D_GLIBCXX_DEBUG but I don't get any abort, just the same
SEGV
https://godbolt.org/z/rjYG8Yrnh

> 
> I don't see the point of adding anything to char_traits at all. It's a very
> low-level interface and you probably shouldn't be (and aren't) using it
> directly anyway.

Sounds reasonable, it's true I'm not using char_traits.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/108886] Add basic_string throw logic_error when assigned a nullptr
  2023-02-22 14:31 [Bug libstdc++/108886] New: Add basic_string throw logic_error when assigned a nullptr jg at jguk dot org
  2023-02-22 15:57 ` [Bug libstdc++/108886] " redi at gcc dot gnu.org
  2023-02-26 23:02 ` jg at jguk dot org
@ 2023-02-27  9:23 ` redi at gcc dot gnu.org
  2023-03-12 23:03 ` jg at jguk dot org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2023-02-27  9:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108886

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jonny Grant from comment #2)
> I was taught to validate parameters at University. Personally I always
> follow defensive programming approaches to avoid crashes.

So stop passing null to these functions then :-)

> So I would check
> parameters on all interface methods and operators. I would rely on the
> compiler to remove any unnecessary duplicate sanity checks.

That doesn't necessarily work when the member functions are separately
compiled, as with most std::string member functions.

> > What would be the point of _GLIBCXX_DEBUG_PEDASSERT when there's already a
> > debug assertion there? Compiling with _GLIBCXX_DEBUG will already abort.
> 
> I don't see a debug assertion for _GLIBCXX_DEBUG_PEDASSERT  could you point
> out the file and line number to me please.

You already quoted it in your comment 0 above, it's right there in assign(const
_CharT*)!

      basic_string&
      assign(const _CharT* __s)
      {
        __glibcxx_requires_string(__s);


> Just compiled with -D_GLIBCXX_DEBUG but I don't get any abort, just the same
> SEGV
> https://godbolt.org/z/rjYG8Yrnh

If you want a PEDASSERT to fire you need to actually request pedantic
assertions.

https://godbolt.org/z/874x18G1G

/opt/compiler-explorer/gcc-trunk-20230227/include/c++/13.0.1/bits/basic_string.h:1645:
constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc>&
std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::assign(const _CharT*)
[with _CharT = char; _Traits = std::char_traits<char>; _Alloc =
std::allocator<char>]: Assertion '__s != nullptr' failed.

I'm not persuaded to change anything here. The performance of string
assignments is very important and adding an extra branch and throwing an
exception isn't free.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/108886] Add basic_string throw logic_error when assigned a nullptr
  2023-02-22 14:31 [Bug libstdc++/108886] New: Add basic_string throw logic_error when assigned a nullptr jg at jguk dot org
                   ` (2 preceding siblings ...)
  2023-02-27  9:23 ` redi at gcc dot gnu.org
@ 2023-03-12 23:03 ` jg at jguk dot org
  2023-03-13 10:20 ` redi at gcc dot gnu.org
  2023-03-13 10:26 ` redi at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jg at jguk dot org @ 2023-03-12 23:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108886

--- Comment #4 from Jonny Grant <jg at jguk dot org> ---
(In reply to Jonathan Wakely from comment #3)
> (In reply to Jonny Grant from comment #2)
> > I was taught to validate parameters at University. Personally I always
> > follow defensive programming approaches to avoid crashes.
>
> So stop passing null to these functions then :-)

Feels like we have a different opinion.  I know you know more about C++ than
me.

Defensive programming is carefulness in libraries too; particularly interfaces.
The C++ standard group have a paper "P2698R0 Unconditional termination is a
serious problem" you may have seen.

What I see is that the nullptr keyword is part of the language, and is used in
many places to indicate an object is not available. For me, a function that
requires a valid pointer, should check for the nullptr constant. The crash is
the code that de-references the nullptr because it requires a valid pointer. Of
course I would like all code to check for nullptr before passing to
std::string, and I would do those checks myself as well :-)

A nullptr is different from passing a use-after-free pointer.

Wrapping std::string is the workaround, as it doesn't always throw logic_error
for the nullptr constant.

It would just use up time when the STL implementation could handle it, as need
to check every method on std::string in the codebase to check for nullptr,
because it doesn't throw logic_error say for assign or other operators.

> > So I would check
> > parameters on all interface methods and operators. I would rely on the
> > compiler to remove any unnecessary duplicate sanity checks.
>
> That doesn't necessarily work when the member functions are separately
> compiled, as with most std::string member functions.

Ok, that's a shame. If it showed up in some performance measurements it could
be adjusted.

> > > What would be the point of _GLIBCXX_DEBUG_PEDASSERT when there's already a
> > > debug assertion there? Compiling with _GLIBCXX_DEBUG will already abort.
> >
> > I don't see a debug assertion for _GLIBCXX_DEBUG_PEDASSERT  could you point
> > out the file and line number to me please.
>
> You already quoted it in your comment 0 above, it's right there in
> assign(const _CharT*)!
>
>       basic_string&
>       assign(const _CharT* __s)
>       {
> 	__glibcxx_requires_string(__s);

Ok I see. Thank you for that example __glibcxx_requires_string is a macro that
compiles down to _GLIBCXX_DEBUG_PEDASSERT. I wonder why it wasn't written in
capitals as __GLIBCXX_REQUIRES_STRING_ASSERT?

Anyway, it is good there are asserts that report nullptr issues at runtime.

> > Just compiled with -D_GLIBCXX_DEBUG but I don't get any abort, just the same
> > SEGV
> > https://godbolt.org/z/rjYG8Yrnh
>
> If you want a PEDASSERT to fire you need to actually request pedantic
> assertions.
>
> https://godbolt.org/z/874x18G1G
>
> /opt/compiler-explorer/gcc-trunk-20230227/include/c++/13.0.1/bits/
> basic_string.h:1645: constexpr std::__cxx11::basic_string<_CharT, _Traits,
> _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::assign(const
> _CharT*) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc =
> std::allocator<char>]: Assertion '__s != nullptr' failed.
>
> I'm not persuaded to change anything here. The performance of string
> assignments is very important and adding an extra branch and throwing an
> exception isn't free.

It sounds like you know the implementation really well.

Personally on embedded systems I don't find the cost of checking for NULL an
issue. The 'throw' should not impact performance, as it is within the if() and
code shouldn't often throw, still better than SEGV! Maybe you have some
performance comparison statistics showing that a NULL check really impacts
performance? STL string heap allocations and actual memory read/write are the
biggest performance impact on performance in my experience on embedded.

I doubt all uses of std::string are performance intensive. For performance
critical code identified by a profiler I just re-write that function in C.
Usually graphics code.

There is what might be a performance issue, I don't know if anyone has
measured, _M_construct() appears to copy copies byte by byte in
basic_string.tcc:177 rather than using an optimized memcpy(). I've not measured
this myself.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/108886] Add basic_string throw logic_error when assigned a nullptr
  2023-02-22 14:31 [Bug libstdc++/108886] New: Add basic_string throw logic_error when assigned a nullptr jg at jguk dot org
                   ` (3 preceding siblings ...)
  2023-03-12 23:03 ` jg at jguk dot org
@ 2023-03-13 10:20 ` redi at gcc dot gnu.org
  2023-03-13 10:26 ` redi at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2023-03-13 10:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108886

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jonny Grant from comment #4)
> It sounds like you know the implementation really well.

I wrote it.

> I doubt all uses of std::string are performance intensive.

Just because some aren't, doesn't mean we can ignore the ones that are.

> For performance
> critical code identified by a profiler I just re-write that function in C.

No. std::string is probably the most widely used vocabulary type in the entire
C++ standard library, and "just use C instead" is utterly unacceptable as a
response to performance concerns.

> Usually graphics code.
> 
> There is what might be a performance issue, I don't know if anyone has
> measured, _M_construct() appears to copy copies byte by byte in
> basic_string.tcc:177 rather than using an optimized memcpy(). I've not
> measured this myself.

You can't use memcpy from an input iterator, the data to be copied isn't
contiguous in memory.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug libstdc++/108886] Add basic_string throw logic_error when assigned a nullptr
  2023-02-22 14:31 [Bug libstdc++/108886] New: Add basic_string throw logic_error when assigned a nullptr jg at jguk dot org
                   ` (4 preceding siblings ...)
  2023-03-13 10:20 ` redi at gcc dot gnu.org
@ 2023-03-13 10:26 ` redi at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2023-03-13 10:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108886

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jonny Grant from comment #4)
> Defensive programming is carefulness in libraries too; particularly
> interfaces.

And libstdc++ tries to check every checkable precondition with debug
assertions.

> The C++ standard group have a paper "P2698R0 Unconditional termination is a
> serious problem" you may have seen.

That's about C++ contracts. When the library starts to use contracts, you'll be
able to choose how your program handles contract violations, without forcing
that choice on everybody. Until then, libstdc++ uses assertions for
precondition checks.

I'm not going to change basic_string::assign to throw.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-03-13 10:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-22 14:31 [Bug libstdc++/108886] New: Add basic_string throw logic_error when assigned a nullptr jg at jguk dot org
2023-02-22 15:57 ` [Bug libstdc++/108886] " redi at gcc dot gnu.org
2023-02-26 23:02 ` jg at jguk dot org
2023-02-27  9:23 ` redi at gcc dot gnu.org
2023-03-12 23:03 ` jg at jguk dot org
2023-03-13 10:20 ` redi at gcc dot gnu.org
2023-03-13 10:26 ` redi at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).