public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO
@ 2023-09-08 22:52 foom at fuhm dot net
  2023-09-09  3:28 ` [Bug libstdc++/111351] " arthur.j.odwyer at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: foom at fuhm dot net @ 2023-09-08 22:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111351

            Bug ID: 111351
           Summary: constexpr std::string objects permitted to escape
                    constant evaluation when SSO
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: foom at fuhm dot net
  Target Milestone: ---

In C++20, libstdc++ currently allows an std::string instance to escape from
constant evaluation to runtime, as long as the string fit within the SSO
length.

E.g., as a global variable, compiled with -std=c++20:
constexpr std::string s1; // OK
constexpr std::string s1 = ""; // OK
constexpr std::string s1 = "0123456789abcde"; // OK
constexpr std::string s2 = "0123456789abcdef"; // FAIL

I believe all of the above ought to fail to compile.

This will result in user code which can be built or not based on whether their
string happens to fit within the SSO string length. I find that quite
unfortunate, since it is supposed to be an internal implementation
detail/optimization, and this makes it effectively part of the API that code
will grow to depend on.

As comparison, libc++ rejects all the above examples, by forcing the SSO-size
to zero in constant evaluation context, so that a pointer to an external
allocation is always used.

This was brought to my attention from 
https://quuxplusone.github.io/blog/2023/09/08/constexpr-string-firewall/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/111351] constexpr std::string objects permitted to escape constant evaluation when SSO
  2023-09-08 22:52 [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO foom at fuhm dot net
@ 2023-09-09  3:28 ` arthur.j.odwyer at gmail dot com
  2023-09-11 12:31 ` de34 at live dot cn
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: arthur.j.odwyer at gmail dot com @ 2023-09-09  3:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111351

--- Comment #1 from Arthur O'Dwyer <arthur.j.odwyer at gmail dot com> ---
(Author of the blog post here.)
In contrast to James' view, I think the libstdc++/MSVC behavior is relatively
easy to explain; I think libc++'s `if consteval` approach is baroque and
confusing. [That is, _both_ behaviors are confusing to the newbie and need
expert explanation, but libc++'s choice is confusing even for the experts, who
have to maintain its split-brain SSO logic forever because Hyrum's Law. If you
have to maintain something forever, you should at least choose to make it
_simple_! As I say in the blog post, in hindsight I think libc++ screwed up
here.]

IMHO it is a feature, not a bug, that I can write these lines:

    constinit std::string s1;
    constinit std::vector<char> v1;

libstdc++ would be within its rights, paper-Standard-wise, to reject both of
these lines; but I don't think libstdc++ _should_ reject either of them.
They're both fine code as far as I'm concerned. I think libc++ is the
user-hostile/broken implementation here, not libstdc++.

Anyone who thinks libstdc++ ought to reject `s1` above should at least be
forced to explain what libstdc++ ought to do about `v1`. From the
user-programmer's POV, there's no difference between a default-initialized
string and a default-initialized vector. Users don't care about these SSO
details; they just want the code to work. That's what libstdc++ currently does.
Good, IMO.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/111351] constexpr std::string objects permitted to escape constant evaluation when SSO
  2023-09-08 22:52 [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO foom at fuhm dot net
  2023-09-09  3:28 ` [Bug libstdc++/111351] " arthur.j.odwyer at gmail dot com
@ 2023-09-11 12:31 ` de34 at live dot cn
  2023-09-11 13:23 ` redi at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: de34 at live dot cn @ 2023-09-11 12:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111351

Jiang An <de34 at live dot cn> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |de34 at live dot cn

--- Comment #2 from Jiang An <de34 at live dot cn> ---
> This will result in user code which can be built or not based on whether their string happens to fit within the SSO string length. I find that quite unfortunate, since it is supposed to be an internal implementation detail/optimization, and this makes it effectively part of the API that code will grow to depend on.

This comes from the nature of SSO and constexpr variables. I don't think it
worthwhile to revert this PR (https://github.com/microsoft/STL/pull/1735) for
MSVC STL - which makes the codes harder to understand and slows down the
compilation.

> As comparison, libc++ rejects all the above examples, by forcing the SSO-size to zero in constant evaluation context, so that a pointer to an external allocation is always used.


libc++ is, unfortunately, currently unable to implement constexpr SSO, IIUC. I
failed to find a constexpr-compatible way to determine libc++'s std::string is
in short mode if the short mode were enabled in constant evaluation. (As a
result, libc++'s string always uses long mode during constant evaluation.)

https://github.com/llvm/llvm-project/blob/0954dc3fb9214b994623f5306473de075f8e3593/libcxx/include/string#L829-L837

Note that there's no tag outside of the union, so we can't know which variant
member is active without causing constant evaluation failure - unless the
mechanism of std::is_within_lifetime (https://wg21.link/p2641r4) gets
implemented.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/111351] constexpr std::string objects permitted to escape constant evaluation when SSO
  2023-09-08 22:52 [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO foom at fuhm dot net
  2023-09-09  3:28 ` [Bug libstdc++/111351] " arthur.j.odwyer at gmail dot com
  2023-09-11 12:31 ` de34 at live dot cn
@ 2023-09-11 13:23 ` redi at gcc dot gnu.org
  2023-09-11 20:28 ` foom at fuhm dot net
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: redi at gcc dot gnu.org @ 2023-09-11 13:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111351

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Does using __builtin_is_constant_p on the union member not work?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/111351] constexpr std::string objects permitted to escape constant evaluation when SSO
  2023-09-08 22:52 [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO foom at fuhm dot net
                   ` (2 preceding siblings ...)
  2023-09-11 13:23 ` redi at gcc dot gnu.org
@ 2023-09-11 20:28 ` foom at fuhm dot net
  2023-09-12 21:43 ` foom at fuhm dot net
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: foom at fuhm dot net @ 2023-09-11 20:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111351

--- Comment #4 from James Y Knight <foom at fuhm dot net> ---
vector and string are different in one key way: a zero-sized vector has no
accessible storage, while a zero-sized string has 1 byte of readable storage --
the NUL terminator. Because of that, I don't think it's unreasonable for a
zero-length vector to be constinit'able, while a zero-length string is not.

But, certainly the _more_ concerning issue is with non-zero-sized strings where
the validity of the program depends on what exact SSO size was chosen by the
implementation.

> libc++'s choice is confusing even for the experts, who have to maintain its split-brain SSO logic forever because Hyrum's Law

"Hyrum's Law" is exactly why I think it's a mistake to permit SSO-allocated
strings. It makes the SSO length a critical part of the interface. People will
start writing code which assumes that a constexpr global string of size "N"
works, and that will cause problems for other standard libraries which use a
different SSO size "M", if M < N. E.g. if libc++ starts allowing this, then
people who first target libc++ will find that strings up to 22 characters work,
and be surprised/annoyed by libstdc++ failing to build their program.

It is much simpler to say to users "you cannot make a constexpr std::string
unless it lives fully within constant-evaluation-time." then to also add "...OR
unless the size is short enough, where that size depends on your
implementation."

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/111351] constexpr std::string objects permitted to escape constant evaluation when SSO
  2023-09-08 22:52 [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO foom at fuhm dot net
                   ` (3 preceding siblings ...)
  2023-09-11 20:28 ` foom at fuhm dot net
@ 2023-09-12 21:43 ` foom at fuhm dot net
  2023-09-12 23:14 ` arthur.j.odwyer at gmail dot com
  2023-09-27  2:09 ` foom at fuhm dot net
  6 siblings, 0 replies; 8+ messages in thread
From: foom at fuhm dot net @ 2023-09-12 21:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111351

--- Comment #5 from James Y Knight <foom at fuhm dot net> ---
> Does using __builtin_is_constant_p on the union member not work?

I've created a proof-of-concept patch for libc++ to support SSO strings during
constant evaluation. It works.

If everyone disagrees with me and believes that this is a really awesome
foot-gun to give to users, I will go ahead and propose that patch to libc++
maintainers. (As mentioned, that'll cause more code to be compilable under
libc++ than is possible to permit under libstdc++/MSVC implementations).

However, I continue to believe the opposite outcome, prohibiting this
everywhere, would be preferable.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/111351] constexpr std::string objects permitted to escape constant evaluation when SSO
  2023-09-08 22:52 [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO foom at fuhm dot net
                   ` (4 preceding siblings ...)
  2023-09-12 21:43 ` foom at fuhm dot net
@ 2023-09-12 23:14 ` arthur.j.odwyer at gmail dot com
  2023-09-27  2:09 ` foom at fuhm dot net
  6 siblings, 0 replies; 8+ messages in thread
From: arthur.j.odwyer at gmail dot com @ 2023-09-12 23:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111351

--- Comment #6 from Arthur O'Dwyer <arthur.j.odwyer at gmail dot com> ---
(In reply to James Y Knight from comment #5)
> > Does using __builtin_is_constant_p on the union member not work?
> 
> I've created a proof-of-concept patch for libc++ to support SSO strings
> during constant evaluation. It works.
> 
> If everyone disagrees with me and believes that this is a really awesome
> foot-gun to give to users, I will go ahead and propose that patch to libc++
> maintainers. (As mentioned, that'll cause more code to be compilable under
> libc++ than is possible to permit under libstdc++/MSVC implementations).

FWIW #1: Personally I would be weakly in favor of that patch, but I would also
be pessimistic about its chances of getting accepted in the current libc++
climate.

FWIW #2: A worst-of-both-worlds option ;) would be for your patch to `if
consteval` the SSO buffer size so that it would be 24 at runtime (matching
libc++'s current behavior) but 16 at compile time (matching libstdc++ and
Microsoft if I'm not mistaken, so you'd get your cross-vendor portability at
compile time). *I* would still consider that an unnecessary-and-thus-bad
crippling of libc++ string's cool 24-byte-SSO feature; but I could imagine
someone else finding it more palatable than any other alternative. 
["Worst-of-both-worlds" in the sense that you're paying to change the code at
all, but the end result still has two codepaths that both need to be
maintained, and divergence between compile-time and runtime SSO behavior.]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/111351] constexpr std::string objects permitted to escape constant evaluation when SSO
  2023-09-08 22:52 [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO foom at fuhm dot net
                   ` (5 preceding siblings ...)
  2023-09-12 23:14 ` arthur.j.odwyer at gmail dot com
@ 2023-09-27  2:09 ` foom at fuhm dot net
  6 siblings, 0 replies; 8+ messages in thread
From: foom at fuhm dot net @ 2023-09-27  2:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111351

--- Comment #7 from James Y Knight <foom at fuhm dot net> ---
On the libc++ side, a suggestion was given that instead of making this an
_error_, we could instead emit a warning if "a constexpr or constinit object is
a basic_string or contains a basic_string subobject, or the definition of a
constexpr or constinit variable extends the lifetime of a temporary object that
meets the previous condition."

I think that was a really great suggestion -- diagnosing via a warning is a
nicer solution than putting is_constant_evaluated hacks into the library (as
MSVC had, and libc++ currently has but will likely remove).

One could either hardcode std::basic_string for the diagnostic, or add a new
type-attribute to permit any type to opt-in to such a warning. You'd want to
use it if you have a type where you don't _intend_ to support
constant-initialization, but where it may sometimes be possible as an
implementation detail, and you want to tell users not to rely on that
implementation detail.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-09-27  2:09 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-08 22:52 [Bug libstdc++/111351] New: constexpr std::string objects permitted to escape constant evaluation when SSO foom at fuhm dot net
2023-09-09  3:28 ` [Bug libstdc++/111351] " arthur.j.odwyer at gmail dot com
2023-09-11 12:31 ` de34 at live dot cn
2023-09-11 13:23 ` redi at gcc dot gnu.org
2023-09-11 20:28 ` foom at fuhm dot net
2023-09-12 21:43 ` foom at fuhm dot net
2023-09-12 23:14 ` arthur.j.odwyer at gmail dot com
2023-09-27  2:09 ` foom at fuhm dot net

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).