public inbox for gcc-bugs@sourceware.org
* [Bug c++/108626] New: GCC doesn't combine string literals for const char*const and const char b[]
@ 2023-02-01  9:50 marat at slonopotamus dot org
  2023-02-01 16:29 ` [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[] pinskia at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: marat at slonopotamus dot org @ 2023-02-01  9:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

            Bug ID: 108626
           Summary: GCC doesn't combine string literals for const
                    char*const and const char b[]
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marat at slonopotamus dot org
  Target Milestone: ---

Given the following code


----
#include <stdio.h>

static const char* const a = "bla";
static const char b[] = "bla";

int main() {
    puts(a);
    puts(b);
}
----

GCC 12.2 with -O2 produces

----
.LC0:
        .string "bla"
main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:.LC0
        call    puts
        mov     edi, OFFSET FLAT:_ZL1b
        call    puts
        xor     eax, eax
        add     rsp, 8
        ret
_ZL1b:
        .string "bla"
----

On the other hand, clang 15.0 with -O1 and higher produces

----
main:                                   # @main
        push    rbx
        lea     rbx, [rip + .L.str]
        mov     rdi, rbx
        call    puts@PLT
        mov     rdi, rbx
        call    puts@PLT
        xor     eax, eax
        pop     rbx
        ret
.L.str:
        .asciz  "bla"
----

For some reason, GCC doesn't merge the read-only data for these two identical
strings.

Godbolt playground: https://godbolt.org/z/WaeazezE6

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: pinskia at gcc dot gnu.org @ 2023-02-01 16:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think what clang did is an invalid transformation.
You can use -fmerge-all-constants to get this same transformation in gcc.
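
A run-time check can show the difference without reading the assembly. The
following is a minimal sketch that reuses `a` and `b` from the report; the
helpers `same_address` and `report` are hypothetical names introduced only for
this illustration. Going by the comment above, one would expect the answer to
differ between a default GCC build and a build with -fmerge-all-constants, but
the result depends on the compiler and flags and is not guaranteed.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

static const char* const a = "bla";
static const char b[] = "bla";

// Hypothetical helper (not from the report): true if both pointers
// hold the same address.
bool same_address(const char* p, const char* q) {
    return p == q;
}

// Prints what this particular build did with the two strings.
void report() {
    // The contents always compare equal; only the storage may differ.
    assert(std::strcmp(a, b) == 0);
    std::printf("a and b share storage: %s\n",
                same_address(a, b) ? "yes" : "no");
}
```

Calling `report()` from `main` and building the file once with and once without
-fmerge-all-constants is one way to compare the two behaviors.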

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: marat at slonopotamus dot org @ 2023-02-01 18:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

--- Comment #2 from Marat Radchenko <marat at slonopotamus dot org> ---
Thanks for the pointer to -fmerge-all-constants, that helped me find out why
such a transformation is invalid: https://stackoverflow.com/a/70328102

Should I close this issue as INVALID (and probably open one against clang)?

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: jakub at gcc dot gnu.org @ 2023-02-01 18:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Strictly speaking, in this case I think it isn't invalid: puts is a standard
function and an implementation could treat it as a builtin with known behavior
(that it doesn't care about pointer equality of what it prints).
But clang++ does that even for
static const char* const a = "bla";
static const char b[] = "bla";
extern void foo (const char *);

int main() {
  foo(a);
  foo(b);
}
Another TU could have
#include <stdlib.h>

void
foo (const char *p)
{
  static const char *q;
  if (!q)
    q = p;
  else if (q == p)
    abort ();
}

This will also fail with GCC under -fmerge-all-constants, but in that case we
document that the option may cause non-conforming behavior.
C++ says https://eel.is/c++draft/lex.string#9 and my reading of it is that
different string literals can be overlapping or partially overlapping etc.,
but I am not convinced it is ok to make those overlap other const objects.
C17 says:
"String literals, and compound literals with const-qualified types, need not
designate distinct objects."
and, in a footnote,
"This allows implementations to share storage for string literals and constant
compound literals with the same or overlapping representations."
but that still doesn't look like it allows them to overlap objects that are
neither string literals nor constant compound literals.
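
The distinction drawn above can be sketched as follows. This is only an
illustration of the reading given in this comment, not a statement of what any
compiler guarantees; `b1`, `b2`, and both function names are made up for the
example.

```cpp
#include <cassert>

static const char b1[] = "bla";
static const char b2[] = "bla";

// Two named array objects. Under the reading above they are distinct
// objects, so a conforming build is expected to give them different
// addresses (an option such as -fmerge-all-constants may break this).
bool arrays_distinct() {
    return &b1[0] != &b2[0];
}

// Two string literals with the same contents. Whether they share
// storage is unspecified, so either result is acceptable here.
bool literals_shared() {
    const char* p = "bla";
    const char* q = "bla";
    return p == q;
}
```

Only the first function has an answer a program may rely on; the second may
return either value depending on the compiler and options.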

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: de34 at live dot cn @ 2023-02-02  3:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

Jiang An <de34 at live dot cn> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |de34 at live dot cn

--- Comment #4 from Jiang An <de34 at live dot cn> ---
This case is tricky...

`a` points to the string literal object, which is usable in constant expressions
([expr.const]/4.6), while the object `b` is not usable in constant expressions.
I'm afraid that there would be new issues involving constant evaluation if `b`
and the string literal object are merged.

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: marat at slonopotamus dot org @ 2023-02-02 10:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

--- Comment #5 from Marat Radchenko <marat at slonopotamus dot org> ---
So, does "String literals, and compound literals with const-qualified types,
need not designate distinct objects." apply here or not? If not, what does a
case where it applies look like?

https://en.cppreference.com/w/c/language/compound_literal explicitly says
(though I do understand this is not a 100% reliable source):

Compound literals of const-qualified character or wide character array types
may share storage with string literals.

----
(const char []){"abc"} == "abc" // might be 1 or 0, implementation-defined
----

Comparison with string literals is actually unspecified behavior rather than
implementation-defined, but the point here is that they consider it possible
for the memory to be shared.

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: de34 at live dot cn @ 2023-02-02 11:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

--- Comment #6 from Jiang An <de34 at live dot cn> ---
(In reply to Marat Radchenko from comment #5)
> So, does "String literals, and compound literals with const-qualified types,
> need not designate distinct objects." apply here or not? If not, how does
> the case where it applies look like?

I strongly believe that when compiling as C, the objects in your example are
allowed to be merged. But I'm not sure whether this is applicable to C++.

In C++ one may write:

constexpr char x = a[0]; // OK.
if constexpr (a == b) { // If the objects are merged, the following code is in effect.
    constexpr char y = b[0]; // Error! `b` is not usable in constant expressions.
}

I'm afraid that it would be unreasonable for a compiler to handle such a
confusing case.

> https://en.cppreference.com/w/c/language/compound_literal explicitly says
> (though I do understand this is not a 100% reliable source):
> 
> Compound literals of const-qualified character or wide character array types
> may share storage with string literals.
> 
> ----
> (const char []){"abc"} == "abc" // might be 1 or 0, implementation-defined
> ----
> 
> Comparison with string literals isn't actually implementation-defined but
> instead unspecified behavior, but the point here is that they think it is
> possible memory is shared.


I've corrected the comment on cppreference. Now it's saying "unspecified".

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: marat at slonopotamus dot org @ 2023-02-02 13:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

--- Comment #7 from Marat Radchenko <marat at slonopotamus dot org> ---
While playing with it more, I found that clang behaves in a very strange way.
While they do combine `const char* const` + `const char[]`, they DO NOT combine
two `const char[]` together. I don't have any explanation for that. Reported as
https://github.com/llvm/llvm-project/issues/60476.

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: marat at slonopotamus dot org @ 2023-02-02 14:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

--- Comment #8 from Marat Radchenko <marat at slonopotamus dot org> ---
Also, a quote from the C17 standard:

Like string literals, const-qualified compound literals can be placed into
read-only memory and *can even be shared*. (6.5.2.5 p 13).

* [Bug c++/108626] GCC doesn't deduplicate string literals for const char*const and const char[]
From: de34 at live dot cn @ 2023-07-24  2:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108626

--- Comment #9 from Jiang An <de34 at live dot cn> ---
See also CWG2753.
https://cplusplus.github.io/CWG/issues/2753.html
