public inbox for gcc-bugs@sourceware.org
* [Bug c/114088] New: Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslen
From: thiago at kde dot org @ 2024-02-24 15:59 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088

            Bug ID: 114088
           Summary: Please provide __builtin_c16slen and __builtin_c32slen
                    to complement __builtin_wcslen
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thiago at kde dot org
  Target Milestone: ---

Actually, GCC doesn't have __builtin_wcslen, but Clang does. Providing these
extra two builtins would allow implementing __builtin_wcslen too. The names are
not part of the C standard, but follow the current naming construction rules
for it, similar to how "mbrtowc" and "wcslen" parallel.

My specific need is actually to implement char16_t string containers in C++.
I'm particularly interested in QString/QStringView, but this applies to
std::basic_string{_view} too.

For example:

#include <string_view>

std::string_view f1() { return "Hello"; }
std::wstring_view fw() { return L"Hello"; }
std::u16string_view f16() { return u"Hello"; }
std::u32string_view f32() { return U"Hello"; }

With GCC and libstdc++, the first function produces optimal code:
        movl    $5, %eax
        leaq    .LC0(%rip), %rdx
        ret

For the wchar_t case, GCC emits an out-of-line call to wcslen:
        pushq   %rbx
        leaq    .LC2(%rip), %rbx
        movq    %rbx, %rdi
        call    wcslen@PLT
        movq    %rbx, %rdx
        popq    %rbx
        ret

The next two, because of the absence of a C library function, emit a loop:
        xorl    %eax, %eax
        leaq    .LC1(%rip), %rcx
.L4:
        incq    %rax
        cmpw    $0, (%rcx,%rax,2)
        jne     .L4
        movq    %rcx, %rdx
        ret

Clang, meanwhile, emits optimal code for all four, and so did the pre-Clang
Intel compiler. See https://gcc.godbolt.org/z/qvj7qnYbz. MSVC emits optimal
code for the char and wchar_t versions, but loops for the other two.

Clang gives up when the string gets longer, though. See
https://gcc.godbolt.org/z/54j3zr6e6. That indicates that it gave up on guessing
the loop run and would do better if the intrinsic were present.
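
As an aside, one way to sidestep the loop for literals today is to take the
array by reference, so that the length comes from the type rather than from a
scan. This is only a sketch: the helper name u16_literal is hypothetical and is
not part of libstdc++ or Qt, and unlike a c16slen-style scan it counts embedded
nulls as part of the string.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of any library: builds a view over a
// char16_t string literal without scanning for the terminator.
// N includes the trailing null, so the view length is N - 1.
template <std::size_t N>
constexpr std::u16string_view u16_literal(const char16_t (&s)[N]) noexcept {
    return std::u16string_view(s, N - 1);
}

// Same shape as f16() in the report, but routed through the helper.
std::u16string_view f16_noloop() { return u16_literal(u"Hello"); }

// The length is known at compile time, so this holds without any loop.
static_assert(u16_literal(u"Hello").size() == 5,
              "length comes from the array bound");
```

This only helps when the argument is still an array at the call site; once it
has decayed to a const char16_t pointer, a scan is needed again, which is the
case the requested built-ins would cover.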


* [Bug c/114088] Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslen
From: pinskia at gcc dot gnu.org @ 2024-02-24 17:31 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


* [Bug c/114088] Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslen
From: redi at gcc dot gnu.org @ 2024-02-24 18:28 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
GCC built-ins like __builtin_strlen just wrap a libc function. __builtin_wcslen
would generally just be a call to wcslen, which doesn't give you much. I assume
what you want is to recognize wcslen and replace it with inline assembly code.

Similarly, if libc doesn't provide c16slen then a __builtin_c16slen isn't going
to do much.

I think what you want is better code for finding char16_t(0) or char32_t(0),
not a new built-in.


* [Bug c/114088] Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslen
From: xry111 at gcc dot gnu.org @ 2024-02-25  5:15 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xry111 at gcc dot gnu.org

--- Comment #2 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Jonathan Wakely from comment #1)
> GCC built-ins like __builtin_strlen just wrap a libc function. __builtin_wcslen would generally just be a call to wcslen, which doesn't give you much.

But __builtin_strlen *does* get optimized when the input is a string literal. 
Not sure about wcslen though.


* [Bug c/114088] Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslen
From: thiago at kde dot org @ 2024-02-25  6:22 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088

--- Comment #3 from Thiago Macieira <thiago at kde dot org> ---
> But __builtin_strlen *does* get optimized when the input is a string literal.  Not sure about wcslen though.

It appears not to, in the test above. std::char_traits<wchar_t>::length() calls
wcslen() whereas the char specialisation uses __builtin_strlen() explicitly.
But if the intrinsics are enabled, the two would be the same, wouldn't they?

Anyway, in the absence of a library function to call, inserting the loop is
fine; it's what is there already.

Though it would be nice to be able to provide such a function. I wrote it for
Qt (it's called qustrlen). I would try with __builtin_constant_p first to see
if the string is a literal.
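
A rough sketch of that approach, assuming GCC or Clang (both provide
__builtin_constant_p). The names u16len and u16len_runtime are hypothetical,
and this is not Qt's actual qustrlen; the out-of-line routine here is a plain
loop standing in for whatever optimized implementation a library would supply.

```cpp
#include <cstddef>

// Hypothetical stand-in for an out-of-line library routine (for example a
// vectorized implementation). Here it is just the straightforward loop.
std::size_t u16len_runtime(const char16_t *s) noexcept {
    std::size_t n = 0;
    while (s[n] != u'\0')
        ++n;
    return n;
}

// Hypothetical wrapper: try the "is this a constant?" check first.
inline std::size_t u16len(const char16_t *s) noexcept {
    // GCC/Clang extension. It yields true only when the compiler can prove
    // that *s is a constant, which in practice usually requires inlining and
    // optimization, and a pointer into a string literal.
    if (__builtin_constant_p(*s)) {
        // Inline loop over constant data, which the optimizer has a chance
        // to fold down to a constant.
        std::size_t n = 0;
        while (s[n] != u'\0')
            ++n;
        return n;
    }
    // Otherwise defer to the out-of-line routine.
    return u16len_runtime(s);
}
```

Both branches return the same value, so the check only affects which code the
compiler emits, not the result.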


* [Bug c/114088] Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslen
From: redi at gcc dot gnu.org @ 2024-02-25 13:27 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Xi Ruoyao from comment #2)
> But __builtin_strlen *does* get optimized when the input is a string
> literal.

But so does strlen, because GCC knows about it. That's my point.
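
To illustrate, a small sketch assuming GCC or Clang. The function names
len_builtin and len_libc are made up for the comparison; the point is only
that the compiler treats the plain libc name the same way as the
__builtin_-prefixed one.

```cpp
#include <cstddef>
#include <cstring>

// __builtin_strlen is a GCC/Clang extension; applied to a literal it is
// usable in a constant expression.
static_assert(__builtin_strlen("Hello") == 5, "evaluated at compile time");

// Hypothetical functions for comparison. With optimization enabled, GCC is
// expected to reduce both to a constant, since it recognizes strlen itself.
std::size_t len_builtin() { return __builtin_strlen("Hello"); }
std::size_t len_libc()    { return std::strlen("Hello"); }
```

There is no equivalent libc function for char16_t or char32_t for the compiler
to recognize, which is why those cases fall back to a loop.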
