public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/103798] New: Missed optimization: char_traits<char>::find (and thus string_view::find_first_of) is slow when invoked with short strings
@ 2021-12-22 7:03 zhaoweiliew at gmail dot com
2021-12-22 7:24 ` [Bug tree-optimization/103798] memchr with a (small) constant string should be expanded inline pinskia at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: zhaoweiliew at gmail dot com @ 2021-12-22 7:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103798
Bug ID: 103798
Summary: Missed optimization: char_traits<char>::find (and thus
string_view::find_first_of) is slow when invoked with
short strings
Product: gcc
Version: 11.2.1
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: zhaoweiliew at gmail dot com
Target Milestone: ---
Given the following code:
#include <string_view>
size_t findFirstE_slow(std::string_view sv) {
return sv.find_first_of("eE");
}
size_t findFirstE_fast(std::string_view sv) {
auto it{sv.begin()};
for (; it != sv.end() && *it != 'e' && *it != 'E'; ++it)
;
return it == sv.end() ? std::string_view::npos : size_t(it - sv.begin());
}
x86-64 gcc (trunk) -std=c++20 -O3 generates the following assembly:
.LC0:
.string "eE"
findFirstE_slow(std::basic_string_view<char, std::char_traits<char> >):
push r12
push rbp
push rbx
test rdi, rdi
je .L4
mov rbx, rdi
mov rbp, rsi
xor r12d, r12d
jmp .L3
.L8:
add r12, 1
cmp rbx, r12
je .L4
.L3:
movsx esi, BYTE PTR [rbp+0+r12]
mov edx, 2
mov edi, OFFSET FLAT:.LC0
call memchr
test rax, rax
je .L8
mov rax, r12
pop rbx
pop rbp
pop r12
ret
.L4:
mov r12, -1
pop rbx
pop rbp
mov rax, r12
pop r12
ret
findFirstE_fast(std::basic_string_view<char, std::char_traits<char> >):
add rdi, rsi
cmp rdi, rsi
je .L13
mov rax, rsi
jmp .L12
.L15:
add rax, 1
cmp rdi, rax
je .L13
.L12:
movzx edx, BYTE PTR [rax]
and edx, -33
cmp dl, 69
jne .L15
sub rax, rsi
ret
.L13:
mov rax, -1
ret
Even though both findFirstE_slow() and findFirstE_fast() are logically
equivalent, findFirstE_slow() calls memchr() for every character in the input
string.
findFirstE_fast() does what we would reasonably expect, comparing every
character in the input string with 'e' and 'E'.
In practice, when the set of characters to be found is small, findFirstE_slow()
runs noticeably slower than findFirstE_fast(). This seems to be a missed
optimization in char_traits<char>::find(), which affects several find-related
methods in string_view.
Relevant StackOverflow question with quick-bench output, Compiler Explorer
output, and more discussion:
https://stackoverflow.com/questions/70433152/missed-optimization-with-string-viewfind-first-of
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/103798] memchr with a (small) constant string should be expanded inline.
2021-12-22 7:03 [Bug libstdc++/103798] New: Missed optimization: char_traits<char>::find (and thus string_view::find_first_of) is slow when invoked with short strings zhaoweiliew at gmail dot com
@ 2021-12-22 7:24 ` pinskia at gcc dot gnu.org
2022-06-16 17:37 ` hjl.tools at gmail dot com
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-12-22 7:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103798
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|libstdc++ |tree-optimization
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Severity|normal |enhancement
Summary|Missed optimization: |memchr with a (small)
|char_traits<char>::find |constant string should be
|(and thus |expanded inline.
|string_view::find_first_of) |
|is slow when invoked with |
|short strings |
Last reconfirmed| |2021-12-22
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Really the issue is:
_8 = __builtin_memchr ("eE", _7, 2);
if (_8 != 0B)
Should just be expanded to _7 == 'e' || _7 == 'E' .
That is:
int f(char a)
{
return __builtin_memchr ("e", a, 1) != 0;
}
int f1(char a)
{
return a == 'e';
}
int g(char a)
{
return __builtin_memchr ("eE", a, 2) != 0;
}
int g1(char a)
{
return a == 'e' || a == 'E';
}
f and f1 should produce the same assembly
and g and g1 should produce the same (similar) assembly.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/103798] memchr with a (small) constant string should be expanded inline.
2021-12-22 7:03 [Bug libstdc++/103798] New: Missed optimization: char_traits<char>::find (and thus string_view::find_first_of) is slow when invoked with short strings zhaoweiliew at gmail dot com
2021-12-22 7:24 ` [Bug tree-optimization/103798] memchr with a (small) constant string should be expanded inline pinskia at gcc dot gnu.org
@ 2022-06-16 17:37 ` hjl.tools at gmail dot com
2022-07-14 21:10 ` cvs-commit at gcc dot gnu.org
2022-08-28 7:42 ` jbglaw@lug-owl.de
3 siblings, 0 replies; 5+ messages in thread
From: hjl.tools at gmail dot com @ 2022-06-16 17:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103798
--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 53154
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53154&action=edit
A patch
Like this?
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/103798] memchr with a (small) constant string should be expanded inline.
2021-12-22 7:03 [Bug libstdc++/103798] New: Missed optimization: char_traits<char>::find (and thus string_view::find_first_of) is slow when invoked with short strings zhaoweiliew at gmail dot com
2021-12-22 7:24 ` [Bug tree-optimization/103798] memchr with a (small) constant string should be expanded inline pinskia at gcc dot gnu.org
2022-06-16 17:37 ` hjl.tools at gmail dot com
@ 2022-07-14 21:10 ` cvs-commit at gcc dot gnu.org
2022-08-28 7:42 ` jbglaw@lug-owl.de
3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-07-14 21:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103798
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:c6cf555a88f8ffb8ec85c2b94fe656ab217260ea
commit r13-1699-gc6cf555a88f8ffb8ec85c2b94fe656ab217260ea
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Jun 17 07:33:06 2022 -0700
Simplify memchr with small constant strings
When memchr is applied on a constant string of no more than the bytes of
a word, simplify memchr by checking each byte in the constant string.
int f (int a)
{
return __builtin_memchr ("AE", a, 2) != 0;
}
is simplified to
int f (int a)
{
return ((char) a == 'A' || (char) a == 'E') != 0;
}
gcc/
PR tree-optimization/103798
* tree-ssa-forwprop.cc: Include "tree-ssa-strlen.h".
(simplify_builtin_call): Inline memchr with constant strings of
no more than the bytes of a word.
* tree-ssa-strlen.cc (use_in_zero_equality): Make it global.
* tree-ssa-strlen.h (use_in_zero_equality): New.
gcc/testsuite/
PR tree-optimization/103798
* c-c++-common/pr103798-1.c: New test.
* c-c++-common/pr103798-2.c: Likewise.
* c-c++-common/pr103798-3.c: Likewise.
* c-c++-common/pr103798-4.c: Likewise.
* c-c++-common/pr103798-5.c: Likewise.
* c-c++-common/pr103798-6.c: Likewise.
* c-c++-common/pr103798-7.c: Likewise.
* c-c++-common/pr103798-8.c: Likewise.
* c-c++-common/pr103798-9.c: Likewise.
* c-c++-common/pr103798-10.c: Likewise.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/103798] memchr with a (small) constant string should be expanded inline.
2021-12-22 7:03 [Bug libstdc++/103798] New: Missed optimization: char_traits<char>::find (and thus string_view::find_first_of) is slow when invoked with short strings zhaoweiliew at gmail dot com
` (2 preceding siblings ...)
2022-07-14 21:10 ` cvs-commit at gcc dot gnu.org
@ 2022-08-28 7:42 ` jbglaw@lug-owl.de
3 siblings, 0 replies; 5+ messages in thread
From: jbglaw@lug-owl.de @ 2022-08-28 7:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103798
Jan-Benedict Glaw <jbglaw@lug-owl.de> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jbglaw@lug-owl.de
--- Comment #4 from Jan-Benedict Glaw <jbglaw@lug-owl.de> ---
I just started looking through the automated builds logs and noticed that three
configurations (`.../configure --enable-werror-always --enable-languages=all
--disable-gcov --disable-shared --disable-threads --target=rl78-elf
--without-headers`, and with --target=avr-elf as well as with
--target=gcc-pru-elf) break during `make V=1 all-gcc`:
[...]
/usr/lib/gcc-snapshot/bin/g++ -fno-PIE -c -g -O2 -DIN_GCC
-DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -W
cast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common
-DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/.
-I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include
-I../../gcc/gcc/../libcody -I../../gcc/gcc/../libdecnumber
-I../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber
-I../../gcc/gcc/../libbacktrace -o tree-ssa-forwprop.o -MT
tree-ssa-forwprop.o -MMD -MP -MF ./.deps/tree-ssa-forwprop.TPo
../../gcc/gcc/tree-ssa-forwprop.cc
../../gcc/gcc/tree-ssa-forwprop.cc: In function 'bool
simplify_builtin_call(gimple_stmt_iterator*, tree)':
../../gcc/gcc/tree-ssa-forwprop.cc:1258:42: error: array subscript 1 is outside
array bounds of 'tree_node* [1]' [-Werror=array-bounds]
1258 | op[i - 1] = fold_convert_loc (loc, boolean_type_node,
| ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
1259 | fold_build2_loc (loc,
| ~~~~~~~~~~~~~~~~~~~~~
1260 |
BIT_IOR_EXPR,
|
~~~~~~~~~~~~~
1261 |
boolean_type_node,
|
~~~~~~~~~~~~~~~~~~
1262 | op[i - 1],
| ~~~~~~~~~~
1263 | op[i]));
| ~~~~~~~
In file included from ../../gcc/gcc/system.h:707,
from ../../gcc/gcc/tree-ssa-forwprop.cc:21:
../../gcc/gcc/../include/libiberty.h:733:36: note: at offset 8 into object of
size [0, 8] allocated by '__builtin_alloca'
733 | # define alloca(x) __builtin_alloca(x)
| ~~~~~~~~~~~~~~~~^~~
../../gcc/gcc/../include/libiberty.h:365:40: note: in expansion of macro
'alloca'
365 | #define XALLOCAVEC(T, N) ((T *) alloca (sizeof (T) * (N)))
| ^~~~~~
../../gcc/gcc/tree-ssa-forwprop.cc:1250:22: note: in expansion of macro
'XALLOCAVEC'
1250 | tree *op = XALLOCAVEC (tree, isize);
| ^~~~~~~~~~
cc1plus: all warnings being treated as errors
make[1]: *** [Makefile:1146: tree-ssa-forwprop.o] Error 1
This also happens for --target=avr-elf and --target=gcc-pru-elf .
Thank,
Jan-Benedict
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-08-28 7:42 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-22 7:03 [Bug libstdc++/103798] New: Missed optimization: char_traits<char>::find (and thus string_view::find_first_of) is slow when invoked with short strings zhaoweiliew at gmail dot com
2021-12-22 7:24 ` [Bug tree-optimization/103798] memchr with a (small) constant string should be expanded inline pinskia at gcc dot gnu.org
2022-06-16 17:37 ` hjl.tools at gmail dot com
2022-07-14 21:10 ` cvs-commit at gcc dot gnu.org
2022-08-28 7:42 ` jbglaw@lug-owl.de
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).