* [Bug tree-optimization/114814] New: Reduction sum of comparison should be better
@ 2024-04-22 22:15 pinskia at gcc dot gnu.org
From: pinskia at gcc dot gnu.org @ 2024-04-22 22:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114814
Bug ID: 114814
Summary: Reduction sum of comparison should be better
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64
Take the example from PR 114809:
```
#include <cstdint>
#include <cstdlib>
size_t count_chars(const char *src, size_t len, char c) {
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        count += src[i] == c;
    }
    return count;
}
```
For aarch64 we currently produce the following inner loop:
```
.L4:
        ldr     q31, [x3], 16
        cmeq    v31.16b, v31.16b, v22.16b
        and     v31.16b, v23.16b, v31.16b
        zip1    v27.16b, v31.16b, v29.16b
        zip2    v31.16b, v31.16b, v29.16b
        zip1    v25.8h, v27.8h, v29.8h
        zip2    v27.8h, v27.8h, v29.8h
        zip1    v26.8h, v31.8h, v29.8h
        zip2    v31.8h, v31.8h, v29.8h
        zip2    v30.4s, v25.4s, v29.4s
        zip2    v28.4s, v27.4s, v29.4s
        uaddw   v30.2d, v30.2d, v25.2s
        uaddw   v28.2d, v28.2d, v27.2s
        uaddw   v30.2d, v30.2d, v26.2s
        uaddw2  v28.2d, v28.2d, v26.4s
        uaddw   v30.2d, v30.2d, v31.2s
        uaddw2  v31.2d, v28.2d, v31.4s
        add     v31.2d, v30.2d, v31.2d
        add     v24.2d, v24.2d, v31.2d
        cmp     x5, x3
        bne     .L4
```
Instead, we should be able to generate just:
```
.L4:
        ldr     q31, [x3], 16
        cmeq    v31.16b, v31.16b, v22.16b
        and     v31.16b, v23.16b, v31.16b
        addv    b31, v31.16b
        fmov    x0, d31
        add     x1, x1, x0
        cmp     x5, x3
        bne     .L4
```
That is, do the reduction of the sum of the compare inside the loop
rather than outside. Each 16-byte vector contributes at most 16 matches,
so the per-iteration sum fits in a single byte.
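As a rough scalar model of the proposed loop shape (a sketch only, not what GCC emits or how the vectorizer would implement it), the count can be reduced per 16-byte block into a narrow accumulator and then added to the wide total. The name `count_chars_blocked` and the constant `kBlock` are hypothetical and chosen for illustration; the block size of 16 is an assumption matching one 128-bit vector of bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not from the report: scalar model of reducing the
// comparison results inside the loop, one 16-byte block at a time.
std::size_t count_chars_blocked(const char *src, std::size_t len, char c) {
    constexpr std::size_t kBlock = 16;  // assumed: one 128-bit vector of bytes
    std::size_t count = 0;
    std::size_t i = 0;

    // Main loop: each iteration models cmeq + and + addv + scalar add.
    for (; i + kBlock <= len; i += kBlock) {
        std::uint8_t block_sum = 0;  // at most 16, so a byte cannot overflow
        for (std::size_t j = 0; j < kBlock; j++) {
            block_sum = static_cast<std::uint8_t>(block_sum + (src[i + j] == c));
        }
        count += block_sum;  // widen once per block, not once per lane
    }

    // Tail: fewer than kBlock bytes remain, handle them one at a time.
    for (; i < len; i++) {
        count += src[i] == c;
    }
    return count;
}
```

The point of the sketch is that the widening from byte to `size_t` happens once per block on a scalar value, which is what replaces the `zip`/`uaddw` chain in the currently generated code.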