public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search
@ 2023-05-27 3:20 richard.yao at alumni dot stonybrook.edu
2023-05-27 3:27 ` [Bug target/110001] " pinskia at gcc dot gnu.org
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: richard.yao at alumni dot stonybrook.edu @ 2023-05-27 3:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
Bug ID: 110001
Summary: [13 regression] Suboptimal code generation for
branchless binary search
Product: gcc
Version: 13.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: richard.yao at alumni dot stonybrook.edu
Target Milestone: ---
GCC 12.3 generated beautiful code for this, with all but the last of the
unrolled loop iterations using only 3 instructions:
https://gcc.godbolt.org/z/eGbEj9YKd
Currently, GCC generates 4 instructions:
https://gcc.godbolt.org/z/Ebczq8jjx
This probably does not make a huge difference given the data hazard, but there
is something awe-inspiring from seeing GCC generate only 3 instructions per
unrolled loop iteration for binary search. It would be nice if future versions
went back to generating three instructions.
This function was inspired by this D code:
https://godbolt.org/z/5En7xajzc
The bsearch1000() function is entirely branchless and has no more than 2
instructions for every cmov, excluding ret. I wrote a more general version in C
that can handle variable array sizes, and to my pleasant surprise, GCC 12.3
generated a similar 3 instruction sequence for all but the last of the unrolled
loop iterations. I was saddened when I saw the output from GCC 13.1 and trunk.
Anyway, all recent versions of GCC that I cared to check generate a branch for
the last unrolled iteration on line 58. That branch is unpredictable, so GCC
would generate better code here if it used predication to avoid the branch. I
had been able to give GCC a hint to avoid a similar branch at the end using
__builtin_expect_with_probability(), but that trick did not work for line 58.
Also, if anyone is interested in where that D code originated, here is the
source:
https://muscar.eu/shar-binary-search-meta.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
@ 2023-05-27 3:27 ` pinskia at gcc dot gnu.org
2023-05-27 3:27 ` pinskia at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27 3:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
Component|tree-optimization |target
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The tree level is the same between 11.x, 12.x, 13.x and the trunk.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
2023-05-27 3:27 ` [Bug target/110001] " pinskia at gcc dot gnu.org
@ 2023-05-27 3:27 ` pinskia at gcc dot gnu.org
2023-05-27 3:30 ` pinskia at gcc dot gnu.org
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27 3:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |x86_64-linux-gnu
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note mov are sometimes free and only take up decode space.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
2023-05-27 3:27 ` [Bug target/110001] " pinskia at gcc dot gnu.org
2023-05-27 3:27 ` pinskia at gcc dot gnu.org
@ 2023-05-27 3:30 ` pinskia at gcc dot gnu.org
2023-05-27 3:31 ` pinskia at gcc dot gnu.org
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27 3:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 55176
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55176&action=edit
testcase
Next time please also attach the source (if it uses headers the preprocessed
source).
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
` (2 preceding siblings ...)
2023-05-27 3:30 ` pinskia at gcc dot gnu.org
@ 2023-05-27 3:31 ` pinskia at gcc dot gnu.org
2023-05-30 7:42 ` [Bug target/110001] [13/14 " rguenth at gcc dot gnu.org
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27 3:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
It is looking like a register allocation issue or something changed in
expanding to rtl. maybe just it was ok on accident before GCC 13.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13/14 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
` (3 preceding siblings ...)
2023-05-27 3:31 ` pinskia at gcc dot gnu.org
@ 2023-05-30 7:42 ` rguenth at gcc dot gnu.org
2023-07-27 9:26 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-30 7:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |13.2
Summary|[13 regression] Suboptimal |[13/14 regression]
|code generation for |Suboptimal code generation
|branchless binary search |for branchless binary
| |search
Keywords| |needs-bisection
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13/14 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
` (4 preceding siblings ...)
2023-05-30 7:42 ` [Bug target/110001] [13/14 " rguenth at gcc dot gnu.org
@ 2023-07-27 9:26 ` rguenth at gcc dot gnu.org
2024-01-10 17:03 ` jamborm at gcc dot gnu.org
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-27 9:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|13.2 |13.3
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 13.2 is being released, retargeting bugs to GCC 13.3.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13/14 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
` (5 preceding siblings ...)
2023-07-27 9:26 ` rguenth at gcc dot gnu.org
@ 2024-01-10 17:03 ` jamborm at gcc dot gnu.org
2024-03-08 15:36 ` law at gcc dot gnu.org
2024-05-21 9:15 ` [Bug target/110001] [13/14/15 " jakub at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: jamborm at gcc dot gnu.org @ 2024-01-10 17:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
Martin Jambor <jamborm at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords|needs-bisection |
CC| |amacleod at redhat dot com
--- Comment #6 from Martin Jambor <jamborm at gcc dot gnu.org> ---
Even though I can confirm the observation from comment #1 that the optimized
tree dump does not seem to change in any meaningful way, bisection leads to
commit r12-4871-g502ffb1f389011 (Andrew MacLeod: Switch vrp2 to ranger).
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13/14 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
` (6 preceding siblings ...)
2024-01-10 17:03 ` jamborm at gcc dot gnu.org
@ 2024-03-08 15:36 ` law at gcc dot gnu.org
2024-05-21 9:15 ` [Bug target/110001] [13/14/15 " jakub at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-08 15:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
Jeffrey A. Law <law at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
CC| |law at gcc dot gnu.org
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/110001] [13/14/15 regression] Suboptimal code generation for branchless binary search
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
` (7 preceding siblings ...)
2024-03-08 15:36 ` law at gcc dot gnu.org
@ 2024-05-21 9:15 ` jakub at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-05-21 9:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|13.3 |13.4
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 13.3 is being released, retargeting bugs to GCC 13.4.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-05-21 9:15 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-27 3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
2023-05-27 3:27 ` [Bug target/110001] " pinskia at gcc dot gnu.org
2023-05-27 3:27 ` pinskia at gcc dot gnu.org
2023-05-27 3:30 ` pinskia at gcc dot gnu.org
2023-05-27 3:31 ` pinskia at gcc dot gnu.org
2023-05-30 7:42 ` [Bug target/110001] [13/14 " rguenth at gcc dot gnu.org
2023-07-27 9:26 ` rguenth at gcc dot gnu.org
2024-01-10 17:03 ` jamborm at gcc dot gnu.org
2024-03-08 15:36 ` law at gcc dot gnu.org
2024-05-21 9:15 ` [Bug target/110001] [13/14/15 " jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).