public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search
@ 2023-05-27  3:20 richard.yao at alumni dot stonybrook.edu
  2023-05-27  3:27 ` [Bug target/110001] " pinskia at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: richard.yao at alumni dot stonybrook.edu @ 2023-05-27  3:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

            Bug ID: 110001
           Summary: [13 regression] Suboptimal code generation for
                    branchless binary search
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: richard.yao at alumni dot stonybrook.edu
  Target Milestone: ---

GCC 12.3 generated beautiful code for this, with all but the last of the
unrolled loop iterations using only 3 instructions:

https://gcc.godbolt.org/z/eGbEj9YKd

Currently, GCC generates 4 instructions:

https://gcc.godbolt.org/z/Ebczq8jjx

This probably does not make a huge difference given the data hazard, but there
is something awe-inspiring from seeing GCC generate only 3 instructions per
unrolled loop iteration for binary search. It would be nice if future versions
went back to generating three instructions.

This function was inspired by this D code:

https://godbolt.org/z/5En7xajzc

The bsearch1000() function is entirely branchless and has no more than 2
instructions for every cmov, excluding ret. I wrote a more general version in C
that can handle variable array sizes, and to my pleasant surprise, GCC 12.3
generated a similar 3 instruction sequence for all but the last of the unrolled
loop iterations. I was saddened when I saw the output from GCC 13.1 and trunk.

Anyway, all recent versions of GCC that I cared to check generate a branch for
the last unrolled iteration on line 58. That branch is unpredictable, so GCC
would generate better code here if it used predication to avoid the branch. I
had been able to give GCC a hint to avoid a similar branch at the end using
__builtin_expect_with_probability(), but that trick did not work for line 58.

Also, if anyone is interested in where that D code originated, here is the
source:

https://muscar.eu/shar-binary-search-meta.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search
  2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
@ 2023-05-27  3:27 ` pinskia at gcc dot gnu.org
  2023-05-27  3:27 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27  3:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|tree-optimization           |target

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The tree level is the same between 11.x, 12.x, 13.x and the trunk.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search
  2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
  2023-05-27  3:27 ` [Bug target/110001] " pinskia at gcc dot gnu.org
@ 2023-05-27  3:27 ` pinskia at gcc dot gnu.org
  2023-05-27  3:30 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27  3:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-linux-gnu

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note mov are sometimes free and only take up decode space.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search
  2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
  2023-05-27  3:27 ` [Bug target/110001] " pinskia at gcc dot gnu.org
  2023-05-27  3:27 ` pinskia at gcc dot gnu.org
@ 2023-05-27  3:30 ` pinskia at gcc dot gnu.org
  2023-05-27  3:31 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27  3:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 55176
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55176&action=edit
testcase

Next time please also attach the source (if it uses headers the preprocessed
source).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/110001] [13 regression] Suboptimal code generation for branchless binary search
  2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
                   ` (2 preceding siblings ...)
  2023-05-27  3:30 ` pinskia at gcc dot gnu.org
@ 2023-05-27  3:31 ` pinskia at gcc dot gnu.org
  2023-05-30  7:42 ` [Bug target/110001] [13/14 " rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27  3:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
It is looking like a register allocation issue or something changed in
expanding to rtl. maybe just it was ok on accident before GCC 13.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/110001] [13/14 regression] Suboptimal code generation for branchless binary search
  2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
                   ` (3 preceding siblings ...)
  2023-05-27  3:31 ` pinskia at gcc dot gnu.org
@ 2023-05-30  7:42 ` rguenth at gcc dot gnu.org
  2023-07-27  9:26 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-30  7:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.2
            Summary|[13 regression] Suboptimal  |[13/14 regression]
                   |code generation for         |Suboptimal code generation
                   |branchless binary search    |for branchless binary
                   |                            |search
           Keywords|                            |needs-bisection

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/110001] [13/14 regression] Suboptimal code generation for branchless binary search
  2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
                   ` (4 preceding siblings ...)
  2023-05-30  7:42 ` [Bug target/110001] [13/14 " rguenth at gcc dot gnu.org
@ 2023-07-27  9:26 ` rguenth at gcc dot gnu.org
  2024-01-10 17:03 ` jamborm at gcc dot gnu.org
  2024-03-08 15:36 ` law at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-27  9:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|13.2                        |13.3

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 13.2 is being released, retargeting bugs to GCC 13.3.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/110001] [13/14 regression] Suboptimal code generation for branchless binary search
  2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
                   ` (5 preceding siblings ...)
  2023-07-27  9:26 ` rguenth at gcc dot gnu.org
@ 2024-01-10 17:03 ` jamborm at gcc dot gnu.org
  2024-03-08 15:36 ` law at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jamborm at gcc dot gnu.org @ 2024-01-10 17:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
                 CC|                            |amacleod at redhat dot com

--- Comment #6 from Martin Jambor <jamborm at gcc dot gnu.org> ---
Even though I can confirm the observation from comment #1 that the optimized
tree dump does not seem to change in any meaningful way, bisection leads to
commit r12-4871-g502ffb1f389011 (Andrew MacLeod: Switch vrp2 to ranger).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/110001] [13/14 regression] Suboptimal code generation for branchless binary search
  2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
                   ` (6 preceding siblings ...)
  2024-01-10 17:03 ` jamborm at gcc dot gnu.org
@ 2024-03-08 15:36 ` law at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: law at gcc dot gnu.org @ 2024-03-08 15:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110001

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |law at gcc dot gnu.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-03-08 15:36 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-27  3:20 [Bug tree-optimization/110001] New: [13 regression] Suboptimal code generation for branchless binary search richard.yao at alumni dot stonybrook.edu
2023-05-27  3:27 ` [Bug target/110001] " pinskia at gcc dot gnu.org
2023-05-27  3:27 ` pinskia at gcc dot gnu.org
2023-05-27  3:30 ` pinskia at gcc dot gnu.org
2023-05-27  3:31 ` pinskia at gcc dot gnu.org
2023-05-30  7:42 ` [Bug target/110001] [13/14 " rguenth at gcc dot gnu.org
2023-07-27  9:26 ` rguenth at gcc dot gnu.org
2024-01-10 17:03 ` jamborm at gcc dot gnu.org
2024-03-08 15:36 ` law at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).