public inbox for gcc-bugs@sourceware.org
* [Bug target/68837] PowerPC switch statement performance
       [not found] <bug-68837-4@http.gcc.gnu.org/bugzilla/>
@ 2020-06-03  9:08 ` guihaoc at gcc dot gnu.org
  2020-06-04 15:43 ` dje at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 4+ messages in thread
From: guihaoc at gcc dot gnu.org @ 2020-06-03  9:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68837

HaoChen Gui <guihaoc at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guihaoc at gcc dot gnu.org

--- Comment #2 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
(In reply to David Edelsohn from comment #0)
> Improve performance of switch statements:
> 
> 1) Heuristics for decision tree vs tablejump
> 
> 2) Avoid sign extended lwa for offset

I think #1 is already implemented in current GCC. Jump tables coexist with
conditional jumps. The parameters jump-table-max-growth-ratio-for-speed and
jump-table-max-growth-ratio-for-size decide how large a jump table may be,
and case-values-threshold defines whether a jump table is beneficial.

An example of the GIMPLE switch case clusters dump:
;; GIMPLE switch case clusters: 35 37 JT(values:5 comparisons:5 range:8
density: 62.50%):65-72 JT(values:6 comparisons:6 range:12 density:
50.00%):111-122
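
[Editorial sketch] The density figures in the dump above are consistent with
values divided by range (5/8 and 6/12). A minimal C sketch of that arithmetic
follows; cluster_density_percent is a hypothetical helper named here for
illustration, not a GCC function, and the formula is inferred from the dump
rather than taken from the GCC sources.

```c
#include <assert.h>

/* Hypothetical helper (not part of GCC): density of a jump-table cluster
   in percent, assumed to be values / range as the dump above suggests.  */
static double
cluster_density_percent (unsigned values, unsigned range)
{
  assert (range > 0);   /* A cluster always spans at least one case value.  */
  return 100.0 * values / range;
}
```

With the clusters from the dump, 65-72 spans 8 values and 111-122 spans 12.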

For #2, the offset can be negative in the multiple jump table case. Right now
it uses lwax and there is no overhead for sign extension, I think. Please
correct me if I am wrong.
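
[Editorial sketch] To illustrate why a negative offset needs a sign-extending
load, the following C sketch models the address computation for a 32-bit
table entry added to a 64-bit base. Both function names are hypothetical and
only model the arithmetic; they are not how GCC or the hardware expresses it.

```c
#include <stdint.h>

/* Models a sign-extending 32-bit load (lwa/lwax) followed by an add.  */
static uint64_t
target_sign_extended (uint64_t base, int32_t offset)
{
  return base + (uint64_t) (int64_t) offset;
}

/* Models a zero-extending 32-bit load (lwz/lwzx) followed by an add.
   Only correct when the stored offset is not negative.  */
static uint64_t
target_zero_extended (uint64_t base, int32_t offset)
{
  return base + (uint64_t) (uint32_t) offset;
}
```

For a non-negative offset both functions agree; for a negative one the
zero-extended variant lands about 4 GiB past the intended target.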


* [Bug target/68837] PowerPC switch statement performance
       [not found] <bug-68837-4@http.gcc.gnu.org/bugzilla/>
  2020-06-03  9:08 ` [Bug target/68837] PowerPC switch statement performance guihaoc at gcc dot gnu.org
@ 2020-06-04 15:43 ` dje at gcc dot gnu.org
  2020-06-04 20:18 ` segher at gcc dot gnu.org
  2020-06-11  5:15 ` guihaoc at gcc dot gnu.org
  3 siblings, 0 replies; 4+ messages in thread
From: dje at gcc dot gnu.org @ 2020-06-04 15:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68837

--- Comment #3 from David Edelsohn <dje at gcc dot gnu.org> ---
The PR was from 2015, before Martin's improvements.

Also, sign-extend load instructions were less efficient at the time.  We need
to re-examine the sequence on more recent microarchitectures.


* [Bug target/68837] PowerPC switch statement performance
       [not found] <bug-68837-4@http.gcc.gnu.org/bugzilla/>
  2020-06-03  9:08 ` [Bug target/68837] PowerPC switch statement performance guihaoc at gcc dot gnu.org
  2020-06-04 15:43 ` dje at gcc dot gnu.org
@ 2020-06-04 20:18 ` segher at gcc dot gnu.org
  2020-06-11  5:15 ` guihaoc at gcc dot gnu.org
  3 siblings, 0 replies; 4+ messages in thread
From: segher at gcc dot gnu.org @ 2020-06-04 20:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68837

--- Comment #4 from Segher Boessenkool <segher at gcc dot gnu.org> ---
On Power9 the lwa insn is cracked into an lwz and an extsw, just like
on older CPUs.  Cracked instructions have fewer constraints on p9 than
they did on most older CPUs though (they don't have to be first in a
fetch group any more).

It still saves one insn to execute if lwz would be enough.  Another
option may be to just do an ld: this reduces the number of insns
needed, but it takes more space for the table (so it won't be cached /
prefetched as well), and absolute addresses require relocations.
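
[Editorial sketch] The space cost mentioned above can be put in numbers,
assuming 4-byte offset entries (loaded with lwz or lwa) and 8-byte absolute
address entries (loaded with ld) on a 64-bit target. The helper names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of a jump table holding 32-bit relative offsets.  */
static size_t
offset_table_bytes (size_t entries)
{
  return entries * sizeof (int32_t);
}

/* Size of a jump table holding 64-bit absolute addresses.  */
static size_t
address_table_bytes (size_t entries)
{
  return entries * sizeof (uint64_t);
}
```

Under these assumptions a table of absolute addresses is twice the size of
the offset table for the same number of entries.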


* [Bug target/68837] PowerPC switch statement performance
       [not found] <bug-68837-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2020-06-04 20:18 ` segher at gcc dot gnu.org
@ 2020-06-11  5:15 ` guihaoc at gcc dot gnu.org
  3 siblings, 0 replies; 4+ messages in thread
From: guihaoc at gcc dot gnu.org @ 2020-06-11  5:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68837

--- Comment #5 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
I think there are two ways to avoid sign extension when loading the offset.

a. Make sure all offsets are positive.  Backward jumps exist, and STC will
reorder the basic blocks, so an offset might be negative. We can check the
offsets after the bb-reorder pass and remove the sign extension if all
offsets are positive. It may need a new pass.

b. Use absolute addresses instead of offsets. This is controlled by the macro
CASE_VECTOR_PC_RELATIVE and flag_pic. x86 uses absolute addresses (labels) in
its jump tables, though the jump table is bigger. Power does not support
absolute addresses yet.
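
[Editorial sketch] The check described in option (a) can be modelled in plain
C as a scan over the table entries. This is only an illustration of the
condition, not GCC pass code: all_offsets_nonnegative is a hypothetical name,
and treating a zero offset as acceptable is an assumption (a zero-extending
load handles it correctly).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when every table entry could be loaded with a
   zero-extending load, i.e. no entry is negative.  */
static bool
all_offsets_nonnegative (const int32_t *offsets, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (offsets[i] < 0)
      return false;
  return true;
}
```

A single backward jump target in the table is enough to make the check fail,
in which case the sign-extending load has to stay.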

