public inbox for gcc-bugs@sourceware.org
* [Bug target/81594] Optimize PowerPC vector set and store
       [not found] <bug-81594-4@http.gcc.gnu.org/bugzilla/>
@ 2020-03-18 16:44 ` meissner at gcc dot gnu.org
  2020-03-25  8:59 ` segher at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 4+ messages in thread
From: meissner at gcc dot gnu.org @ 2020-03-18 16:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81594

Michael Meissner <meissner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41854|0                           |1
        is obsolete|                            |

--- Comment #4 from Michael Meissner <meissner at gcc dot gnu.org> ---
Created attachment 48057
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48057&action=edit
Update proposed patch to fix the problem


* [Bug target/81594] Optimize PowerPC vector set and store
From: segher at gcc dot gnu.org @ 2020-03-25  8:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81594

--- Comment #5 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Hi Mike,

Please explain (in the code!) why we need a peephole here, why we cannot
generate the faster code earlier?  Or why we choose not to?  Etc.


* [Bug target/81594] Optimize PowerPC vector set and store
From: meissner at gcc dot gnu.org @ 2020-03-25 15:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81594

--- Comment #6 from Michael Meissner <meissner at gcc dot gnu.org> ---
If you look at the original patch, it did try to do this optimization.  When I
looked at it some time later, the combiner no longer generated the sequence
because it thought it was slower (due to length, etc.).

You could spend a lot of time tuning the code so eventually the combiner will
generate it again, but it was simpler to just put the peephole in to catch the
cases that show up.  If you want to take on the bug and do it earlier, go
ahead.

A peephole2 might not catch all uses, but it prevents whack-a-mole, where a
change causes other code generation changes down the pike.

Note, the original patch was written in the power8 time frame, and it would
need to be adjusted for power9 and future systems now (i.e. the patch only does
the splitting if the value is in an FPR or GPR, while on power9 it could be in
a traditional Altivec register).

However, the splitter uses reload_completed that you always seem to object to. 
It could be done before register allocation, but then you would need to make
sure that no other pass recombines the two separate items back into a vector
once again.
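
The thread does not quote the test case, so the following is only a sketch of
the kind of source the optimization appears to target, assuming it concerns a
two-element vector built from scalars and stored immediately.  The type name
v2df and both function names are made up for illustration; the vector type
uses the GCC/Clang generic vector extension.

```c
/* GCC/Clang generic vector extension: two doubles in 16 bytes.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical test case: build a vector from two scalars and store it
   right away.  Done naively, the compiler first packs a and b into one
   vector register and then issues a single vector store.  */
void
store_pair_as_vector (v2df *p, double a, double b)
{
  *p = (v2df) { a, b };
}

/* What the split form amounts to at the source level: two independent
   scalar stores, with no pack/permute step in between.  */
void
store_pair_as_scalars (double *p, double a, double b)
{
  p[0] = a;
  p[1] = b;
}
```

Both functions leave the same 16 bytes in memory; the difference is only in
which instructions the compiler needs to get them there.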


* [Bug target/81594] Optimize PowerPC vector set and store
From: segher at gcc dot gnu.org @ 2020-03-25 18:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81594

--- Comment #7 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Peepholes catch fewer cases, and it is very hard to write correct peepholes.
The only reason to use peepholes is when the other passes leave some important
optimisation on the table, and you cannot feasibly fix that problem.

They are not a substitute for proper optimisation.  For example, almost all
"interesting" optimisations happen before the peephole pass, so you cannot
rely on combine or cse or *prop etc. to do anything.  So, if you want some
simple optimisations with it, you need to write them manually (an exponential
amount of work).  This even applies to "trivial" things like constant
arguments.

Peepholes are nice for mopping up those cases that for one reason or the other
the other compiler passes cannot / do not get right.  They can "tune" the
compiler output to be just a teeny bit better.  They cannot do anything more
than that.
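
To make the "catch fewer cases" point concrete, here is a toy peephole over an
invented instruction set.  It is not GCC's peephole2 implementation; the
opcodes, struct insn, and peephole_pack_store are all hypothetical.  It
rewrites a pack followed directly by a vector store into two scalar stores,
and matches only when the two instructions are adjacent.

```c
#include <stddef.h>

/* Toy instruction set, invented for illustration only.  */
enum op { OP_PACK, OP_STORE_VEC, OP_STORE_SCALAR, OP_OTHER };

struct insn
{
  enum op op;
  int dst;      /* register written (OP_PACK), or -1 */
  int src[2];   /* registers read; src[1] is -1 when unused */
  int offset;   /* byte offset for stores */
};

/* Scan a two-instruction window.  Where a PACK is immediately followed by
   a vector store of the packed register, rewrite the pair in place as two
   scalar stores.  Returns the number of rewrites.
   A real peephole must also prove the packed register is dead after the
   store; this toy skips that check.  */
size_t
peephole_pack_store (struct insn *code, size_t n)
{
  size_t hits = 0;
  for (size_t i = 0; i + 1 < n; i++)
    {
      struct insn *pack = &code[i], *store = &code[i + 1];
      if (pack->op != OP_PACK || store->op != OP_STORE_VEC
          || store->src[0] != pack->dst)
        continue;

      /* Copy the operands out before overwriting either instruction.  */
      int a = pack->src[0], b = pack->src[1], off = store->offset;
      *pack = (struct insn) { OP_STORE_SCALAR, -1, { a, -1 }, off };
      *store = (struct insn) { OP_STORE_SCALAR, -1, { b, -1 }, off + 8 };
      hits++;
      i++;  /* skip the second instruction of the rewritten pair */
    }
  return hits;
}
```

With a single unrelated instruction between the pack and the store, the
window no longer matches and nothing is rewritten.  Covering that case, or a
constant operand, or a different register class, means writing another
pattern by hand each time.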
