public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/95674] New: Unnecessary move when doing division-by-multiplication
@ 2020-06-14 18:21 josephcsible at gmail dot com
  2020-06-15  8:50 ` [Bug target/95674] " jakub at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: josephcsible at gmail dot com @ 2020-06-14 18:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95674

            Bug ID: 95674
           Summary: Unnecessary move when doing division-by-multiplication
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: josephcsible at gmail dot com
  Target Milestone: ---

When GCC does division-by-multiplication, it seems to forget that
multiplication is commutative. It moves the numerator into %rax, but if it
moved the magic number there instead, then it wouldn't have to move the
numerator at all.

Input:

unsigned long f(unsigned long x) {
        return x / 7;
}

Actual assembly:

f:
        movabsq $2635249153387078803, %rdx
        movq    %rdi, %rax
        mulq    %rdx
        subq    %rdx, %rdi
        shrq    %rdi
        leaq    (%rdx,%rdi), %rax
        shrq    $2, %rax
        ret

Desired assembly:

f:
        movabsq $2635249153387078803, %rax
        mulq    %rdi
        subq    %rdx, %rdi
        shrq    %rdi
        leaq    (%rdx,%rdi), %rax
        shrq    $2, %rax
        ret

https://godbolt.org/z/RE45qg

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/95674] Unnecessary move when doing division-by-multiplication
  2020-06-14 18:21 [Bug middle-end/95674] New: Unnecessary move when doing division-by-multiplication josephcsible at gmail dot com
@ 2020-06-15  8:50 ` jakub at gcc dot gnu.org
  2020-06-15 13:51 ` hjl.tools at gmail dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-06-15  8:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95674

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-06-15

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The insn in question is:
(define_insn ("*umuldi3_highpart_1")
     [
        (set (match_operand:DI 0 ("register_operand") ("=d"))
            (truncate:DI (lshiftrt:TI (mult:TI (zero_extend:TI
(match_operand:DI 1 ("nonimmediate_operand") ("%a")))
                        (zero_extend:TI (match_operand:DI 2
("nonimmediate_operand") ("rm"))))
                    (const_int 64 [0x40]))))
        (clobber (match_scratch:DI 3 ("=1")))
        (clobber (reg:CC 17))
    ] ("TARGET_64BIT
and I don't see why the RA couldn't choose the args the other way, perhaps the
match_scratch =1 (i.e. "a" constraint) complicates it, but the constant is
REG_DEAD in that insn.

Vlad, could you please have a look?  Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/95674] Unnecessary move when doing division-by-multiplication
  2020-06-14 18:21 [Bug middle-end/95674] New: Unnecessary move when doing division-by-multiplication josephcsible at gmail dot com
  2020-06-15  8:50 ` [Bug target/95674] " jakub at gcc dot gnu.org
@ 2020-06-15 13:51 ` hjl.tools at gmail dot com
  2020-06-26 22:14 ` vmakarov at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: hjl.tools at gmail dot com @ 2020-06-15 13:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95674

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
PR 95442 is also REG_DEAD related.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/95674] Unnecessary move when doing division-by-multiplication
  2020-06-14 18:21 [Bug middle-end/95674] New: Unnecessary move when doing division-by-multiplication josephcsible at gmail dot com
  2020-06-15  8:50 ` [Bug target/95674] " jakub at gcc dot gnu.org
  2020-06-15 13:51 ` hjl.tools at gmail dot com
@ 2020-06-26 22:14 ` vmakarov at gcc dot gnu.org
  2020-06-27 10:22 ` rsandifo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2020-06-26 22:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95674

--- Comment #3 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
I looked at this problem.

All assignments are done in IRA (LRA does not change them).  We can not make a
better assignment because scratches do not permit to store any preferences from
instruction constraints (pseudo-registers permits to do this). So storing and
getting this info for scratches is the first step to solving the problem.

LRA changes scratches to pseudo-registers to generate the correct code
satisfying the insn constraints and turn them back to scratches when the
corresponding pseudo-registers do not get hard registers.  Moving change of
scratches to pseudo-regs from LRA to IRA could help but it is a big work.

Another solution is to not use scratches in machine-descriptions and use
pseudo-registers instead.  Scratches are bad and should be avoided as much as
possible.  I expressed this several times.  Besides it is not possible to keep
hard register preferences for them, they also can not be taken into account in
conflict graph and this results in overoptimistic assignments which LRA has to
correct.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/95674] Unnecessary move when doing division-by-multiplication
  2020-06-14 18:21 [Bug middle-end/95674] New: Unnecessary move when doing division-by-multiplication josephcsible at gmail dot com
                   ` (2 preceding siblings ...)
  2020-06-26 22:14 ` vmakarov at gcc dot gnu.org
@ 2020-06-27 10:22 ` rsandifo at gcc dot gnu.org
  2020-06-29 14:05 ` vmakarov at gcc dot gnu.org
  2023-04-22 16:55 ` roger at nextmovesoftware dot com
  5 siblings, 0 replies; 7+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-06-27 10:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95674

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #4 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to Vladimir Makarov from comment #3)
> I looked at this problem.
> 
> All assignments are done in IRA (LRA does not change them).  We can not make
> a better assignment because scratches do not permit to store any preferences
> from instruction constraints (pseudo-registers permits to do this). So
> storing and getting this info for scratches is the first step to solving the
> problem.
> 
> LRA changes scratches to pseudo-registers to generate the correct code
> satisfying the insn constraints and turn them back to scratches when the
> corresponding pseudo-registers do not get hard registers.  Moving change of
> scratches to pseudo-regs from LRA to IRA could help but it is a big work.
Yeah, sounds useful!  Could it be done as a prepass before IRA proper?
(Maybe not -- haven't really thought about it much.)

> Another solution is to not use scratches in machine-descriptions and use
> pseudo-registers instead.  Scratches are bad and should be avoided as much
> as possible.  I expressed this several times.  Besides it is not possible to
> keep hard register preferences for them, they also can not be taken into
> account in conflict graph and this results in overoptimistic assignments
> which LRA has to correct.
I can see that scratches make life harder for the register allocators,
at least until the above change is made.  But they're very useful for
pre-RA passes.  They make it possible for any pass to use recog to match
a candidate instruction and have recog automatically add the required
scratches, without the passes having to cope with new registers being
created “spontaneously”.  (Combine is of course the main example,
but others like fwprop benefit too.)

Although it would be possible to use pseudos instead, I'm not sure
the end result would be better than what we have now:
- Each pass would need to cope with recog creating new registers
- We'd need some way of deferring the creation of pseudo registers
  for successful but speculative recogs that might later be rejected
  for cost reasons, or at least have some kind of recycling scheme.
- Even then, we'll have more pseudos before RA, and so most of the
  DF datastructures will be larger and perhaps sparser.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/95674] Unnecessary move when doing division-by-multiplication
  2020-06-14 18:21 [Bug middle-end/95674] New: Unnecessary move when doing division-by-multiplication josephcsible at gmail dot com
                   ` (3 preceding siblings ...)
  2020-06-27 10:22 ` rsandifo at gcc dot gnu.org
@ 2020-06-29 14:05 ` vmakarov at gcc dot gnu.org
  2023-04-22 16:55 ` roger at nextmovesoftware dot com
  5 siblings, 0 replies; 7+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2020-06-29 14:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95674

--- Comment #5 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to rsandifo@gcc.gnu.org from comment #4)
> (In reply to Vladimir Makarov from comment #3)
> > I looked at this problem.
> > 
> > All assignments are done in IRA (LRA does not change them).  We can not make
> > a better assignment because scratches do not permit to store any preferences
> > from instruction constraints (pseudo-registers permits to do this). So
> > storing and getting this info for scratches is the first step to solving the
> > problem.
> > 
> > LRA changes scratches to pseudo-registers to generate the correct code
> > satisfying the insn constraints and turn them back to scratches when the
> > corresponding pseudo-registers do not get hard registers.  Moving change of
> > scratches to pseudo-regs from LRA to IRA could help but it is a big work.
> Yeah, sounds useful!  Could it be done as a prepass before IRA proper?

Currently scratches are removed at the beginning of LRA and some of them (which
did not get hard registers) restored at the LRA end.  So to deal with the
scratches in IRA, they should be removed in IRA and some of them should be
restored at the end of LRA.

> (Maybe not -- haven't really thought about it much.)
> 
> > Another solution is to not use scratches in machine-descriptions and use
> > pseudo-registers instead.
> Although it would be possible to use pseudos instead, I'm not sure
> the end result would be better than what we have now:
> - Each pass would need to cope with recog creating new registers
> - We'd need some way of deferring the creation of pseudo registers
>   for successful but speculative recogs that might later be rejected
>   for cost reasons, or at least have some kind of recycling scheme.
> - Even then, we'll have more pseudos before RA, and so most of the
>   DF datastructures will be larger and perhaps sparser.

Yes, there are some disadvantages to removing scratches.  The biggest one is a
lot of work for target maintainers which are interested in better RA.

Of course it is better to solve this problem in one place.  So I'll investigate
removing scratches in IRA more.  May be it is not that big work.  Thank you for
expressing your opinion, Richard.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/95674] Unnecessary move when doing division-by-multiplication
  2020-06-14 18:21 [Bug middle-end/95674] New: Unnecessary move when doing division-by-multiplication josephcsible at gmail dot com
                   ` (4 preceding siblings ...)
  2020-06-29 14:05 ` vmakarov at gcc dot gnu.org
@ 2023-04-22 16:55 ` roger at nextmovesoftware dot com
  5 siblings, 0 replies; 7+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-04-22 16:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95674

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |12.0
                 CC|                            |roger at nextmovesoftware dot com
      Known to work|                            |12.0
             Status|NEW                         |RESOLVED

--- Comment #6 from Roger Sayle <roger at nextmovesoftware dot com> ---
This (testcase) has now been fixed on mainline (and back in GCC 12 and 13).

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-04-22 16:55 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-14 18:21 [Bug middle-end/95674] New: Unnecessary move when doing division-by-multiplication josephcsible at gmail dot com
2020-06-15  8:50 ` [Bug target/95674] " jakub at gcc dot gnu.org
2020-06-15 13:51 ` hjl.tools at gmail dot com
2020-06-26 22:14 ` vmakarov at gcc dot gnu.org
2020-06-27 10:22 ` rsandifo at gcc dot gnu.org
2020-06-29 14:05 ` vmakarov at gcc dot gnu.org
2023-04-22 16:55 ` roger at nextmovesoftware dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).