public inbox for gcc-bugs@sourceware.org
* [Bug c/104405] New: Inefficient register allocation on complex arithmetic
@ 2022-02-06  7:20 tnfchris at gcc dot gnu.org
  2022-02-06  7:29 ` [Bug middle-end/104405] " pinskia at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2022-02-06  7:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

            Bug ID: 104405
           Summary: Inefficient register allocation on complex arithmetic
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---

The following testcase

#include <complex.h>

complex double f (complex double a, complex double b)
{
    return a * b * I;
}

compiled at -Ofast has unneeded copies of input

i.e.

f:
        fmov    d4, d1
        fmov    d1, d0
        fmul    d0, d4, d2
        fmul    d4, d4, d3
        fnmadd  d0, d1, d3, d0
        fnmsub  d1, d1, d2, d4
        ret

The first two moves are unneeded and look to be an artifact of how
IMAGPART_EXPR and REALPART_EXPR are expanded.  This seems to be a generic issue,
as both x86 and Arm targets seem to have the same problem.
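
For reference, the arithmetic that the four multiply instructions implement can
be written out by components. The sketch below is an illustration only and not
part of the original report: the names f_reference, f_components,
results_match and self_check are hypothetical, and the exact comparison is only
meaningful for inputs whose products and sums are exactly representable (small
integers, for example).

```c
#include <assert.h>
#include <complex.h>

/* The testcase from the report, kept as the reference result. */
double complex f_reference(double complex a, double complex b)
{
    return a * b * I;
}

/* Hand-expanded form for finite inputs:
 *   a * b * I = -(ar*bi + ai*br) + (ar*br - ai*bi) * I
 * The real part corresponds to the fnmadd result (d0) and the imaginary part
 * to the fnmsub result (d1) in the listing above. */
double complex f_components(double ar, double ai, double br, double bi)
{
    double re = -(ar * bi) - ai * br;
    double im = ar * br - ai * bi;
    return re + im * I;
}

/* Returns nonzero when both forms agree exactly.  Assumption: the inputs are
 * chosen so that no rounding occurs; otherwise a tolerance would be needed. */
int results_match(double ar, double ai, double br, double bi)
{
    double complex a = ar + ai * I;
    double complex b = br + bi * I;
    return f_reference(a, b) == f_components(ar, ai, br, bi);
}

/* Optional self-check that a caller can invoke from main(). */
void self_check(void)
{
    assert(results_match(1.0, 2.0, 3.0, 4.0));
}
```

Only four input values (ar, ai, br, bi) and two outputs are live in this
computation, which is why the two leading fmov instructions look avoidable.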

* [Bug middle-end/104405] Inefficient register allocation on complex arithmetic
  2022-02-06  7:20 [Bug c/104405] New: Inefficient register allocation on complex arithmetic tnfchris at gcc dot gnu.org
@ 2022-02-06  7:29 ` pinskia at gcc dot gnu.org
  2022-02-06  9:19 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-02-06  7:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I am almost positive there are duplicates of this bug already. It is similar to
the struct argument passing one too.

* [Bug middle-end/104405] Inefficient register allocation on complex arithmetic
  2022-02-06  7:20 [Bug c/104405] New: Inefficient register allocation on complex arithmetic tnfchris at gcc dot gnu.org
  2022-02-06  7:29 ` [Bug middle-end/104405] " pinskia at gcc dot gnu.org
@ 2022-02-06  9:19 ` pinskia at gcc dot gnu.org
  2022-02-06  9:25 ` [Bug rtl-optimization/104405] " pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-02-06  9:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-02-06
           Keywords|                            |ra
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This has nothing to do with the expansion of IMAGPART_EXPR here but rather with
the return side.
Note that the x86 issue is different from the aarch64 issue.

Here is a testcase which shows it is just return side related:
_Complex double
f1 (double ar, double ai, double br, double bi, double *t)
{
  double _14, _16, _17, _3;
  _14 = ai * bi;
  _16 = ai * br;
  _17 = -(ar * br) + _14;
  _3 = -(ar * bi) - _16;
  return __builtin_complex (_3, _17);
}

Also adding -fno-schedule-insns for the above testcase removes all of the extra
move instructions.

The big question now is whether this is really an issue in real-world code or
just a toy benchmark which is testing argument/return passing optimizations.

* [Bug rtl-optimization/104405] Inefficient register allocation on complex arithmetic
  2022-02-06  7:20 [Bug c/104405] New: Inefficient register allocation on complex arithmetic tnfchris at gcc dot gnu.org
  2022-02-06  7:29 ` [Bug middle-end/104405] " pinskia at gcc dot gnu.org
  2022-02-06  9:19 ` pinskia at gcc dot gnu.org
@ 2022-02-06  9:25 ` pinskia at gcc dot gnu.org
  2022-02-06  9:33 ` tnfchris at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-02-06  9:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |rtl-optimization
             Blocks|101926                      |

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that for the original testcase on x86_64 there are just a lot of extra
register moves. Basically the register allocator does not seem to be tuned to
remove them.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101926
[Bug 101926] [meta-bug] struct/complex argument passing and return should be
improved

* [Bug rtl-optimization/104405] Inefficient register allocation on complex arithmetic
  2022-02-06  7:20 [Bug c/104405] New: Inefficient register allocation on complex arithmetic tnfchris at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-02-06  9:25 ` [Bug rtl-optimization/104405] " pinskia at gcc dot gnu.org
@ 2022-02-06  9:33 ` tnfchris at gcc dot gnu.org
  2022-02-06  9:38 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2022-02-06  9:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

--- Comment #4 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> The big question becomes now is really an issue in real world code or just a
> toy benchmark which is testing argument/return passing optimizations?

Can't say I've gotten it from real-world code; I'm just cataloging issues I'm
finding while working on vectorization support for complex numbers.

But seems to me a simple enough thing that we should be able to handle.

* [Bug rtl-optimization/104405] Inefficient register allocation on complex arithmetic
  2022-02-06  7:20 [Bug c/104405] New: Inefficient register allocation on complex arithmetic tnfchris at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2022-02-06  9:33 ` tnfchris at gcc dot gnu.org
@ 2022-02-06  9:38 ` pinskia at gcc dot gnu.org
  2022-02-06 10:06 ` ebotcazou at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-02-06  9:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #4)
> But seems to me a simple enough thing that we should be able to handle.

It looks simple, but register allocation, especially with demands that some
things be in specific registers, is not really so simple. There might be other
bugs related to register allocation around return registers too.

* [Bug rtl-optimization/104405] Inefficient register allocation on complex arithmetic
  2022-02-06  7:20 [Bug c/104405] New: Inefficient register allocation on complex arithmetic tnfchris at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2022-02-06  9:38 ` pinskia at gcc dot gnu.org
@ 2022-02-06 10:06 ` ebotcazou at gcc dot gnu.org
  2022-02-07  8:44 ` rguenth at gcc dot gnu.org
  2022-02-07  9:02 ` ebotcazou at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2022-02-06 10:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
                 CC|                            |ebotcazou at gcc dot gnu.org

--- Comment #6 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> But seems to me a simple enough thing that we should be able to handle.

Register allocation is a global problem though, so what happens on toy examples
is not always representative of what happens in real-world code, because there
are a lot of heuristics involved and you want to tune it for the latter case,
not for the former case.

* [Bug rtl-optimization/104405] Inefficient register allocation on complex arithmetic
  2022-02-06  7:20 [Bug c/104405] New: Inefficient register allocation on complex arithmetic tnfchris at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-02-06 10:06 ` ebotcazou at gcc dot gnu.org
@ 2022-02-07  8:44 ` rguenth at gcc dot gnu.org
  2022-02-07  9:02 ` ebotcazou at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-07  8:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Eric Botcazou from comment #6)
> > But seems to me a simple enough thing that we should be able to handle.
> 
> Register allocation is a global problem though, so what happens on toy
> examples is not always representative of what happens in real world code
> because there is a lot of heuristics involved and you want to tune it for
> the latter case, not for the former case.

There is of course the option to switch to alternate heuristics if, by
heuristic, the argument/return part of the function is a big part of it (aka
for toy examples or small functions which do happen in real-life).

Also toy examples can highlight issues that also exist in the real world.

* [Bug rtl-optimization/104405] Inefficient register allocation on complex arithmetic
  2022-02-06  7:20 [Bug c/104405] New: Inefficient register allocation on complex arithmetic tnfchris at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-02-07  8:44 ` rguenth at gcc dot gnu.org
@ 2022-02-07  9:02 ` ebotcazou at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2022-02-07  9:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104405

--- Comment #8 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> There is of course the option to switch to alternate heuristics if, by
> heuristic, the argument/return part of the function is a big part of it (aka
> for toy examples or small functions which do happen in real-life).

Indeed a solution to be considered, but then...

> Also toy examples can highlight issues that also exist in the real world.

...you break the connection between toy examples and real world even more.

In any case, I'm not an RA specialist so I'll stop there, but my understanding
is that you don't build a good RA for real-world code by examining a list of
reduced cases and trying to make it optimal for each of them.
