public inbox for gcc-bugs@sourceware.org
* [Bug target/102135] New: (ARM Cortex-M3 and newer) changing operation order may reduce number of instructions needed
@ 2021-08-30 20:24 jankowski938 at gmail dot com
  2021-08-31 13:27 ` [Bug target/102135] " rearnsha at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: jankowski938 at gmail dot com @ 2021-08-30 20:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102135

            Bug ID: 102135
           Summary: (ARM Cortex-M3 and newer) changing operation order
                    may reduce number of instructions needed
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jankowski938 at gmail dot com
  Target Milestone: ---

uint64_t foo64(const uint8_t *rData1)
{
    uint64_t buffer;
    buffer =  (((uint64_t)rData1[7]) << 56)|((uint64_t)(rData1[6]) << 48)|
              ((uint64_t)(rData1[5]) << 40)|(((uint64_t)rData1[4]) << 32)|
              (((uint64_t)rData1[3]) << 24)|(((uint64_t)rData1[2]) << 16)|
              ((uint64_t)(rData1[1]) << 8)|rData1[0];
    return buffer;
}

----
foo64:
        mov     r3, r0
        ldr     r0, [r0]  @ unaligned
        ldr     r1, [r3, #4]      @ unaligned
        bx      lr

Only 3 instructions are needed:

        ldr     r1, [r0, #4]      @ unaligned
        ldr     r0, [r0]  @ unaligned
        bx      lr

Options:
-O3 -mthumb -mcpu=cortex-m4


* [Bug target/102135] (ARM Cortex-M3 and newer) changing operation order may reduce number of instructions needed
  2021-08-30 20:24 [Bug target/102135] New: (ARM Cortex-M3 and newer) changing operation order may reduce number of instructions needed jankowski938 at gmail dot com
@ 2021-08-31 13:27 ` rearnsha at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2021-08-31 13:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102135

--- Comment #1 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
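A small change to the testcase shows that this is highly dependent on the
constrained registers from the calling convention.  With the extra argument
the pointer arrives in r1 instead of r0, so it overlaps the high half of the
64-bit return value (r0:r1) rather than the low half.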

uint64_t foo64(int dummy, const uint8_t *rData1)
{
    uint64_t buffer;
    buffer =  (((uint64_t)rData1[7]) << 56)|((uint64_t)(rData1[6]) << 48)|
              ((uint64_t)(rData1[5]) << 40)|(((uint64_t)rData1[4]) << 32)|
              (((uint64_t)rData1[3]) << 24)|(((uint64_t)rData1[2]) << 16)|
              ((uint64_t)(rData1[1]) << 8)|rData1[0];
    return buffer;
}

Register allocation does not reorder code to reduce such conflicts, so this is
not easy to fix.

This is also a problem that is more obvious in micro-testcases such as this
example; in real code it is more common for the register allocator to have more
freedom and to be able to avoid issues like this.  If your programming style is
to write functions like this, you'd likely get better code overall by marking
these very small functions as inline, so that they do not incur the call setup
and call/return overhead, which can be significant when you take into account
the number of registers that must be saved over a function call.
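
As a rough illustration of the inlining suggestion above, here is a minimal
sketch in C.  The caller sum_two() is hypothetical and not part of the
original report; it is only there to show a call site into which the compiler
can fold the loads.  Whether the result is actually better depends on the
compiler version and options.

```c
#include <stdint.h>

/* Assemble a 64-bit value from 8 bytes, least significant byte first.
 * `static inline` makes the body visible at every call site, so the
 * compiler is free to inline it instead of emitting a call. */
static inline uint64_t foo64(const uint8_t *rData1)
{
    return ((uint64_t)rData1[7] << 56) | ((uint64_t)rData1[6] << 48) |
           ((uint64_t)rData1[5] << 40) | ((uint64_t)rData1[4] << 32) |
           ((uint64_t)rData1[3] << 24) | ((uint64_t)rData1[2] << 16) |
           ((uint64_t)rData1[1] << 8)  |  (uint64_t)rData1[0];
}

/* Hypothetical caller (an assumption for illustration only): once foo64()
 * is inlined here, the pointer arguments are no longer tied to the
 * registers that a standalone foo64() must use for its argument and
 * return value. */
uint64_t sum_two(const uint8_t *a, const uint8_t *b)
{
    return foo64(a) + foo64(b);
}
```

When the helper lives in a header, static inline is the usual way to make it
available to every translation unit that includes it.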
