public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays
@ 2011-09-06 14:59 tom at deltasystem dot hu
  2011-09-06 16:02 ` [Bug target/50304] " rguenth at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: tom at deltasystem dot hu @ 2011-09-06 14:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50304

             Bug #: 50304
           Summary: poor code for accessing certain element of arrays
    Classification: Unclassified
           Product: gcc
           Version: 4.5.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: tom@deltasystem.hu


Array elements on known (hardcoded) positions are accessed by inefficient code.
The addresses are computed runtime, waisting both runtime and code size. For
example:

int a[4][4][16];

void foo( void )
{
    int c;

    c = a[2][3][4];
    ...
}
compiles this way in -O1, ARM Cortex M0:
   0:    4b02          ldr    r3, [pc, #8]    ; (c <foo+0xc>)
   2:    22b4          movs    r2, #180    ; 0xb4
   4:    0092          lsls    r2, r2, #2
   6:    589a          ldr    r2, [r3, r2]
-----------
I tried to convert this operation to be precompiled by forming a constant
address.

int a[4][4][16];
int *const b=&a[2][3][4];

void foo( void )
{
    int c;

    c = *b;
    ...
}
However, it is ignored by gcc, and compiles this way in -O1:
   0:    4b03          ldr    r3, [pc, #12]    ; (10 <foo+0x10>)
   2:    21b4          movs    r1, #180    ; 0xb4
   4:    0089          lsls    r1, r1, #2
   6:    185a          adds    r2, r3, r1
   8:    6812          ldr    r2, [r2, #0]

The actual code can vary, depending on the situation. For example:
   0:    4b02          ldr    r3, [pc, #8]    ; (c <foo+0xc>)
   2:    4a03          ldr    r2, [pc, #12]    ; (10 <foo+0x10>)
   4:    589a          ldr    r2, [r3, r2]

or
   0:    4b02          ldr    r3, [pc, #8]    ; (c <foo+0xc>)
   2:    4903          ldr    r1, [pc, #12]    ; (10 <foo+0x10>)
   4:    185a          adds    r2, r3, r1    a[0][0][0]=c;
   6:    6812          ldr    r2, [r2, #0]

Sometimes (I don't know yet why and exactly when) it can be even much worse,
introducing bytewide accessing of e.g. an int32_t, dissassembling and
reassembling it. (It's definetely not an alignment problem.) This waists about
40 instructions for a read-modify-write access.
------------------

It should have look like this:
   0:    4b02          ldr    r3, [pc, #8]    ; (c <foo+0xc>)
   2:    681b          ldr    r3, [r3, #0]
   4:    681a          ldr    r2, [r3, #0]
where the constant (at 0x0c) at the first row is a precalculated address.

------------------
With variable address, like this:

int a[4][4][16];
int *b=&a[2][3][4];
...
it work nicely. However, it is really not equivalent solution at flash based
processors.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-03-04  8:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-06 14:59 [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays tom at deltasystem dot hu
2011-09-06 16:02 ` [Bug target/50304] " rguenth at gcc dot gnu.org
2011-09-08 13:16 ` tom at deltasystem dot hu
2011-09-08 13:17 ` tom at deltasystem dot hu
2011-09-08 13:27 ` tom at deltasystem dot hu
2013-03-04  8:06 ` rearnsha at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).