public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays
@ 2011-09-06 14:59 tom at deltasystem dot hu
  2011-09-06 16:02 ` [Bug target/50304] " rguenth at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: tom at deltasystem dot hu @ 2011-09-06 14:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50304

             Bug #: 50304
           Summary: poor code for accessing certain element of arrays
    Classification: Unclassified
           Product: gcc
           Version: 4.5.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: tom@deltasystem.hu


Array elements on known (hardcoded) positions are accessed by inefficient code.
The addresses are computed runtime, waisting both runtime and code size. For
example:

int a[4][4][16];

void foo( void )
{
    int c;

    c = a[2][3][4];
    ...
}
compiles this way in -O1, ARM Cortex M0:
   0:    4b02          ldr    r3, [pc, #8]    ; (c <foo+0xc>)
   2:    22b4          movs    r2, #180    ; 0xb4
   4:    0092          lsls    r2, r2, #2
   6:    589a          ldr    r2, [r3, r2]
-----------
I tried to convert this operation to be precompiled by forming a constant
address.

int a[4][4][16];
int *const b=&a[2][3][4];

void foo( void )
{
    int c;

    c = *b;
    ...
}
However, it is ignored by gcc, and compiles this way in -O1:
   0:    4b03          ldr    r3, [pc, #12]    ; (10 <foo+0x10>)
   2:    21b4          movs    r1, #180    ; 0xb4
   4:    0089          lsls    r1, r1, #2
   6:    185a          adds    r2, r3, r1
   8:    6812          ldr    r2, [r2, #0]

The actual code can vary, depending on the situation. For example:
   0:    4b02          ldr    r3, [pc, #8]    ; (c <foo+0xc>)
   2:    4a03          ldr    r2, [pc, #12]    ; (10 <foo+0x10>)
   4:    589a          ldr    r2, [r3, r2]

or
   0:    4b02          ldr    r3, [pc, #8]    ; (c <foo+0xc>)
   2:    4903          ldr    r1, [pc, #12]    ; (10 <foo+0x10>)
   4:    185a          adds    r2, r3, r1    a[0][0][0]=c;
   6:    6812          ldr    r2, [r2, #0]

Sometimes (I don't know yet why and exactly when) it can be even much worse,
introducing bytewide accessing of e.g. an int32_t, dissassembling and
reassembling it. (It's definetely not an alignment problem.) This waists about
40 instructions for a read-modify-write access.
------------------

It should have look like this:
   0:    4b02          ldr    r3, [pc, #8]    ; (c <foo+0xc>)
   2:    681b          ldr    r3, [r3, #0]
   4:    681a          ldr    r2, [r3, #0]
where the constant (at 0x0c) at the first row is a precalculated address.

------------------
With variable address, like this:

int a[4][4][16];
int *b=&a[2][3][4];
...
it work nicely. However, it is really not equivalent solution at flash based
processors.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/50304] poor code for accessing certain element of arrays
  2011-09-06 14:59 [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays tom at deltasystem dot hu
@ 2011-09-06 16:02 ` rguenth at gcc dot gnu.org
  2011-09-08 13:16 ` tom at deltasystem dot hu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-09-06 16:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50304

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2011-09-06
          Component|tree-optimization           |target
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-09-06 16:01:31 UTC ---
Isn't it just materializing the constant in an "efficient" way?

Btw, you lack a compilable testcase.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/50304] poor code for accessing certain element of arrays
  2011-09-06 14:59 [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays tom at deltasystem dot hu
  2011-09-06 16:02 ` [Bug target/50304] " rguenth at gcc dot gnu.org
@ 2011-09-08 13:16 ` tom at deltasystem dot hu
  2011-09-08 13:17 ` tom at deltasystem dot hu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: tom at deltasystem dot hu @ 2011-09-08 13:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50304

--- Comment #2 from Tamas Fenyvesi <tom at deltasystem dot hu> 2011-09-08 13:14:44 UTC ---
Created attachment 25225
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25225
some testcases


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/50304] poor code for accessing certain element of arrays
  2011-09-06 14:59 [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays tom at deltasystem dot hu
  2011-09-06 16:02 ` [Bug target/50304] " rguenth at gcc dot gnu.org
  2011-09-08 13:16 ` tom at deltasystem dot hu
@ 2011-09-08 13:17 ` tom at deltasystem dot hu
  2011-09-08 13:27 ` tom at deltasystem dot hu
  2013-03-04  8:06 ` rearnsha at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: tom at deltasystem dot hu @ 2011-09-08 13:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50304

--- Comment #3 from Tamas Fenyvesi <tom at deltasystem dot hu> 2011-09-08 13:16:20 UTC ---
Created attachment 25226
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25226
-O1..-O3 compiled objdumped result


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/50304] poor code for accessing certain element of arrays
  2011-09-06 14:59 [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays tom at deltasystem dot hu
                   ` (2 preceding siblings ...)
  2011-09-08 13:17 ` tom at deltasystem dot hu
@ 2011-09-08 13:27 ` tom at deltasystem dot hu
  2013-03-04  8:06 ` rearnsha at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: tom at deltasystem dot hu @ 2011-09-08 13:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50304

--- Comment #4 from Tamas Fenyvesi <tom at deltasystem dot hu> 2011-09-08 13:16:53 UTC ---
Please find a sample code and its objdump-ed asm in the attachment.

The command line is:
arm-none-eabi-gcc -D__REDLIB__ -DDEBUG -D__USE_CMSIS -D__CODE_RED -I"../cmsis"
-I"../config" -I"../driver" -O1 -g3 -Wall -c -fmessage-length=0 -fno-builtin
-ffunction-sections -fdata-sections -mcpu=cortex-m0 -mthumb -MMD -MP
-MF"src/bugreport.d" -MT"src/bugreport.d" -o"src/bugreport.o"
"../src/bugreport.c"

Some comments:
-It is the same from -O1 up to -O3. The -O0 is worse.

-Access of an int array differs somewhat but both have adding (or relative)
operation. Access of a struct doesn't differ in anything whether or not uses a
*const.

-All (* const) should have been exist (rather than have been replaced by some
adding of two (or more) other adders). It definetely costs more (in code area)
to have some offset vector and adding code than to have a precomputed ofsetted
address (even if the base address were stored elsewhere), not to mention the
huge wasted runtime. There can appear several cascaded adding operations for
computing a single address. The code is larger and much slower, plus it needs
more register to allocate, which in turn also increase code size and required
runtime.

-Why does the compiler resolve the mode of computing the ofsetted address
instead of simply materializing a *const if the user explicitly instructs it to
hace a *const? The code can have any number of copies of *const address (as it
can't change) if it were required for e.g. a pc-relative loading. There's
simply no point in not direct materializing any *const.

-Precompiling any known addesses and adding only the variables, on the other
hand, is good practice (and not uncommon in other compilers than gcc), even
without any *const. (E.g. for "a[1][2][var]" it should precompile "a[1][2]" and
add only "var".)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/50304] poor code for accessing certain element of arrays
  2011-09-06 14:59 [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays tom at deltasystem dot hu
                   ` (3 preceding siblings ...)
  2011-09-08 13:27 ` tom at deltasystem dot hu
@ 2013-03-04  8:06 ` rearnsha at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2013-03-04  8:06 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50304

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
                 CC|                            |joey.ye at arm dot com

--- Comment #5 from Richard Earnshaw <rearnsha at gcc dot gnu.org> 2013-03-04 08:06:16 UTC ---
Confirmed.  Still happening on Trunk as of 2013/03/04.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-03-04  8:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-06 14:59 [Bug tree-optimization/50304] New: poor code for accessing certain element of arrays tom at deltasystem dot hu
2011-09-06 16:02 ` [Bug target/50304] " rguenth at gcc dot gnu.org
2011-09-08 13:16 ` tom at deltasystem dot hu
2011-09-08 13:17 ` tom at deltasystem dot hu
2011-09-08 13:27 ` tom at deltasystem dot hu
2013-03-04  8:06 ` rearnsha at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).