public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits
@ 2012-03-09 15:39 gjl at gcc dot gnu.org
  2012-03-09 15:39 ` [Bug rtl-optimization/52543] " gjl at gcc dot gnu.org
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-03-09 15:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

             Bug #: 52543
           Summary: lower-subreg.c: code bloat of 300%-400% for bulti-word
                    memory splits
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: gjl@gcc.gnu.org
            Target: avr


lower-subreg.c causes extreme code bloat for some memory accesses because it
does not take into account the additional costs caused by the split.

Example code for AVR address spaces:

long readx (const __memx long *p)
{
    return *p;
}

long read1 (const __flash1 long *p)
{
    return *p;
}

long read0 (const __flash long *p)
{
    return *p;
}

Reason is that read from these address spaces need one preparation sequence
(set up segment register) per read. Thus:

For one 4-byte read there will be one preparation sequence.
For 4 one-byte reads there will be 4 preparation sequences.

For the __flash address space no preparation is needed, but a 32-bit read can
use post-increment addressing whereas a split to 4 byte moves won't use
post-increment because GCC is completely afraid of pre-/post-modify addressing.

The only place to hook in could be mode_dependent_address, however, that hook
just passes the address down to the backend but omits the address space in use.
As all 16-bit address spaces (including generic) use HImode as pointer mode,
the target cannot take a decision based on the address alone.

Configured with: ../../gcc.gnu.org/trunk/configure --target=avr
--prefix=/local/gnu/install/gcc-4.7 --disable-nls --with-dwarf2
--enable-checking=yes,rtl --enable-languages=c,c++

gcc version 4.8.0 20120307 (experimental) (GCC)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
@ 2012-03-09 15:39 ` gjl at gcc dot gnu.org
  2012-03-09 15:41 ` gjl at gcc dot gnu.org
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-03-09 15:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #1 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-03-09 15:39:41 UTC ---
Created attachment 26862
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26862
lower.c: C source


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
  2012-03-09 15:39 ` [Bug rtl-optimization/52543] " gjl at gcc dot gnu.org
@ 2012-03-09 15:41 ` gjl at gcc dot gnu.org
  2012-03-09 15:45 ` gjl at gcc dot gnu.org
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-03-09 15:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #2 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-03-09 15:40:58 UTC ---
Created attachment 26863
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26863
lower.s: Assembler output

Compiled with

avr-gcc lower.c -Os -S -dp -mmcu=atmega128


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
  2012-03-09 15:39 ` [Bug rtl-optimization/52543] " gjl at gcc dot gnu.org
  2012-03-09 15:41 ` gjl at gcc dot gnu.org
@ 2012-03-09 15:45 ` gjl at gcc dot gnu.org
  2012-03-09 15:47 ` gjl at gcc dot gnu.org
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-03-09 15:45 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #3 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-03-09 15:44:46 UTC ---
Created attachment 26864
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26864
lower.s: Assembler output with -fno-split-wide-types

Compiled with

avr-gcc lower.c -Os -S -dp -mmcu=atmega128 -fno-split-wide-types

The code size for this module is 38 bytes whereas with lobreg lowering the code
size is 142 bytes, which is a factor of 3.7


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2012-03-09 15:45 ` gjl at gcc dot gnu.org
@ 2012-03-09 15:47 ` gjl at gcc dot gnu.org
  2012-03-21 13:26 ` [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word " gjl at gcc dot gnu.org
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-03-09 15:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2012-03-09
                 CC|                            |eric.weddington at atmel
                   |                            |dot com
     Ever Confirmed|0                           |1
   Target Milestone|---                         |4.7.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2012-03-09 15:47 ` gjl at gcc dot gnu.org
@ 2012-03-21 13:26 ` gjl at gcc dot gnu.org
  2012-03-22 15:13 ` gjl at gcc dot gnu.org
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-03-21 13:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #4 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-03-21 13:20:25 UTC ---
Author: gjl
Date: Wed Mar 21 13:20:20 2012
New Revision: 185605

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185605
Log:
    PR rtl-optimization/52543
    PR target/52461
    * config/avr/avr-protos.h (avr_load_lpm): New prototype.
    * config/avr/avr.c (avr_mode_dependent_address_p): New function.
    (TARGET_MODE_DEPENDENT_ADDRESS_P): New define.
    (avr_load_libgcc_p): Restrict to __flash loads.
    (avr_out_lpm): Only handle 1-byte loads from __flash.
    (avr_load_lpm): New function.
    (avr_find_unused_d_reg): Remove.
    (avr_out_lpm_no_lpmx): Remove.
    (adjust_insn_length): Handle ADJUST_LEN_LOAD_LPM.
    * config/avr/avr.md (unspec): Add UNSPEC_LPM.
    (load_<mode>_libgcc): Use UNSPEC_LPM instead of MEM.
    (load_<mode>, load_<mode>_clobber): New insns.
    (mov<mode>): For multi-byte move from non-generic
    16-bit address spaces: Expand to load_<mode> resp.
    load_<mode>_clobber.
    (load<mode>_libgcc): Remove expander.
    (split-lpmx): Remove split.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr-protos.h
    trunk/gcc/config/avr/avr.c
    trunk/gcc/config/avr/avr.md


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2012-03-21 13:26 ` [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word " gjl at gcc dot gnu.org
@ 2012-03-22 15:13 ` gjl at gcc dot gnu.org
  2012-03-22 16:08 ` gjl at gcc dot gnu.org
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-03-22 15:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #5 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-03-22 15:07:29 UTC ---
Author: gjl
Date: Thu Mar 22 15:06:57 2012
New Revision: 185697

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185697
Log:
libgcc/
    Backport from 2012-03-07 mainline r185033.

    PR target/52507
    * config/avr/lib1funcs.S (__movmemx_hi): Fix loop label in RAM-part.

    Backport from 2012-03-07 mainline r185031.

    PR target/52505
    * config/avr/lib1funcs.S (__xload_1): Don't read unintentionally
    from RAM.

    Backport from 2012-03-07 mainline r185030.

    PR target/52461
    PR target/52508
    * config/avr/lib1funcs.S (__do_copy_data): Clear RAMPZ after usage
    if RAMPZ affects reading from RAM.
    (__tablejump_elpm__): Ditto.
    (.xload): Ditto.
    (__movmemx_hi): Ditto.
    (__do_global_ctors): Right condition for RAMPZ usage is "have ELPM".
    (__do_global_dtors): Ditto.
    (__xload_1, __xload_2, __xload_3, __xload_4): Ditto.
    (__movmemx_hi): Ditto.

gcc/
    Backport from 2012-03-22 mainline r185692.

    PR target/52496
    * config/avr/avr.md (unspec): Remove UNSPEC_MEMORY_BARRIER.
    (unspecv): Add UNSPECV_MEMORY_BARRIER.
    (cli_sei): Use unspec_volatile instead of unspec for memory barrier.
    (delay_cycles_1, delay_cycles_2): Ditto.
    (delay_cycles_3, delay_cycles_4): Ditto.
    (nopv, *nopv): Ditto.
    (sleep, *sleep): Ditto.
    (wdr, *wdr): Ditto.

    Backport from 2012-03-21 mainline r185605.

    PR rtl-optimization/52543
    PR target/52461
    * config/avr/avr-protos.h (avr_load_lpm): New prototype.
    * config/avr/avr.c (avr_mode_dependent_address_p): New function.
    (TARGET_MODE_DEPENDENT_ADDRESS_P): New define.
    (avr_load_libgcc_p): Restrict to __flash loads.
    (avr_out_lpm): Only handle 1-byte loads from __flash.
    (avr_load_lpm): New function.
    (avr_find_unused_d_reg): Remove.
    (avr_out_lpm_no_lpmx): Remove.
    (adjust_insn_length): Handle ADJUST_LEN_LOAD_LPM.
    * config/avr/avr.md (unspec): Add UNSPEC_LPM.
    (load_<mode>_libgcc): Use UNSPEC_LPM instead of MEM.
    (load_<mode>, load_<mode>_clobber): New insns.
    (mov<mode>): For multi-byte move from non-generic
    16-bit address spaces: Expand to load_<mode> resp.
    load_<mode>_clobber.
    (load<mode>_libgcc): Remove expander.
    (split-lpmx): Remove split.

    Backport from 2012-03-13 mainline r185329.

    PR target/52488
    * config/avr/avr.c (avr_prologue_setup_frame): Cut down stack
    offset (size) to a value the insns can deal with.
    (expand_epilogue): Ditto.

    Backport from 2012-03-12 mainline r185256.

    PR target/52499
    * config/avr/avr.c (avr_mode_code_base_reg_class): Change return
    type from reg_class_t to enum reg_class.
    * config/avr/avr-protos.h (avr_mode_code_base_reg_class): Ditto.

    Backport from 2012-03-12 mainline r185253.

    PR target/52148
    * config/avr/avr.c (avr_out_movmem): Fix typo in output template
    for the case ADDR_SPACE_FLASH and AVR_HAVE_LPMX introduced in
    r184615 from 2012-02-28.

    Backport from 2012-03-08 mainline r185105.

    * config/avr/avr.md (*addhi3, addhi3_clobber): Add "w" alternative
    for constants in [-63,63].

    Backport from 2012-03-08 mainline r185100.

    PR target/52496
    * config/avr/avr.c (avr_mem_clobber): New static function.
    (avr_expand_delay_cycles): Add memory clobber operand to
    delay_cycles_1, delay_cycles_2, delay_cycles_3, delay_cycles_4.
    * config/avr/avr.md (unspec): Add UNSPEC_MEMORY_BARRIER.
    (enable_interrupt, disable_interrupt): New expander.
    (nopv, sleep, wdr): New expanders.
    (delay_cycles_1): Add memory clobber.
    (delay_cycles_2): Add memory clobber.
    (delay_cycles_3): Add memory clobber.
    (delay_cycles_4): Add memory clobber.
    (cli_sei): New insn from former "enable_interrupt",
    "disable_interrupt" with memory clobber.
    (*wdt): New insn from former "wdt" with memory clobber.
    (*nopv): Similar, but for "nopv".
    (*sleep): Similar, but for "sleep".

    Backport from 2012-03-07 mainline r185043.

    PR target/52484
    * config/avr/avr.md (xload<mode>_A): Add R22... to register footprint.

    Backport from 2012-03-07 mainline r185032.

    PR target/52506
    * gcc/config/avr/avr.c (expand_epilogue): Fix order of restoration
    to: RAMPZ, RAMPY, RAMPX, RAMPD.
    (expand_prologue): Only clear RAMPZ if it has effect on RAM-read.

    Backport from 2012-03-07 mainline r185031.

    PR target/52505
    * config/avr/avr.c (avr_out_xload): Don't read unintentionally
    from RAM.
    * config/avr/avr.md (xload_8): Adjust insn length.

    Backport from 2012-03-07 mainline r185030.

    PR target/52461
    * gcc/config/avr/avr.c (avr_out_lpm): Clear RAMPZ after usage
    if RAMPZ affects reading from RAM.

    Backport from 2012-03-05 mainline r184919.

    * config/avr/avr.md (*umaddqihi4.2): New insn-and-split.


Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/config/avr/avr-protos.h
    branches/gcc-4_7-branch/gcc/config/avr/avr.c
    branches/gcc-4_7-branch/gcc/config/avr/avr.md
    branches/gcc-4_7-branch/libgcc/ChangeLog
    branches/gcc-4_7-branch/libgcc/config/avr/lib1funcs.S


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2012-03-22 15:13 ` gjl at gcc dot gnu.org
@ 2012-03-22 16:08 ` gjl at gcc dot gnu.org
  2012-05-03 17:05 ` rsandifo at gcc dot gnu.org
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-03-22 16:08 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |FIXME

--- Comment #6 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-03-22 15:55:11 UTC ---
FIXME: It works for avr now, but just because that backend hacks around by
using UNSPEC instead of MEM.  This is legitimate because the affected
address-spaces read from Flash, i.e. from non-changing memory.  For respective
FIXMEs see avr.c and avr.md as introduced by the patches above.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2012-03-22 16:08 ` gjl at gcc dot gnu.org
@ 2012-05-03 17:05 ` rsandifo at gcc dot gnu.org
  2012-05-03 17:51 ` rsandifo at gcc dot gnu.org
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2012-05-03 17:05 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #7 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> 2012-05-03 17:04:49 UTC ---
Author: rsandifo
Date: Thu May  3 17:04:41 2012
New Revision: 187110

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187110
Log:
Add PR rtl-optimization/52543 to changelog.

Modified:
    trunk/gcc/ChangeLog


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2012-05-03 17:05 ` rsandifo at gcc dot gnu.org
@ 2012-05-03 17:51 ` rsandifo at gcc dot gnu.org
  2012-06-27  9:24 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2012-05-03 17:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #8 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> 2012-05-03 17:50:20 UTC ---
lower-subreg patched committed here:

http://gcc.gnu.org/ml/gcc-cvs/2012-05/msg00011.html


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2012-05-03 17:51 ` rsandifo at gcc dot gnu.org
@ 2012-06-27  9:24 ` pinskia at gcc dot gnu.org
  2012-08-02 18:30 ` gjl at gcc dot gnu.org
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-06-27  9:24 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-06-27 09:24:13 UTC ---
Fixed.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2012-06-27  9:24 ` pinskia at gcc dot gnu.org
@ 2012-08-02 18:30 ` gjl at gcc dot gnu.org
  2012-08-02 18:37 ` gjl at gcc dot gnu.org
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-08-02 18:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |

--- Comment #10 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-08-02 18:29:41 UTC ---
(In reply to comment #9)
> Fixed.

NACK. I cannot confirm that anything changed:

If I revert the Hack from comment #4 that represents a memory access as UNSPEC
instead of as MEM (so that lower-subreg won't split), the code bloat returns.

Just suppose the examples given in comment #0

long readx (const __memx long *p)
{
    return *p;
}

long read0 (const __flash long *p)
{
    return *p;
}


with "GCC: (GNU) 4.8.0 20120802 (experimental)"
and -S -Os -dp -mmcu=avr5 -fno-split-wide-types
the result is fine and there are 32-bit accesses:

readx:
    movw r30,r22     ;  17    *movhi/1    [length = 1]
    mov r21,r24     ;  18    movqi_insn/1    [length = 1]
    call __xload_4     ;  19    xload_si_libgcc    [length = 2]
    ret     ;  24    return    [length = 1]

read0:
    movw r30,r24     ;  19    *movhi/1    [length = 1]
    lpm r22,Z+     ;  20    *movsi/3    [length = 4]
    lpm r23,Z+
    lpm r24,Z+
    lpm r25,Z+
    ret     ;  24    return    [length = 1]


Without -fno-split-wide-types the bloat is still there:

readx:
    push r12     ;  54    pushqi1/1    [length = 1]
    push r13     ;  55    pushqi1/1    [length = 1]
    push r14     ;  56    pushqi1/1    [length = 1]
    mov r26,r24     ;  2    *movpsi/1    [length = 2]
    movw r24,r22
    movw r30,r24     ;  35    *movhi/1    [length = 1]
    sbrc r26,7     ;  37    xload_8/1    [length = 4]
    ld r22,Z
    sbrs r26,7
    lpm r22,Z
    ldi r18,lo8(1)     ;  49    *movpsi/5    [length = 3]
    ldi r19,0
    ldi r20,0
    add r18,r24     ;  19    addpsi3/1    [length = 3]
    adc r19,r25
    adc r20,r26
    movw r30,r18     ;  38    *movhi/1    [length = 1]
    sbrc r20,7     ;  40    xload_8/1    [length = 4]
    ld r23,Z
    sbrs r20,7
    lpm r23,Z
    ldi r18,lo8(2)     ;  64    *reload_inpsi    [length = 4]
    mov r12,r18
    mov r13,__zero_reg__
    mov r14,__zero_reg__
    add r12,r24     ;  22    addpsi3/1    [length = 3]
    adc r13,r25
    adc r14,r26
    ldi r18,lo8(3)     ;  51    *movpsi/5    [length = 3]
    ldi r19,0
    ldi r20,0
    add r18,r24     ;  25    addpsi3/1    [length = 3]
    adc r19,r25
    adc r20,r26
    movw r30,r12     ;  41    *movhi/1    [length = 1]
    sbrc r14,7     ;  43    xload_8/1    [length = 4]
    ld r24,Z
    sbrs r14,7
    lpm r24,Z
    movw r30,r18     ;  44    *movhi/1    [length = 1]
    sbrc r20,7     ;  46    xload_8/1    [length = 4]
    ld r25,Z
    sbrs r20,7
    lpm r25,Z
    pop r14     ;  59    popqi    [length = 1]
    pop r13     ;  60    popqi    [length = 1]
    pop r12     ;  61    popqi    [length = 1]
    ret     ;  62    return_from_epilogue    [length = 1]

read0:
    movw r30,r24     ;  35    *movhi/1    [length = 1]
    lpm r22,Z+     ;  17    movqi_insn/4    [length = 1]
    lpm r23,Z     ;  20    movqi_insn/4    [length = 1]
    movw r18,r24     ;  38    *movhi/1    [length = 1]
    subi r18,-2     ;  22    addhi3_clobber/2    [length = 2]
    sbci r19,-1
    movw r20,r24     ;  39    *movhi/1    [length = 1]
    subi r20,-3     ;  25    addhi3_clobber/2    [length = 2]
    sbci r21,-1
    movw r30,r18     ;  40    *movhi/1    [length = 1]
    lpm r24,Z     ;  33    movqi_insn/4    [length = 1]
    movw r30,r20     ;  41    *movhi/1    [length = 1]
    lpm r25,Z     ;  34    movqi_insn/4    [length = 1]
    ret     ;  44    return    [length = 1]

Each 32-bit move is still split into 4 8-bit moves even though that is
extremely expensive.  This happens both for PSImode and for HImode addresses.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2012-08-02 18:30 ` gjl at gcc dot gnu.org
@ 2012-08-02 18:37 ` gjl at gcc dot gnu.org
  2012-08-02 18:39 ` gjl at gcc dot gnu.org
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-08-02 18:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #11 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-08-02 18:37:18 UTC ---
Created attachment 27928
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27928
/local/gnu/patches/pr52543-undo-185605.diff

Just for reference: Here is the patch that undoes the MEM->UNSPEC workaround
from SVN 185605, comment #4 that I used.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2012-08-02 18:37 ` gjl at gcc dot gnu.org
@ 2012-08-02 18:39 ` gjl at gcc dot gnu.org
  2012-09-06 14:38 ` olegendo at gcc dot gnu.org
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-08-02 18:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #12 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-08-02 18:39:31 UTC ---
CCing Andrew.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2012-08-02 18:39 ` gjl at gcc dot gnu.org
@ 2012-09-06 14:38 ` olegendo at gcc dot gnu.org
  2012-09-06 22:14 ` olegendo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-06 14:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |olegendo at gcc dot gnu.org

--- Comment #13 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-06 14:37:22 UTC ---
I'm not sure, but this looks similar to what is happening in PR 54386, although
there the mem load splitting happens way before lower-subreg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2012-09-06 14:38 ` olegendo at gcc dot gnu.org
@ 2012-09-06 22:14 ` olegendo at gcc dot gnu.org
  2012-09-15  6:07 ` pinskia at gcc dot gnu.org
  2012-09-28  8:21 ` gjl at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-06 22:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #14 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-06 22:13:22 UTC ---
(In reply to comment #0)
>  
> For the __flash address space no preparation is needed, but a 32-bit read can
> use post-increment addressing whereas a split to 4 byte moves won't use
> post-increment because GCC is completely afraid of pre-/post-modify addressing.
> 

BTW, the reason for the auto-inc-dec problem is described in PR 50749.
I've started writing a replacement for the auto-inc-dec pass, but it will take
some time.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2012-09-06 22:14 ` olegendo at gcc dot gnu.org
@ 2012-09-15  6:07 ` pinskia at gcc dot gnu.org
  2012-09-28  8:21 ` gjl at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-09-15  6:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.7.1                       |---


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word memory splits
  2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2012-09-15  6:07 ` pinskia at gcc dot gnu.org
@ 2012-09-28  8:21 ` gjl at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: gjl at gcc dot gnu.org @ 2012-09-28  8:21 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52543

--- Comment #15 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2012-09-28 08:21:19 UTC ---
Author: gjl
Date: Fri Sep 28 08:21:06 2012
New Revision: 191820

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191820
Log:
    PR rtl-optimization/52543
    * config/avr/avr.c (avr_mode_dependent_address_p): Return true for
    all non-generic address spaces.
    (TARGET_SECONDARY_RELOAD): New hook define to...
    (avr_secondary_reload): ...this new static function.
    * config/avr/avr.md (reload_in<mode>): New insns.
    Undo r185605 (mostly):
    * config/avr/avr-protos.h (avr_load_lpm): Remove.
    * config/avr/avr.c (avr_load_libgcc_p): Don't restrict to __flash loads.
    (avr_out_lpm): Also handle loads > 1 byte.
    (avr_load_lpm): Remove.
    (avr_find_unused_d_reg): New static function.
    (avr_out_lpm_no_lpmx): New static function.
    (adjust_insn_length): Remove ADJUST_LEN_LOAD_LPM.
    * config/avr/avr.md (unspec): Remove UNSPEC_LPM.
    (load_<mode>_libgcc): Use MEM instead of UNSPEC_LPM.
    (load_<mode>, load_<mode>_clobber): Remove.
    (mov<mode>): For multi-byte move from non-generic
    16-bit address spaces: Expand to *mov<mode> again.
    (load<mode>_libgcc): New expander.
    (split-lpmx): Remove split.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr-protos.h
    trunk/gcc/config/avr/avr.c
    trunk/gcc/config/avr/avr.md


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-09-28  8:21 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-09 15:39 [Bug rtl-optimization/52543] New: lower-subreg.c: code bloat of 300%-400% for bulti-word memory splits gjl at gcc dot gnu.org
2012-03-09 15:39 ` [Bug rtl-optimization/52543] " gjl at gcc dot gnu.org
2012-03-09 15:41 ` gjl at gcc dot gnu.org
2012-03-09 15:45 ` gjl at gcc dot gnu.org
2012-03-09 15:47 ` gjl at gcc dot gnu.org
2012-03-21 13:26 ` [Bug rtl-optimization/52543] lower-subreg.c: code bloat of 300%-400% for multi-word " gjl at gcc dot gnu.org
2012-03-22 15:13 ` gjl at gcc dot gnu.org
2012-03-22 16:08 ` gjl at gcc dot gnu.org
2012-05-03 17:05 ` rsandifo at gcc dot gnu.org
2012-05-03 17:51 ` rsandifo at gcc dot gnu.org
2012-06-27  9:24 ` pinskia at gcc dot gnu.org
2012-08-02 18:30 ` gjl at gcc dot gnu.org
2012-08-02 18:37 ` gjl at gcc dot gnu.org
2012-08-02 18:39 ` gjl at gcc dot gnu.org
2012-09-06 14:38 ` olegendo at gcc dot gnu.org
2012-09-06 22:14 ` olegendo at gcc dot gnu.org
2012-09-15  6:07 ` pinskia at gcc dot gnu.org
2012-09-28  8:21 ` gjl at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).