public inbox for gcc@gcc.gnu.org
* avr compilation
@ 2011-03-18  8:48 Paulo J. Matos
  2011-03-18 10:08 ` WANG.Jiong
  2011-03-18 13:26 ` Georg-Johann Lay
  0 siblings, 2 replies; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18  8:48 UTC (permalink / raw)
  To: gcc

Hi all,

I am looking at the avr backend in order to try to sort some things out 
on my own backend.

One of the tests I am doing is compiling the following:
int x = 0x1010;
int y = 0x0101;

int add(void)
{
   return x+y;
}

It compiles to (in gcc-4.3.5_avr with -Os)
add:
/* prologue: function */
/* frame size = 0 */
         lds r18,y
         lds r19,(y)+1
         lds r24,x
         lds r25,(x)+1
         add r18,r24
         adc r19,r25
         mov r24,r18
         mov r25,r19
/* epilogue start */
         ret

I don't know much AVR assembler, so bear with me, but I would expect this
to be written as:
add:
/* prologue: function */
/* frame size = 0 */
         lds r18,y
         lds r19,(y)+1
         lds r24,x
         lds r25,(x)+1
         add r24,r18
         adc r25,r19
/* epilogue start */
         ret

By swapping the add operands we save two mov instructions.

If it can be written like this, any ideas on why GCC is avoiding it?
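
As a sanity check on the claim above, here is a minimal C sketch that models the two listings on four 8-bit registers and confirms that both leave the same 16-bit sum in r25:r24. The struct, the helper names (avr_add, avr_adc, seq_with_moves, seq_without_moves) and the check function are hypothetical and exist only for this illustration; the model covers just the ADD/ADC carry behaviour, not the rest of the AVR status register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the four 8-bit registers used in the listings. */
struct regs { uint8_t r18, r19, r24, r25; };

/* ADD Rd,Rr: Rd = Rd + Rr; returns the carry out of bit 7. */
static unsigned avr_add(uint8_t *rd, uint8_t rr)
{
    unsigned sum = (unsigned)*rd + rr;
    *rd = (uint8_t)sum;
    return sum >> 8;
}

/* ADC Rd,Rr: Rd = Rd + Rr + C; returns the carry out of bit 7. */
static unsigned avr_adc(uint8_t *rd, uint8_t rr, unsigned carry)
{
    unsigned sum = (unsigned)*rd + rr + carry;
    *rd = (uint8_t)sum;
    return sum >> 8;
}

/* The four lds instructions: y goes to r19:r18, x goes to r25:r24. */
static struct regs load(uint16_t x, uint16_t y)
{
    struct regs r;
    r.r18 = (uint8_t)(y & 0xff);
    r.r19 = (uint8_t)(y >> 8);
    r.r24 = (uint8_t)(x & 0xff);
    r.r25 = (uint8_t)(x >> 8);
    return r;
}

/* First listing: add into r19:r18, then copy the result to r25:r24. */
uint16_t seq_with_moves(uint16_t x, uint16_t y)
{
    struct regs r = load(x, y);
    unsigned c = avr_add(&r.r18, r.r24);   /* add r18,r24 */
    avr_adc(&r.r19, r.r25, c);             /* adc r19,r25 */
    r.r24 = r.r18;                         /* mov r24,r18 */
    r.r25 = r.r19;                         /* mov r25,r19 */
    return (uint16_t)(r.r24 | (r.r25 << 8));
}

/* Second listing: add directly into r25:r24, no moves needed. */
uint16_t seq_without_moves(uint16_t x, uint16_t y)
{
    struct regs r = load(x, y);
    unsigned c = avr_add(&r.r24, r.r18);   /* add r24,r18 */
    avr_adc(&r.r25, r.r19, c);             /* adc r25,r19 */
    return (uint16_t)(r.r24 | (r.r25 << 8));
}

/* Compare both sequences over a coarse grid of inputs. */
void check_sequences_agree(void)
{
    for (unsigned x = 0; x <= 0xffff; x += 0x0101)
        for (unsigned y = 0; y <= 0xffff; y += 0x0303)
            assert(seq_with_moves((uint16_t)x, (uint16_t)y)
                   == seq_without_moves((uint16_t)x, (uint16_t)y));
}
```

Since addition is commutative, the only difference between the two sequences is which register pair holds the result, which is why the shorter one needs no final copy.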

Cheers,

--
PMatos

* Re: avr compilation
  2011-03-18  8:48 avr compilation Paulo J. Matos
@ 2011-03-18 10:08 ` WANG.Jiong
  2011-03-18 10:15   ` Paulo J. Matos
  2011-03-18 13:26 ` Georg-Johann Lay
  1 sibling, 1 reply; 9+ messages in thread
From: WANG.Jiong @ 2011-03-18 10:08 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc

This may be related to how regmove handles subregs.
I suggest specifying -fdump-rtl-regmove to see what happens after this pass.
Maybe avr needs a target-dependent regmove pass to handle this.


Best,
Jiong

* Re: avr compilation
  2011-03-18 10:08 ` WANG.Jiong
@ 2011-03-18 10:15   ` Paulo J. Matos
  2011-03-18 12:11     ` David Brown
  0 siblings, 1 reply; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18 10:15 UTC (permalink / raw)
  To: gcc

On 18/03/11 10:08, WANG.Jiong wrote:
> This may be related to how regmove handles subregs.
> I suggest specifying -fdump-rtl-regmove to see what happens after this pass.
> Maybe avr needs a target-dependent regmove pass to handle this.
>

It doesn't look like it's regmove, whose result looks pretty sane:
;; Pred edge  ENTRY [100.0%]  (fallthru)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)

(insn 5 2 6 2 add.c:5 (set (reg:HI 42)
         (mem/c/i:HI (symbol_ref:HI ("y") [flags 0x2] <var_decl 0x7f40954201e0 y>) [2 y+0 S2 A8])) 8 {*movhi} (nil))

(insn 6 5 7 2 add.c:5 (set (reg:HI 44 [ x ])
         (mem/c/i:HI (symbol_ref:HI ("x") [flags 0x2] <var_decl 0x7f4095420140 x>) [2 x+0 S2 A8])) 8 {*movhi} (nil))

(insn 7 6 25 2 add.c:5 (set (reg:HI 42)
         (plus:HI (reg:HI 42)
             (reg:HI 44 [ x ]))) 20 {*addhi3} (expr_list:REG_DEAD (reg:HI 44 [ x ])
         (nil)))

(insn 25 7 26 2 add.c:7 (set (reg:QI 24 r24)
         (subreg:QI (reg:HI 42) 0)) 4 {*movqi} (nil))

(insn 26 25 18 2 add.c:7 (set (reg:QI 25 r25 [+1 ])
         (subreg:QI (reg:HI 42) 1)) 4 {*movqi} (expr_list:REG_DEAD (reg:HI 42)
         (nil)))

(insn 18 26 0 2 add.c:7 (use (reg/i:HI 24 r24)) -1 (nil))
;; End of basic block 2 -> ( 1)


If pseudo registers 42 and 44 are allocated to the proper hard registers,
i.e. 42 is allocated to the return registers, insns 25/26 could be deleted.
However, that's not what happens after register allocation:

;; Pred edge  ENTRY [100.0%]  (fallthru)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)

(insn 5 2 6 2 add.c:5 (set (reg:HI 18 r18 [42])
         (mem/c/i:HI (symbol_ref:HI ("y") [flags 0x2] <var_decl 0x7f40954201e0 y>) [2 y+0 S2 A8])) 8 {*movhi} (nil))

(insn 6 5 7 2 add.c:5 (set (reg:HI 24 r24 [orig:44 x ] [44])
         (mem/c/i:HI (symbol_ref:HI ("x") [flags 0x2] <var_decl 0x7f4095420140 x>) [2 x+0 S2 A8])) 8 {*movhi} (nil))

(insn 7 6 25 2 add.c:5 (set (reg:HI 18 r18 [42])
         (plus:HI (reg:HI 18 r18 [42])
             (reg:HI 24 r24 [orig:44 x ] [44]))) 20 {*addhi3} (nil))

(insn 25 7 26 2 add.c:7 (set (reg:QI 24 r24)
         (reg:QI 18 r18 [42])) 4 {*movqi} (nil))

(insn 26 25 18 2 add.c:7 (set (reg:QI 25 r25 [+1 ])
         (reg:QI 19 r19 [+1 ])) 4 {*movqi} (nil))

(insn 18 26 27 2 add.c:7 (use (reg/i:HI 24 r24)) -1 (nil))
;; End of basic block 2 -> ( 1)

So, I guess it's probably something else...
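
To make insns 25 and 26 in the dumps above easier to read, here is a small C sketch of what the two QImode subreg copies amount to: one 16-bit (HImode) value moved as two separate bytes. The function names are hypothetical, and the sketch assumes the little-endian byte numbering that the dump suggests (byte 0 goes to r24, byte 1 to r25).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for (subreg:QI (reg:HI n) byte):
   byte 0 is the low byte, byte 1 is the high byte. */
uint8_t subreg_qi(uint16_t hi_value, unsigned byte)
{
    assert(byte < 2);   /* an HImode value has exactly two bytes */
    return (uint8_t)(hi_value >> (8 * byte));
}

/* One HImode copy written as the two QImode copies of insns 25 and 26. */
uint16_t copy_as_two_qi(uint16_t pseudo42)
{
    uint8_t r24 = subreg_qi(pseudo42, 0);   /* insn 25 */
    uint8_t r25 = subreg_qi(pseudo42, 1);   /* insn 26 */
    return (uint16_t)(r24 | (r25 << 8));
}
```

The copy is value-preserving either way; the point is only that, once the move is expressed as two independent byte moves, nothing ties pseudo 42 as a whole to the r25:r24 pair.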

* Re: avr compilation
  2011-03-18 10:15   ` Paulo J. Matos
@ 2011-03-18 12:11     ` David Brown
  2011-03-18 13:37       ` Paulo J. Matos
  0 siblings, 1 reply; 9+ messages in thread
From: David Brown @ 2011-03-18 12:11 UTC (permalink / raw)
  To: gcc

On 18/03/2011 11:15, Paulo J. Matos wrote:
> On 18/03/11 10:08, WANG.Jiong wrote:
>> This may be related to how regmove handles subregs.
>> I suggest specifying -fdump-rtl-regmove to see what happens after this pass.
>> Maybe avr needs a target-dependent regmove pass to handle this.
>>
>
> It doesn't look like it's regmove, whose result looks pretty sane:

As far as I can see, you are correct - avr-gcc generates suboptimal code 
here, and your version is better.

There are only a few people who work with the AVR backend, and while 
these people are both clever and dedicated, they are limited in how much 
they can do - correct code generation and support for newer devices or 
features rightly takes priority over optimisation issues.  Thus there 
are a fair number of "missed optimisation" issues filed for the AVR 
backend, and many more cases like this of suboptimal code that don't 
even have issues filed.

There are also a number of patches that are generally applied to avr-gcc 
builds (most of which eventually make it into the main FSF tree).  I 
could not say if any of these are relevant here.

If you are digging through the AVR backend and find ways to improve code 
sequences like this, the avr-gcc community would be very grateful.

There is an avr-gcc mailing list at 
<http://lists.nongnu.org/mailman/listinfo/avr-gcc-list>, which may be of 
interest to you.

* Re: avr compilation
  2011-03-18  8:48 avr compilation Paulo J. Matos
  2011-03-18 10:08 ` WANG.Jiong
@ 2011-03-18 13:26 ` Georg-Johann Lay
  2011-03-18 13:40   ` Paulo J. Matos
  1 sibling, 1 reply; 9+ messages in thread
From: Georg-Johann Lay @ 2011-03-18 13:26 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc

Paulo J. Matos schrieb:
> Hi all,
> 
> I am looking at the avr backend in order to try to sort some things out
> on my own backend.
> 
> One of the tests I am doing is by compiling the following:
> int x = 0x1010;
> int y = 0x0101;
> 
> int add(void)
> {
>   return x+y;
> }
> 
> It compiles to (in gcc-4.3.5_avr with -Os)
> add:
> /* prologue: function */
> /* frame size = 0 */
>         lds r18,y
>         lds r19,(y)+1
>         lds r24,x
>         lds r25,(x)+1
>         add r18,r24
>         adc r19,r25
>         mov r24,r18
>         mov r25,r19
> /* epilogue start */
>         ret
> 
> I don't know much avr assembler so bear with me but I would expect this

Note that the last moves are two QI moves, while the add is HI.

Without splitting HI the moves will disappear; try -fno-split-wide-types.

Johann

> to be written:
> add:
> /* prologue: function */
> /* frame size = 0 */
>         lds r18,y
>         lds r19,(y)+1
>         lds r24,x
>         lds r25,(x)+1
>         add r24,r18
>         adc r25,r19
> /* epilogue start */
>         ret
> 
> By inverting the add arguments we save two mov instructions.
> 
> If it can be written like this any ideas on why GCC is avoiding it?

Try a newer version of gcc, such as 4.5.2.
> 
> Cheers,
> 
> -- 
> PMatos

* Re: avr compilation
  2011-03-18 12:11     ` David Brown
@ 2011-03-18 13:37       ` Paulo J. Matos
  0 siblings, 0 replies; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18 13:37 UTC (permalink / raw)
  To: gcc

On 18/03/11 12:10, David Brown wrote:
> If you are digging through the AVR backend and find ways to improve code
> sequences like this, the avr-gcc community would be very grateful.
>
> There is an avr-gcc mailing list at
> <http://lists.nongnu.org/mailman/listinfo/avr-gcc-list>, which may be of
> interest to you.
>

Thanks for your reply. Just subscribed to avr-gcc through gmane.
Georg-Johann Lay suggests correctly that -fno-split-wide-types helps. 
With this option the code is as I suggested.

I will try 4.5.2 to see if it works out of the box! :)
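
For anyone who wants to repeat the comparison, the test case can be kept in one file, with the two compiler invocations noted in a comment. This is only a sketch: it assumes a cross compiler installed under the name avr-gcc, and the output file names split.s and nosplit.s are made up for the example. The check_add helper is likewise hypothetical and only confirms the expected sum.

```c
#include <assert.h>

/*
 * Test case from the start of the thread.
 *
 * To compare the generated assembly (assuming a cross compiler named
 * avr-gcc; -S stops after producing assembly):
 *
 *   avr-gcc -Os -S add.c -o split.s
 *   avr-gcc -Os -fno-split-wide-types -S add.c -o nosplit.s
 */
int x = 0x1010;
int y = 0x0101;

int add(void)
{
    return x + y;
}

/* Hypothetical helper: the sum should be 0x1111 whichever code is generated. */
void check_add(void)
{
    assert(add() == 0x1111);
}
```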


* Re: avr compilation
  2011-03-18 13:26 ` Georg-Johann Lay
@ 2011-03-18 13:40   ` Paulo J. Matos
  2011-03-18 14:20     ` Ian Lance Taylor
  0 siblings, 1 reply; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18 13:40 UTC (permalink / raw)
  To: gcc

On 18/03/11 13:26, Georg-Johann Lay wrote:
> note that the last moves are two QI moves, the add is HI.
>

Yes, correct; this seems to cause some confusion on the gcc side then... hmm!

> Without splitting HI the moves will disappear, try -fno-split-wide-types.
>

It does work! -fsplit-wide-types is enabled by -O1; maybe it should be
disabled at -Os if disabling it improves the code size consistently.

>
> Try newer version of gcc, like 4.5.2
>

Thanks, I will. Hopefully it will do the optimisation out of the box 
without the need for the extra option.


Cheers,

--
PMatos

* Re: avr compilation
  2011-03-18 13:40   ` Paulo J. Matos
@ 2011-03-18 14:20     ` Ian Lance Taylor
  2011-03-18 14:50       ` Paulo J. Matos
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Lance Taylor @ 2011-03-18 14:20 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc

"Paulo J. Matos" <pocmatos@gmail.com> writes:

>> Without splitting HI the moves will disappear, try -fno-split-wide-types.
>>
>
> It does work! It's enabled by -O1, maybe it should be disabled by -Os
> if it improves the code size consistently.

-fsplit-wide-types is an improvement on most targets, in which ints and
pointers have the size UNITS_PER_WORD.  On AVR that appears not to be
the case, and it seems possible that AVR should set
flag_split_wide_types to 0 in TARGET_OPTION_OPTIMIZATION_TABLE.

Ian
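
The distinction Ian draws can be put in numbers with a small C sketch. It is a rough model, not GCC code: the struct, the function names and the "generic 32-bit" entry are hypothetical, and the AVR figures (UNITS_PER_WORD of 1 and a 2-byte int) are stated as an assumption consistent with the HImode int and QImode moves seen earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a target, for illustration only. */
struct target_model {
    const char *name;
    size_t units_per_word;   /* bytes per machine word (UNITS_PER_WORD) */
    size_t int_size;         /* size of int on that target, in bytes */
};

/* In this sketch a type is "wide" when it does not fit in one word. */
int is_wide(size_t type_size, size_t units_per_word)
{
    assert(units_per_word > 0);
    return type_size > units_per_word;
}

/* Number of word-sized pieces a value of the given size occupies. */
size_t words_needed(size_t type_size, size_t units_per_word)
{
    assert(units_per_word > 0);
    return (type_size + units_per_word - 1) / units_per_word;
}

/* Assumed figures: AVR has 1-byte words and a 2-byte int. */
const struct target_model avr_model    = { "avr", 1, 2 };
/* A typical target where int is exactly one word. */
const struct target_model word32_model = { "generic 32-bit", 4, 4 };
```

Under these assumptions a plain int already counts as a wide type on AVR and is a candidate for splitting, whereas on the 32-bit model only types larger than int would be.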

* Re: avr compilation
  2011-03-18 14:20     ` Ian Lance Taylor
@ 2011-03-18 14:50       ` Paulo J. Matos
  0 siblings, 0 replies; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18 14:50 UTC (permalink / raw)
  To: gcc

On 18/03/11 14:20, Ian Lance Taylor wrote:
>
> -fsplit-wide-types is an improvement on most targets, in which ints and
> pointers have the size UNITS_PER_WORD.  On AVR that appears not to be
> the case, and it seems possible that AVR should set
> flag_split_wide_types to 0 in TARGET_OPTION_OPTIMIZATION_TABLE.

That does explain why disabling split-wide-types improves the code on avr. Thanks.

