* avr compilation
@ 2011-03-18 8:48 Paulo J. Matos
2011-03-18 10:08 ` WANG.Jiong
2011-03-18 13:26 ` Georg-Johann Lay
0 siblings, 2 replies; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18 8:48 UTC (permalink / raw)
To: gcc
Hi all,
I am looking at the avr backend in order to try to sort some things out
on my own backend.
One of the tests I am doing is by compiling the following:
int x = 0x1010;
int y = 0x0101;
int add(void)
{
return x+y;
}
It compiles to (in gcc-4.3.5_avr with -Os)
add:
/* prologue: function */
/* frame size = 0 */
lds r18,y
lds r19,(y)+1
lds r24,x
lds r25,(x)+1
add r18,r24
adc r19,r25
mov r24,r18
mov r25,r19
/* epilogue start */
ret
I don't know much avr assembler so bear with me but I would expect this
to be written:
add:
/* prologue: function */
/* frame size = 0 */
lds r18,y
lds r19,(y)+1
lds r24,x
lds r25,(x)+1
add r24,r18
adc r25,r19
/* epilogue start */
ret
By inverting the add arguments we save two mov instructions.
If it can be written like this any ideas on why GCC is avoiding it?
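To see why swapping the operands should be safe, here is a minimal C sketch that models the add/adc pair on two register pairs. The names regpair, add16, pair_value and check_operand_order are made up for illustration and are not part of avr-gcc; the only point is that the 16-bit sum is the same whichever pair is used as the destination, so the result can be formed directly in r24/r25.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an AVR register pair such as r25:r24. */
typedef struct {
    uint8_t lo;  /* even register, low byte */
    uint8_t hi;  /* odd register, high byte */
} regpair;

/* Models "add d.lo,r.lo" followed by "adc d.hi,r.hi".
   The sum is returned as the new contents of the destination pair d. */
static regpair add16(regpair d, regpair r)
{
    unsigned lo = (unsigned)d.lo + r.lo;          /* add: may produce a carry */
    unsigned carry = lo >> 8;                     /* carry flag after the add */
    unsigned hi = (unsigned)d.hi + r.hi + carry;  /* adc: adds the carry in */
    regpair out = { (uint8_t)lo, (uint8_t)hi };
    return out;
}

/* Reassembles the 16-bit value held in a register pair. */
static uint16_t pair_value(regpair p)
{
    return (uint16_t)(p.lo | ((unsigned)p.hi << 8));
}

/* Checks that both operand orders give the same 16-bit sum. */
static void check_operand_order(void)
{
    regpair x = { 0x10, 0x10 };  /* x = 0x1010 */
    regpair y = { 0x01, 0x01 };  /* y = 0x0101 */
    assert(pair_value(add16(y, x)) == pair_value(add16(x, y)));
}
```

The only difference between the two orders is which pair holds the result afterwards, which is exactly what decides whether the two mov instructions are needed.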
Cheers,
--
PMatos
* Re: avr compilation
2011-03-18 8:48 avr compilation Paulo J. Matos
@ 2011-03-18 10:08 ` WANG.Jiong
2011-03-18 10:15 ` Paulo J. Matos
2011-03-18 13:26 ` Georg-Johann Lay
1 sibling, 1 reply; 9+ messages in thread
From: WANG.Jiong @ 2011-03-18 10:08 UTC (permalink / raw)
To: Paulo J. Matos; +Cc: gcc
This may be related to how regmove handles subregs.
I suggest specifying -fdump-rtl-regmove to see what happens after this pass.
Maybe avr needs a target-dependent regmove pass to handle this.
Best,
Jiong
* Re: avr compilation
2011-03-18 10:08 ` WANG.Jiong
@ 2011-03-18 10:15 ` Paulo J. Matos
2011-03-18 12:11 ` David Brown
0 siblings, 1 reply; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18 10:15 UTC (permalink / raw)
To: gcc
On 18/03/11 10:08, WANG.Jiong wrote:
> This may be related to how regmove handles subregs.
> I suggest specifying -fdump-rtl-regmove to see what happens after this pass.
> Maybe avr needs a target-dependent regmove pass to handle this.
>
It doesn't look like it's regmove, whose result looks pretty sane:
;; Pred edge ENTRY [100.0%] (fallthru)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 add.c:5 (set (reg:HI 42)
(mem/c/i:HI (symbol_ref:HI ("y") [flags 0x2] <var_decl
0x7f40954201e0 y>) [2 y+0 S2 A8])) 8 {*movhi} (nil))
(insn 6 5 7 2 add.c:5 (set (reg:HI 44 [ x ])
(mem/c/i:HI (symbol_ref:HI ("x") [flags 0x2] <var_decl
0x7f4095420140 x>) [2 x+0 S2 A8])) 8 {*movhi} (nil))
(insn 7 6 25 2 add.c:5 (set (reg:HI 42)
(plus:HI (reg:HI 42)
(reg:HI 44 [ x ]))) 20 {*addhi3} (expr_list:REG_DEAD
(reg:HI 44 [ x ])
(nil)))
(insn 25 7 26 2 add.c:7 (set (reg:QI 24 r24)
(subreg:QI (reg:HI 42) 0)) 4 {*movqi} (nil))
(insn 26 25 18 2 add.c:7 (set (reg:QI 25 r25 [+1 ])
(subreg:QI (reg:HI 42) 1)) 4 {*movqi} (expr_list:REG_DEAD
(reg:HI 42)
(nil)))
(insn 18 26 0 2 add.c:7 (use (reg/i:HI 24 r24)) -1 (nil))
;; End of basic block 2 -> ( 1)
If pseudo-registers 42 and 44 were allocated to the proper hard registers,
i.e. if 42 were allocated to the return registers r24/r25, insns 25 and 26
could be deleted. However, that's not what happens after register allocation:
;; Pred edge ENTRY [100.0%] (fallthru)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 add.c:5 (set (reg:HI 18 r18 [42])
(mem/c/i:HI (symbol_ref:HI ("y") [flags 0x2] <var_decl
0x7f40954201e0 y>) [2 y+0 S2 A8])) 8 {*movhi} (nil))
(insn 6 5 7 2 add.c:5 (set (reg:HI 24 r24 [orig:44 x ] [44])
(mem/c/i:HI (symbol_ref:HI ("x") [flags 0x2] <var_decl
0x7f4095420140 x>) [2 x+0 S2 A8])) 8 {*movhi} (nil))
(insn 7 6 25 2 add.c:5 (set (reg:HI 18 r18 [42])
(plus:HI (reg:HI 18 r18 [42])
(reg:HI 24 r24 [orig:44 x ] [44]))) 20 {*addhi3} (nil))
(insn 25 7 26 2 add.c:7 (set (reg:QI 24 r24)
(reg:QI 18 r18 [42])) 4 {*movqi} (nil))
(insn 26 25 18 2 add.c:7 (set (reg:QI 25 r25 [+1 ])
(reg:QI 19 r19 [+1 ])) 4 {*movqi} (nil))
(insn 18 26 27 2 add.c:7 (use (reg/i:HI 24 r24)) -1 (nil))
;; End of basic block 2 -> ( 1)
So, I guess it's probably something else...
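For readers not used to RTL, insns 25 and 26 copy the two bytes of the HI pseudo 42 into r24 and r25 one at a time. A minimal C sketch of what (subreg:QI (reg:HI 42) N) selects, assuming AVR's little-endian byte order; subreg_qi is a made-up helper name, not a GCC function:

```c
#include <assert.h>
#include <stdint.h>

/* Models (subreg:QI (reg:HI n) byte) on a little-endian target such as AVR:
   byte 0 is the low half of the 16-bit value, byte 1 is the high half. */
static uint8_t subreg_qi(uint16_t hi_reg, unsigned byte)
{
    assert(byte < 2);  /* an HI value only has two QI pieces */
    return (uint8_t)(hi_reg >> (8 * byte));
}
```

With a 16-bit value of 0x1234 in the pseudo, byte 0 gives 0x34 (the value moved to r24 in insn 25) and byte 1 gives 0x12 (the value moved to r25 in insn 26).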
* Re: avr compilation
2011-03-18 10:15 ` Paulo J. Matos
@ 2011-03-18 12:11 ` David Brown
2011-03-18 13:37 ` Paulo J. Matos
0 siblings, 1 reply; 9+ messages in thread
From: David Brown @ 2011-03-18 12:11 UTC (permalink / raw)
To: gcc
On 18/03/2011 11:15, Paulo J. Matos wrote:
> On 18/03/11 10:08, WANG.Jiong wrote:
>> This may be related to how regmove handles subregs.
>> I suggest specifying -fdump-rtl-regmove to see what happens after this pass.
>> Maybe avr needs a target-dependent regmove pass to handle this.
>>
>
> It doesn't look like it's regmove, whose result looks pretty sane:
As far as I can see, you are correct - avr-gcc generates suboptimal code
here, and your version is better.
There are only a few people who work with the AVR backend, and while
these people are both clever and dedicated, they are limited in how much
they can do - correct code generation and support for newer devices or
features rightly takes priority over optimisation issues. Thus there
are a fair number of "missed optimisation" issues filed for the AVR
backend, and many more cases like this of suboptimal code that don't
even have issues filed.
There are also a number of patches that are generally applied to avr-gcc
builds (most of which eventually make it into the main FSF tree). I
could not say if any of these are relevant here.
If you are digging through the AVR backend and find ways to improve code
sequences like this, the avr-gcc community would be very grateful.
There is an avr-gcc mailing list at
<http://lists.nongnu.org/mailman/listinfo/avr-gcc-list>, which may be of
interest to you.
* Re: avr compilation
2011-03-18 8:48 avr compilation Paulo J. Matos
2011-03-18 10:08 ` WANG.Jiong
@ 2011-03-18 13:26 ` Georg-Johann Lay
2011-03-18 13:40 ` Paulo J. Matos
1 sibling, 1 reply; 9+ messages in thread
From: Georg-Johann Lay @ 2011-03-18 13:26 UTC (permalink / raw)
To: Paulo J. Matos; +Cc: gcc
Paulo J. Matos schrieb:
> Hi all,
>
> I am looking at the avr backend in order to try to sort some things out
> on my own backend.
>
> One of the tests I am doing is by compiling the following:
> int x = 0x1010;
> int y = 0x0101;
>
> int add(void)
> {
> return x+y;
> }
>
> It compiles to (in gcc-4.3.5_avr with -Os)
> add:
> /* prologue: function */
> /* frame size = 0 */
> lds r18,y
> lds r19,(y)+1
> lds r24,x
> lds r25,(x)+1
> add r18,r24
> adc r19,r25
> mov r24,r18
> mov r25,r19
> /* epilogue start */
> ret
>
Note that the last two moves are QI moves, while the add is HI.
Without splitting HI the moves will disappear; try -fno-split-wide-types.
Johann
> I don't know much avr assembler so bear with me but I would expect this
> to be written:
> add:
> /* prologue: function */
> /* frame size = 0 */
> lds r18,y
> lds r19,(y)+1
> lds r24,x
> lds r25,(x)+1
> add r24,r18
> adc r25,r19
> /* epilogue start */
> ret
>
> By inverting the add arguments we save two mov instructions.
>
> If it can be written like this any ideas on why GCC is avoiding it?
Try a newer version of GCC, such as 4.5.2.
>
> Cheers,
>
> --
> PMatos
* Re: avr compilation
2011-03-18 13:26 ` Georg-Johann Lay
@ 2011-03-18 13:40 ` Paulo J. Matos
2011-03-18 14:20 ` Ian Lance Taylor
0 siblings, 1 reply; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18 13:40 UTC (permalink / raw)
To: gcc
On 18/03/11 13:26, Georg-Johann Lay wrote:
> note that the last moves are two QI moves, the add is HI.
>
Yes, correct. This seems to cause some confusion on the GCC side then... hmm!
> Without splitting HI the moves will disappear, try -fno-split-wide-types.
>
It does work! -fsplit-wide-types is enabled from -O1 upwards; maybe it
should be disabled at -Os on avr if turning it off consistently reduces
code size.
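For experimenting on a single function rather than the whole translation unit, a possible sketch is GCC's optimize function attribute. This assumes a GCC release that supports the attribute (so not the 4.3.5 used above), and whether it actually removes the two moves will depend on the compiler version; check_add is just a made-up sanity check that the result does not change.

```c
#include <assert.h>

int x = 0x1010;
int y = 0x0101;

/* Assumption: the compiler supports the optimize attribute. The string is
   treated like -fno-split-wide-types, applied to this function only. */
__attribute__((optimize("no-split-wide-types")))
int add(void)
{
    return x + y;
}

/* The option only affects code generation, never the computed value. */
static void check_add(void)
{
    assert(add() == 0x1111);
}
```

Passing -fno-split-wide-types on the command line, as suggested above, remains the simpler way to compare the generated assembly.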
>
> Try newer version of gcc, like 4.5.2
>
Thanks, I will. Hopefully it will do the optimisation out of the box
without the need for the extra option.
Cheers,
--
PMatos
* Re: avr compilation
2011-03-18 13:40 ` Paulo J. Matos
@ 2011-03-18 14:20 ` Ian Lance Taylor
2011-03-18 14:50 ` Paulo J. Matos
0 siblings, 1 reply; 9+ messages in thread
From: Ian Lance Taylor @ 2011-03-18 14:20 UTC (permalink / raw)
To: Paulo J. Matos; +Cc: gcc
"Paulo J. Matos" <pocmatos@gmail.com> writes:
>> Without splitting HI the moves will disappear, try -fno-split-wide-types.
>>
>
> It does work! It's enabled by -O1, maybe it should be disabled by -Os
> if it improves the code size consistently.
-fsplit-wide-types is an improvement on most targets, in which ints and
pointers have the size UNITS_PER_WORD. On AVR that appears not to be
the case, and it seems possible that AVR should set
flag_split_wide_types to 0 in TARGET_OPTION_OPTIMIZATION_TABLE.
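As a rough illustration of the size argument, here is a small C sketch that counts how many word-sized pieces an int occupies. The struct and function names are hypothetical and only model the idea; the figures used are AVR's 1-byte word with a 2-byte int, against a typical 32-bit target with a 4-byte word and a 4-byte int.

```c
#include <assert.h>

/* Hypothetical description of a target, for illustration only. */
struct target {
    unsigned units_per_word;  /* bytes in one register-sized word */
    unsigned int_size;        /* bytes in an int */
};

/* Number of word-sized pieces an int occupies. More than one piece means
   the value counts as "wide" and is a candidate for splitting. */
static unsigned int_word_pieces(struct target t)
{
    assert(t.units_per_word > 0);
    return (t.int_size + t.units_per_word - 1) / t.units_per_word;
}
```

With { 1, 2 } for AVR the function returns 2, so even a plain int is split into two QI halves; with { 4, 4 } it returns 1 and an int is never touched by the splitting.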
Ian
* Re: avr compilation
2011-03-18 14:20 ` Ian Lance Taylor
@ 2011-03-18 14:50 ` Paulo J. Matos
0 siblings, 0 replies; 9+ messages in thread
From: Paulo J. Matos @ 2011-03-18 14:50 UTC (permalink / raw)
To: gcc
On 18/03/11 14:20, Ian Lance Taylor wrote:
>
> -fsplit-wide-types is an improvement on most targets, in which ints and
> pointers have the size UNITS_PER_WORD. On AVR that appears not to be
> the case, and it seems possible that AVR should set
> flag_split_wide_types to 0 in TARGET_OPTION_OPTIMIZATION_TABLE.
That does explain why disabling split-wide-types improves the code on avr. Thanks.