* [Bug c/33970] Missed optimization using unsigned char loop variable
2007-11-01 13:44 [Bug c/33970] New: Missed optimization using unsigned char loop variable henning dot m at insightbb dot com
@ 2007-11-01 14:10 ` rguenth at gcc dot gnu dot org
2007-11-01 17:09 ` [Bug middle-end/33970] " pinskia at gcc dot gnu dot org
` (11 subsequent siblings)
12 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-11-01 14:10 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2007-11-01 14:10 -------
I guess you want -fno-tree-loop-ivcanon.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: pinskia at gcc dot gnu dot org @ 2007-11-01 17:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2007-11-01 17:09 -------
I don't see it being promoted on x86-linux-gnu at the tree level.
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|c |middle-end
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: eweddington at cso dot atmel dot com @ 2007-11-01 17:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from eweddington at cso dot atmel dot com 2007-11-01 17:28 -------
Created an attachment (id=14454)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14454&action=view)
Preprocessed testcase.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: eweddington at cso dot atmel dot com @ 2007-11-01 17:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from eweddington at cso dot atmel dot com 2007-11-01 17:45 -------
Created an attachment (id=14455)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14455&action=view)
Assembly output of test case using 4.1.2.
Maybe I'm wrong, but I don't even see it being promoted on the AVR. See
attached assembly output from 4.1.2 with -dP.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: eweddington at cso dot atmel dot com @ 2007-11-01 17:47 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from eweddington at cso dot atmel dot com 2007-11-01 17:47 -------
Mike, can you provide additional information as to where the bug is?
--
eweddington at cso dot atmel dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |WAITING
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: henning dot m at insightbb dot com @ 2007-11-01 21:26 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from henning dot m at insightbb dot com 2007-11-01 21:26 -------
(In reply to comment #5)
> Mike, can you provide additional information as to where the bug is?
>
This is the assembly output I get:
Note that r14,r15 are being reserved for variable x when only a single reg is
needed. Strangely enough, if I call sub2 with (x+1), then only one reg is used
and the code shrinks by 12 bytes.
000000e8 <main>:
e8: ef 92 push r14
ea: ff 92 push r15
ec: 1f 93 push r17
ee: cf 93 push r28
f0: df 93 push r29
f2: cd b7 in r28, 0x3d ; 61
f4: de b7 in r29, 0x3e ; 62
f6: 21 97 sbiw r28, 0x01 ; 1
f8: 0f b6 in r0, 0x3f ; 63
fa: f8 94 cli
fc: de bf out 0x3e, r29 ; 62
fe: 0f be out 0x3f, r0 ; 63
100: cd bf out 0x3d, r28 ; 61
102: ee 24 eor r14, r14
104: ff 24 eor r15, r15
106: 19 81 ldd r17, Y+1 ; 0x01
108: 8e 2d mov r24, r14
10a: 0e 94 57 00 call 0xae ; 0xae <sub2>
10e: 18 0f add r17, r24
110: 19 83 std Y+1, r17 ; 0x01
112: 08 94 sec
114: e1 1c adc r14, r1
116: f1 1c adc r15, r1
118: 80 e8 ldi r24, 0x80 ; 128
11a: e8 16 cp r14, r24
11c: f1 04 cpc r15, r1
11e: 99 f7 brne .-26 ; 0x106 <main+0x1e>
120: f0 cf rjmp .-32 ; 0x102 <main+0x1a>
00000122 <_exit>:
122: ff cf rjmp .-2 ; 0x122 <_exit>
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: eweddington at cso dot atmel dot com @ 2007-11-04 23:28 UTC (permalink / raw)
To: gcc-bugs
--
eweddington at cso dot atmel dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |NEW
Ever Confirmed|0 |1
Last reconfirmed|2007-11-04 23:28:15 |2007-11-04 23:28:50
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: eweddington at cso dot atmel dot com @ 2007-11-04 23:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from eweddington at cso dot atmel dot com 2007-11-04 23:28 -------
With Mike's description in comment #6, confirmed on 4.1.2 and 4.2.2. AVR GCC
4.2.2 is worse than 4.1.2, in that even if sub2 is called with (x+1), the
variable is still 16 bits.
--
eweddington at cso dot atmel dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail| |4.1.2 4.2.2
Last reconfirmed|0000-00-00 00:00:00 |2007-11-04 23:28:15
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: wvangulik at xs4all dot nl @ 2007-11-05 22:48 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from wvangulik at xs4all dot nl 2007-11-05 22:48 -------
(In reply to comment #7)
> With Mike's description in comment #6, confirmed on 4.1.2 and 4.2.2. AVR GCC
> 4.2.2 is worse than 4.1.2, in that even if sub2 is called with (x+1), the
> variable is still 16 bits.
>
There is something more going on. This is the assembler output when sub2 is not
in the same file and the call is sub2(x), i.e. not x+1:
===================================================================
push r17
push r28
push r29
in r28,__SP_L__
in r29,__SP_H__
sbiw r28,1
in __tmp_reg__,__SREG__
cli
out __SP_H__,r29
out __SREG__,__tmp_reg__
out __SP_L__,r28
/* prologue end (size=11) */
.L3:
sts x.1261,__zero_reg__
ldi r24,lo8(0)
.L4:
ldd r17,Y+1
call sub2
add r17,r24
std Y+1,r17
lds r24,x.1261
subi r24,lo8(-(1))
sts x.1261,r24
sbrs r24,7
rjmp .L4
rjmp .L3
===========================================================
Note that x is now living in memory (as one would expect when it's static in a
function). However, it is not in memory in the original report. Is this OK?
The assembler in the original post places x in r14,r15. Does this still adhere
to the static keyword? I don't know; since its value is always set at the start,
what's the point of being static? Unless main is called twice? But then x may
not be in r14/r15!
So it seems to me this is some inlining optimization problem.
Note that moving the sub2 function above main also gives other results?!? Is
this standard behaviour?
Also note that making sub2 static removes it completely, but it also removes
x as a static. Is this allowed?
Tested against avr-gcc 4.1.2
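On the question of whether a register is compatible with "static": under the
C as-if rule a compiler may keep such a variable in a register as long as the
observable behaviour does not change. The following is only a host-side sketch
of that argument, not the original test case. It assumes x is assigned before
every read and its address never escapes; the function names sum_with_static
and sum_with_auto are hypothetical.

```c
/* Sketch: a function-local static that is always written before it is
   read behaves exactly like an automatic variable, so keeping it in a
   register does not change what the program does. */

/* x has static storage duration, but every call resets it to 0 first. */
unsigned int sum_with_static(void) {
    static unsigned char x;
    unsigned int sum = 0;
    for (x = 0; x < 128; x++)
        sum += x;
    return sum;
}

/* Same loop with an automatic x. */
unsigned int sum_with_auto(void) {
    unsigned char x;
    unsigned int sum = 0;
    for (x = 0; x < 128; x++)
        sum += x;
    return sum;
}
```

Calling either function any number of times returns the same value, which is
why the stale contents of the static x between calls cannot be observed here.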
--
wvangulik at xs4all dot nl changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |wvangulik at xs4all dot nl
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: henning dot m at insightbb dot com @ 2007-11-06 12:37 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from henning dot m at insightbb dot com 2007-11-06 12:37 -------
(In reply to comment #8)
> (In reply to comment #7)
> > With Mike's description in comment #6, confirmed on 4.1.2 and 4.2.2. AVR GCC
> > 4.2.2 is worse than 4.1.2, in that even if sub2 is called with (x+1), the
> > variable is still 16 bits.
> >
>
> There is something more going on, this is the assembler output when sub2 is not
> in the same file, and calling sub2(x), i.e not x+1:
> ===================================================================
I think you will also find that if x is changed from static to auto the same
problem appears.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: wvangulik at xs4all dot nl @ 2007-11-06 19:35 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from wvangulik at xs4all dot nl 2007-11-06 19:34 -------
(In reply to comment #9)
>
> I think you will also find that if x is changed from static to auto the same
> problem appears.
>
OK, I tried to find the minimum test case.
It has nothing to do with static/volatile/inline etc.
==============================================
int sub2(unsigned char); // external function
void foo(void) {
    unsigned char x;
    for (x = 0; x < 128; x++)
    {
        //sub2(x);  // x becomes an int (16 bit)
        sub2(x+1);  // x stays a char (8 bit)
    }
}
All is compiled using 4.1.2 and "avr-gcc -Wall -Os -mmcu=avr5 -S test.c"
The output when compiling with "sub2(x)"
=================================================
/* prologue: frame size=0 */
push r28
push r29
/* prologue end (size=2) */
ldi r28,lo8(0)
ldi r29,hi8(0)
.L2:
mov r24,r28 ; << loads as a byte!
call sub2
adiw r28,1 ; << increments as an int
cpi r28,128 ; << compares as an int
cpc r29,__zero_reg__
brne .L2
/* epilogue: frame size=0 */
pop r29
pop r28
ret
The output when compiling with "sub2(x+1)"
================================================================
/* prologue: frame size=0 */
push r17
/* prologue end (size=1) */
ldi r17,lo8(0)
.L2:
subi r17,lo8(-(1))
mov r24,r17
call sub2
cpi r17,lo8(-128)
brne .L2
/* epilogue: frame size=0 */
pop r17
ret
===========================================================
From compiling with x as an int I got a sort of hint. When using x+1, x is
actually loaded with 1 (and not zero).
So I tried:
======================================================
void foo(void) {
    unsigned char x;
    for (x = 1; x < 129; x++)
        sub2(x);
}
And it gives the same assembler loop check as the "sub2(x+1)" variant; of
course, the loop is now effectively the same as "sub2(x+1)". But the
incrementing of x is now done after the call, so it proves that this is not the
problem. I also tried different loop lengths, thinking 128 was the nasty one,
but it did not help.
======================================================
/* prologue: frame size=0 */
push r17
/* prologue end (size=1) */
ldi r17,lo8(1)
.L2:
mov r24,r17
call sub2
subi r17,lo8(-(1))
cpi r17,lo8(-127)
brne .L2
/* epilogue: frame size=0 */
pop r17
ret
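A note on the immediates in the listings above: AVR has no 8-bit add-immediate
instruction, so the increment shows up as "subi r17,lo8(-(1))", and the loop
bounds 128 and 129 show up as "lo8(-128)" and "lo8(-127)". The following C
sketch mimics that arithmetic on a host machine. The helpers lo8 and subi are
hypothetical stand-ins for the assembler operator and the instruction, written
only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the assembler's lo8() operator: keep only
   the low 8 bits of a constant (conversion to uint8_t is modulo 256). */
static uint8_t lo8(int value) {
    return (uint8_t)value;
}

/* Hypothetical model of "subi reg, imm": 8-bit subtraction that wraps
   around modulo 256, as an AVR register does. */
static uint8_t subi(uint8_t reg, uint8_t imm) {
    return (uint8_t)(reg - imm);
}
```

With these, lo8(-128) is 128 and lo8(-127) is 129, matching the two loop
bounds, and subi(r, lo8(-(1))) yields r + 1 modulo 256.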
I am out of knowledge now; I do not know how to debug GCC any further on this.
Just hoping this will help someone to pinpoint the problem.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: wvangulik at xs4all dot nl @ 2007-11-06 21:01 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from wvangulik at xs4all dot nl 2007-11-06 21:01 -------
I just realised I did not try hard enough to find the smallest case:
===========================================
volatile unsigned char bar;
void foo(void) {
    unsigned char x;
    for (x = 0; x < 128; x++) {
        //bar = x+1;
        bar = x;
    }
}
Looks like using the loop counter inside the loop causes the problem, because
when not using x, but e.g. just calling a function, it also gives the smallest
loop.
Looking at the RTL output of test.c.00.expand, it is already QI vs. HI
there... so it's going wrong somewhere inside GCC that I really have no
knowledge of...
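For readers wondering why the wider counter is still correct code: the bound
128 fits in 8 bits, so x never wraps, and a counter of int width stores exactly
the same sequence of values into bar. It only costs an extra register and wider
compare on an 8-bit target. The sketch below checks that equivalence on a host
machine; it does not reproduce the AVR code generation, and the names
run_narrow, run_wide and loops_agree are hypothetical.

```c
#include <stddef.h>

#define LOG_SIZE 128

volatile unsigned char bar; /* as in the test case above */

/* The loop as written: 8-bit counter. Records every value stored to bar. */
static size_t run_narrow(unsigned char *log) {
    size_t n = 0;
    unsigned char x;
    for (x = 0; x < 128; x++) {
        bar = x;
        log[n++] = bar;
    }
    return n;
}

/* The same loop with the counter widened to unsigned int and truncated
   at the point of use, as in the 16-bit assembly shown earlier. */
static size_t run_wide(unsigned char *log) {
    size_t n = 0;
    unsigned int x;
    for (x = 0; x < 128; x++) {
        bar = (unsigned char)x;
        log[n++] = bar;
    }
    return n;
}

/* Returns 1 if both loops store the same sequence of values, else 0. */
int loops_agree(void) {
    unsigned char a[LOG_SIZE], b[LOG_SIZE];
    size_t na = run_narrow(a);
    size_t nb = run_wide(b);
    size_t i;
    if (na != nb)
        return 0;
    for (i = 0; i < na; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

So the complaint in this report is about code size and register use, not about
wrong results.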
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970
* [Bug middle-end/33970] Missed optimization using unsigned char loop variable
From: abnikant dot singh at atmel dot com @ 2010-09-13 12:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from abnikant dot singh at atmel dot com 2010-09-13 12:09 -------
I have verified the attached test case and the test cases from the other
comments and found that the generated code is correct, i.e. the variable is not
promoted to integer in gcc-4.3.3, gcc-4.4.3, gcc-4.5.0 and also the latest head.
The assembly for the following piece of code:
int sub2(unsigned char); // external function
void foo(void) {
    unsigned char x;
    for (x = 0; x < 128; x++)
    {
        sub2(x);       // x becomes an int (16 bit)
        // sub2(x+1);  // x is char (8 bit)
    }
}
in gcc-4.3.3 is:
foo:
push r17
/* prologue: function */
/* frame size = 0 */
ldi r17,lo8(0)
.L2:
mov r24,r17
rcall sub2
subi r17,lo8(-(1))
cpi r17,lo8(-128)
brne .L2
/* epilogue start */
pop r17
ret
.size foo, .-foo
--
abnikant dot singh at atmel dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |abnikant dot singh at atmel
| |dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33970