public inbox for gcc-bugs@sourceware.org
* [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
@ 2004-11-11 2:35 schlie at comcast dot net
2004-11-11 2:48 ` [Bug middle-end/18424] " pinskia at gcc dot gnu dot org
` (31 more replies)
0 siblings, 32 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-11 2:35 UTC (permalink / raw)
To: gcc-bugs
3.4.3 generates code which may be ~6x+ slower and larger than 3.3.1 @ -Os
as it apparently no longer evaluates constant expression trees in anything
other than simple expressions for some reason, which may result in serious
performance and code size regressions, which should really be fixed if possible.
// 000000c6 <foo>:
int foo ( int a ){
if (a & (1L << 23))
// c6: aa 27 eor r26, r26
// c8: 97 fd sbrc r25, 7
// ca: a0 95 com r26
// cc: ba 2f mov r27, r26
// ce: 27 e1 ldi r18, 0x17 ; 23
// d0: b6 95 lsr r27
// d2: a7 95 ror r26
// d4: 97 95 ror r25
// d6: 87 95 ror r24
// d8: 2a 95 dec r18
// da: d1 f7 brne .-12 ; 0xd0
// dc: 81 70 andi r24, 0x01 ; 1
// de: 90 70 andi r25, 0x00 ; 0
// e0: 89 2b or r24, r25
// e2: 19 f0 breq .+6 ; 0xea
return 1;
// e4: 81 e0 ldi r24, 0x01 ; 1
// e6: 90 e0 ldi r25, 0x00 ; 0
// e8: 08 95 ret
else
return 2 ;
// ea: 82 e0 ldi r24, 0x02 ; 2
// ec: 90 e0 ldi r25, 0x00 ; 0
}
// ee: 08 95 ret
// f0: 08 95 ret
// where the second return is odd as well?
vs GCC 3.3.1 @ -Os
// 000000c6 <foo>:
int foo2 ( int a ){
if (a & (1L << 23))
return 1;
else
return 2 ;
}
// c6: 82 e0 ldi r24, 0x02 ; 2
// c8: 90 e0 ldi r25, 0x00 ; 0
// ca: 08 95 ret
(where for reference int targeted to avr is 16-bits wide,
but the above problem is not likely target sensitive.)
--
Summary: 3.4.3 ~6x+ performance regression vs 3.3.1, constant
trees not being computed.
Product: gcc
Version: 3.4.3
Status: UNCONFIRMED
Severity: critical
Priority: P1
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: schlie at comcast dot net
CC: dmixm at marine dot febras dot ru,ericw at evcohs dot
com,gcc-bugs at gcc dot gnu dot org
GCC build triplet: any
GCC host triplet: any
GCC target triplet: avr-unknown-none
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
@ 2004-11-11 2:48 ` pinskia at gcc dot gnu dot org
2004-11-11 2:49 ` pinskia at gcc dot gnu dot org
` (30 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-11 2:48 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11 02:48 -------
Here is an example for PPC:
int foo2 ( short a ){
if (a & (1 << 23))
return 1;
else
return 2 ;
}
but it is not a regression on PPC with the above example
--
What |Removed |Added
----------------------------------------------------------------------------
Severity|critical |minor
Component|c |middle-end
Keywords| |missed-optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
2004-11-11 2:48 ` [Bug middle-end/18424] " pinskia at gcc dot gnu dot org
@ 2004-11-11 2:49 ` pinskia at gcc dot gnu dot org
2004-11-11 2:52 ` pinskia at gcc dot gnu dot org
` (29 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-11 2:49 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11 02:49 -------
I almost think the size of long changed for 3.4.0 for avr to 32bits.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
2004-11-11 2:48 ` [Bug middle-end/18424] " pinskia at gcc dot gnu dot org
2004-11-11 2:49 ` pinskia at gcc dot gnu dot org
@ 2004-11-11 2:52 ` pinskia at gcc dot gnu dot org
2004-11-11 3:15 ` schlie at comcast dot net
` (28 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-11 2:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11 02:52 -------
or the default for -mint8 changed.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (2 preceding siblings ...)
2004-11-11 2:52 ` pinskia at gcc dot gnu dot org
@ 2004-11-11 3:15 ` schlie at comcast dot net
2004-11-11 3:18 ` pinskia at gcc dot gnu dot org
` (27 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-11 3:15 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-11-11 03:15 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs
3.3.1, constant trees not being computed.
Yes but regardless of the size of the constant 1:
if (a & (1L << 24)) or (a & (1 << 24)) yields the same results,
and further, if a simpler expression containing the same constant
sub-expression is used, it properly computes the constant expression:
int foo (int a) {
a = a + (1 << 24); // apparently simple enough to compute (1 << 24)
if (a & (1 << 24)) // then utilized here, which yields 0
return 1;
else
return 2;
}
=> return 2;
Which implies that it is not a back-end issue, but that (1 << 24)
is not being calculated as a constant expression if nested within
a more complex expression for some reason, I believe?
although the PPC may have optimized it away in other ways as it has
a more powerful shift instruction, but should not be a back-end thing,
as the embedded constant expression was properly computed and applied
in 3.3.x.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (3 preceding siblings ...)
2004-11-11 3:15 ` schlie at comcast dot net
@ 2004-11-11 3:18 ` pinskia at gcc dot gnu dot org
2004-11-11 3:56 ` schlie at comcast dot net
` (26 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-11 3:18 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11 03:17 -------
When I did 1 << 24 I got a warning (at least on the mainline on a cross to avr) about 24 being greater
than the size of int, so it was going to be 0. Again, in real terms there is something going on here, but I really
doubt it is a regression; are you sure you did not use -mint8 for the 3.3 build and not for the 3.4 build?
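For illustration, the warning above comes down to comparing the shift count with the width of the left operand. The following is a minimal sketch of that comparison, assuming a 16-bit int and a 32-bit long as on avr without -mint8; `shift_count_ok` is a hypothetical helper written for this note and is not a GCC function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when `count` is a valid shift count for an
   operand that is `width_bits` wide, 0 otherwise. In C, shifting by a count
   greater than or equal to the width of the promoted left operand is
   undefined, which is what the "left shift count >= width of type"
   warning reports. */
int shift_count_ok(unsigned count, size_t width_bits)
{
    return count < width_bits;
}

/* Self-check using the assumed avr widths: int = 16 bits, long = 32 bits. */
void shift_count_self_check(void)
{
    assert(!shift_count_ok(24, 16)); /* 1  << 24 with a 16-bit int: warned  */
    assert(shift_count_ok(24, 32));  /* 1L << 24 with a 32-bit long: valid  */
    assert(shift_count_ok(23, 32));  /* 1L << 23 with a 32-bit long: valid  */
}
```

Under these assumptions only the `1 << 24` spelling is out of range, which is consistent with the warning appearing for `(1 << 24)` and not for `(1L << 24)`.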
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (4 preceding siblings ...)
2004-11-11 3:18 ` pinskia at gcc dot gnu dot org
@ 2004-11-11 3:56 ` schlie at comcast dot net
2004-11-11 4:33 ` schlie at comcast dot net
` (25 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-11 3:56 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-11-11 03:55 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs
3.3.1, constant trees not being computed.
> pinskia at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote:
> When I did 1 << 24 I got a warning (at least on the mainline on a cross to
> avr) about 24 being greater
> than the size of int so it was going to be 0. Again in real terms there is
> something on here but I really
> doubt it is a regression and you did not use -mint8 for the 3.3 build and not
> for the 3.4 build.
Yes, you're correct, (1 << 24) generates warnings: (not forcing -mint8)
main.c: In function `foo1':
main.c:3: warning: left shift count >= width of type
main.c: In function `foo2':
main.c:12: warning: left shift count >= width of type
main.c: At top level:
main.c:21: warning: return type of 'main' is not `int'
and produces the anticipated correct code, sorry for the confusion.
---
However, as reported, with (1L << 24) it does not, nor should it yield
different results; it appears to produce the expected code only if
the same constant expression occurs in a simpler expression in the same
basic block?
File: main.lss:
main.elf: file format elf32-avr
Sections:
Idx Name Size VMA LMA File off Algn
0 .data 00000000 00800100 0000011e 000001b2 2**0
CONTENTS, ALLOC, LOAD, DATA
1 .text 0000011e 00000000 00000000 00000094 2**0
CONTENTS, ALLOC, LOAD, READONLY, CODE
2 .bss 00000000 00800100 0000011e 000001b2 2**0
ALLOC
3 .noinit 00000000 00800100 00800100 000001b2 2**0
CONTENTS
4 .eeprom 00000000 00810000 00810000 000001b2 2**0
CONTENTS
5 .stab 000004c8 00000000 00000000 000001b4 2**2
CONTENTS, READONLY, DEBUGGING
6 .stabstr 0000044e 00000000 00000000 0000067c 2**0
CONTENTS, READONLY, DEBUGGING
Disassembly of section .text:
00000000 <__vectors>:
0: 0c 94 46 00 jmp 0x8c
4: 0c 94 61 00 jmp 0xc2
8: 0c 94 61 00 jmp 0xc2
c: 0c 94 61 00 jmp 0xc2
10: 0c 94 61 00 jmp 0xc2
14: 0c 94 61 00 jmp 0xc2
18: 0c 94 61 00 jmp 0xc2
1c: 0c 94 61 00 jmp 0xc2
20: 0c 94 61 00 jmp 0xc2
24: 0c 94 61 00 jmp 0xc2
28: 0c 94 61 00 jmp 0xc2
2c: 0c 94 61 00 jmp 0xc2
30: 0c 94 61 00 jmp 0xc2
34: 0c 94 61 00 jmp 0xc2
38: 0c 94 61 00 jmp 0xc2
3c: 0c 94 61 00 jmp 0xc2
40: 0c 94 61 00 jmp 0xc2
44: 0c 94 61 00 jmp 0xc2
48: 0c 94 61 00 jmp 0xc2
4c: 0c 94 61 00 jmp 0xc2
50: 0c 94 61 00 jmp 0xc2
54: 0c 94 61 00 jmp 0xc2
58: 0c 94 61 00 jmp 0xc2
5c: 0c 94 61 00 jmp 0xc2
60: 0c 94 61 00 jmp 0xc2
64: 0c 94 61 00 jmp 0xc2
68: 0c 94 61 00 jmp 0xc2
6c: 0c 94 61 00 jmp 0xc2
70: 0c 94 61 00 jmp 0xc2
74: 0c 94 61 00 jmp 0xc2
78: 0c 94 61 00 jmp 0xc2
7c: 0c 94 61 00 jmp 0xc2
80: 0c 94 61 00 jmp 0xc2
84: 0c 94 61 00 jmp 0xc2
88: 0c 94 61 00 jmp 0xc2
0000008c <__ctors_end>:
8c: 11 24 eor r1, r1
8e: 1f be out 0x3f, r1 ; 63
90: cf ef ldi r28, 0xFF ; 255
92: d0 e1 ldi r29, 0x10 ; 16
94: de bf out 0x3e, r29 ; 62
96: cd bf out 0x3d, r28 ; 61
00000098 <__do_copy_data>:
98: 11 e0 ldi r17, 0x01 ; 1
9a: a0 e0 ldi r26, 0x00 ; 0
9c: b1 e0 ldi r27, 0x01 ; 1
9e: ee e1 ldi r30, 0x1E ; 30
a0: f1 e0 ldi r31, 0x01 ; 1
a2: 02 c0 rjmp .+4 ; 0xa8
000000a4 <.do_copy_data_loop>:
a4: 05 90 lpm r0, Z+
a6: 0d 92 st X+, r0
000000a8 <.do_copy_data_start>:
a8: a0 30 cpi r26, 0x00 ; 0
aa: b1 07 cpc r27, r17
ac: d9 f7 brne .-10 ; 0xa4
000000ae <__do_clear_bss>:
ae: 11 e0 ldi r17, 0x01 ; 1
b0: a0 e0 ldi r26, 0x00 ; 0
b2: b1 e0 ldi r27, 0x01 ; 1
b4: 01 c0 rjmp .+2 ; 0xb8
000000b6 <.do_clear_bss_loop>:
b6: 1d 92 st X+, r1
000000b8 <.do_clear_bss_start>:
b8: a0 30 cpi r26, 0x00 ; 0
ba: b1 07 cpc r27, r17
bc: e1 f7 brne .-8 ; 0xb6
be: 0c 94 7c 00 jmp 0xf8
000000c2 <__bad_interrupt>:
c2: 0c 94 00 00 jmp 0x0
000000c6 <foo1>:
int foo1 ( int a ){
if (a & (1L << 23))
c6: aa 27 eor r26, r26
c8: 97 fd sbrc r25, 7
ca: a0 95 com r26
cc: ba 2f mov r27, r26
ce: 27 e1 ldi r18, 0x17 ; 23
d0: b6 95 lsr r27
d2: a7 95 ror r26
d4: 97 95 ror r25
d6: 87 95 ror r24
d8: 2a 95 dec r18
da: d1 f7 brne .-12 ; 0xd0
dc: 81 70 andi r24, 0x01 ; 1
de: 90 70 andi r25, 0x00 ; 0
e0: 89 2b or r24, r25
e2: 19 f0 breq .+6 ; 0xea
return 1;
e4: 81 e0 ldi r24, 0x01 ; 1
e6: 90 e0 ldi r25, 0x00 ; 0
e8: 08 95 ret
else
return 2 ;
ea: 82 e0 ldi r24, 0x02 ; 2
ec: 90 e0 ldi r25, 0x00 ; 0
}
ee: 08 95 ret
f0: 08 95 ret
000000f2 <foo2>:
int foo2 ( int a ){
a = (a & (1L << 23));
if (a & (1L << 23))
return 1;
else
return 2 ;
}
f2: 82 e0 ldi r24, 0x02 ; 2
f4: 90 e0 ldi r25, 0x00 ; 0
f6: 08 95 ret
000000f8 <main>:
void main( void ){
f8: cd ef ldi r28, 0xFD ; 253
fa: d0 e1 ldi r29, 0x10 ; 16
fc: de bf out 0x3e, r29 ; 62
fe: cd bf out 0x3d, r28 ; 61
volatile int a;
a = foo1 ( a );
100: 89 81 ldd r24, Y+1 ; 0x01
102: 9a 81 ldd r25, Y+2 ; 0x02
104: 0e 94 63 00 call 0xc6
108: 89 83 std Y+1, r24 ; 0x01
10a: 9a 83 std Y+2, r25 ; 0x02
a = foo2 ( a );
10c: 89 81 ldd r24, Y+1 ; 0x01
10e: 9a 81 ldd r25, Y+2 ; 0x02
110: 0e 94 79 00 call 0xf2
114: 89 83 std Y+1, r24 ; 0x01
116: 9a 83 std Y+2, r25 ; 0x02
118: 0c 94 8e 00 jmp 0x11c
0000011c <_exit>:
11c: ff cf rjmp .-2 ; 0x11c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (5 preceding siblings ...)
2004-11-11 3:56 ` schlie at comcast dot net
@ 2004-11-11 4:33 ` schlie at comcast dot net
2004-11-11 4:41 ` pinskia at gcc dot gnu dot org
` (24 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-11 4:33 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-11-11 04:33 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs.
3.3.1, constant trees not being computed.
As implied earlier, this problem may be related to (in my words) an overly
complicated and error-prone process of having to subsequently demote
operations and operands after their often needless initial promotion, prior
to having determined their need.
Longer term this should really be reconsidered, as it should be understood
that it is not required to literally follow C's documented evaluation
semantics to yield logically equivalent optimal results. Operands need only
be promoted if their operation requires it, as determined by its target's
need, and expression trees could be simplified if integer nodes were tagged
with their sign in addition to their size, rather than requiring a cast node
be added to the tree, thereby likely requiring less tree memory, fewer
collections, and simpler rtl target instruction descriptions.
The reason that PPC may not see the problem is that the ppc is both an
int-wide machine and has a multi-bit shift instruction, which may be
optimized away through a different mechanism than is failing in this
circumstance.
But regardless, the problem exposed in this circumstance is a regression
from whatever middle-end mechanism enabled the proper evaluation of the
constant expression in 3.3.1, which enabled the static analysis of the
logical result of the expression at hand, and therefore shouldn't be
considered a target-specific problem. (Something broke in 3.3 -> 3.4 which
isn't insignificant for less-than-int-wide targets without instruction sets
similar to ppc.)
Thanks again for your time and hopeful consideration, -paul-
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (6 preceding siblings ...)
2004-11-11 4:33 ` schlie at comcast dot net
@ 2004-11-11 4:41 ` pinskia at gcc dot gnu dot org
2004-11-11 4:59 ` schlie at comcast dot net
` (23 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-11 4:41 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11 04:41 -------
Actually what you said is not true for this testcase as you have int & long and not int & int.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (7 preceding siblings ...)
2004-11-11 4:41 ` pinskia at gcc dot gnu dot org
@ 2004-11-11 4:59 ` schlie at comcast dot net
2004-11-11 5:04 ` pinskia at gcc dot gnu dot org
` (22 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-11 4:59 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-11-11 04:59 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs
3.3.1, constant trees not being computed.
> From: pinskia at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>
>
> ------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11
> 04:41 -------
> Actually what you said is not true for this testcase as you have int & long
> and not int & int.
Sorry, I don't understand; it's fairly apparent to me, and apparently to 3.3,
and to 3.4 (once it actually does compute (1L << 23) in an earlier
sub-expression), that:
<16-bits-wide-variable> = (<16-bit-wide variable> & 0x01000000) = 0
???
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (8 preceding siblings ...)
2004-11-11 4:59 ` schlie at comcast dot net
@ 2004-11-11 5:04 ` pinskia at gcc dot gnu dot org
2004-11-11 15:52 ` schlie at comcast dot net
` (21 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-11 5:04 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11 05:04 -------
int_val & long_val == (long)(int_val) & long_val, by what I had quoted in the other bug where you were
talking about this.
Also, that simplification comes from combine, and from knowing that ((int_val & long_val) >> (log2
long_val)) & 1 (where long_val > MAX_INT) is false (nothing else). The problem on PPC and this one is
the same. Anyway, as I had said before, this is not a regression.
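For illustration, the conversion described above can be modelled on a host compiler with fixed-width types. This is a sketch only: `int16_t` stands in for avr's 16-bit int and `int32_t` for its 32-bit long, and the function names are hypothetical. Under that model the int operand is sign-extended before the AND, so bit 23 of the widened value is set exactly when the 16-bit value is negative.

```c
#include <assert.h>
#include <stdint.h>

/* Models (long)int_val on avr: int16_t -> int32_t sign-extends. */
int32_t widen_like_avr(int16_t a)
{
    return (int32_t)a;
}

/* Same shape as `a & (1L << 23)` with a 16-bit int and a 32-bit long:
   the int operand is converted to the wider type before the AND. */
int bit23_after_widening(int16_t a)
{
    return (widen_like_avr(a) & (INT32_C(1) << 23)) != 0;
}

/* Self-check of the model: non-negative inputs widen with zero upper bits,
   negative inputs widen with all upper bits set. */
void widening_self_check(void)
{
    assert(widen_like_avr(INT16_MAX) == INT32_C(0x00007FFF));
    assert(widen_like_avr(-1) == INT32_C(-1));
    assert(bit23_after_widening(0) == 0);
    assert(bit23_after_widening(INT16_MAX) == 0);
    assert(bit23_after_widening(-1) == 1);
}
```

Whether a compiler may fold the test to a constant therefore depends on what it can prove about the sign of the int operand; the sketch makes no claim about what any particular GCC version does.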
--
What |Removed |Added
----------------------------------------------------------------------------
Severity|minor |enhancement
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Last reconfirmed|0000-00-00 00:00:00 |2004-11-11 05:04:34
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (9 preceding siblings ...)
2004-11-11 5:04 ` pinskia at gcc dot gnu dot org
@ 2004-11-11 15:52 ` schlie at comcast dot net
2004-11-11 16:22 ` joseph at codesourcery dot com
` (20 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-11 15:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-11-11 15:51 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs.
3.3.1, constant trees not being computed.
> From: pinskia at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>
> ------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11
> 05:04 -------
> int_val & long_val == (long)(int_val) & long_val by what I had quoted in the
> other bug which you were
> talking about this.
>
> Also, that simplification comes from combine and knowning that ((int_val &
> long_val) >> (log2
> long_val)) & 1 (where long_val > MAX_INT) is false (nothing else). The
> problem on PPC and this one is
> the same. Anyways as I had said before this is not a regression.
For what it's worth, comparable code for ppc (as << 23 is within int range):
int foo ( int a ) {
if (a & (1L << 55))
return 1;
else
return 2;
}
Which I suspect will show a difference, although due to ppc's multi-bit
shift capabilities, its literal representation may be detectable during
instruction generation/peephole optimization, but not on machines where
multi-bit shifts are not as compactly represented, which therefore rely on
earlier recognition and optimization, as appears to be present in 3.3.4.
Regardless of ppc relative code quality, 3.4.3 is generating substantially
slower & larger code than 3.3.4 for avr with no corresponding target machine
description changes, therefore would think that 3.4.3 is clearly suffering
from an avr target code performance/quality regression, yes?
As another less dramatic example:
int foo ( long a ) {
if (a & (1L << 23)) // where again (1L << 55) for ppc
return 1;
else
return 2;
}
This also demonstrates 3.4.3's generation of inferior avr code relative to
3.3.4, with no corresponding changes to its target description; this
seems to be due to the constant expression (1L << 23) not being computed if
nested within a more complex expression, which 3.3.4 was capable of
detecting, but 3.4.3 isn't.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (10 preceding siblings ...)
2004-11-11 15:52 ` schlie at comcast dot net
@ 2004-11-11 16:22 ` joseph at codesourcery dot com
2004-11-11 16:30 ` ericw at evcohs dot com
` (19 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: joseph at codesourcery dot com @ 2004-11-11 16:22 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From joseph at codesourcery dot com 2004-11-11 16:22 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs
3.3.1, constant trees not being computed.
Have you actually tried compiling code identical to that you test but with
8388608L in place of (1L << 23) before making claims about what is done
with constant expressions?
Your example may suggest a regression, provided no type sizes changed for
your target between the versions compared, but you really shouldn't report
conjectures about the cause of a bug without clear evidence to
substantiate them, which in this case would involve substituting the value
of the constant expression in the testcase.
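The substitution suggested above can be sketched as a pair of otherwise identical functions, one using the shifted form and one using the literal. The function names are hypothetical, and `int16_t` is used only so that a host compiler sees a 16-bit operand comparable to avr's int; comparing the code generated for the two functions is the actual experiment.

```c
#include <assert.h>
#include <stdint.h>

/* The shifted form and both literal spellings denote the same value, so any
   difference in generated code cannot come from the value itself. */
static_assert((1L << 23) == 8388608L, "1L << 23 equals 8388608L");
static_assert(0x800000L == 8388608L, "0x800000L equals 8388608L");

/* Test case as reported, with the constant written as a shift. */
int test_with_shift(int16_t a)
{
    if (a & (1L << 23))
        return 1;
    else
        return 2;
}

/* Same test case with the value of the constant expression substituted. */
int test_with_literal(int16_t a)
{
    if (a & 8388608L)
        return 1;
    else
        return 2;
}
```

If the two functions compile to different code, constant evaluation is implicated; if they compile to the same code, the cause lies elsewhere.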
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (11 preceding siblings ...)
2004-11-11 16:22 ` joseph at codesourcery dot com
@ 2004-11-11 16:30 ` ericw at evcohs dot com
2004-11-11 17:19 ` schlie at comcast dot net
` (18 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: ericw at evcohs dot com @ 2004-11-11 16:30 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From ericw at evcohs dot com 2004-11-11 16:29 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs 3.3.1,
constant trees not being computed.
pinskia at gcc dot gnu dot org wrote:
>------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-11 02:52 -------
>or the default for -mint8 changed.
>
>
>
The size of long has always stayed at 32 bits.
The default sizes for -mint8 *might* have changed. I know that Svein
Seldal corrected a problem with the size of long with -mint8, but I
thought that was on HEAD, i.e. for 4.0.0, not necessarily on 3.4.3. But
I could be wrong. For reference:
Normal:
char = 8 bits
int = 16 bits
long = 32 bits
With -mint8 (on 4.0.0 IIRC)
char = 8 bits
int = 8 bits
long = 16 bits
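One way to settle which sizes a given build actually uses is to have the compiler report them. A minimal sketch, assuming no padding bits in the integer types; `TYPE_BITS` and `print_type_widths` are hypothetical names, and on a bare avr target the `printf` call would need to be replaced by whatever output path is available.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Width of a type in bits, assuming the type has no padding bits. */
#define TYPE_BITS(T) ((unsigned)(sizeof(T) * CHAR_BIT))

/* Prints the widths seen by whichever compiler and flags build this file. */
void print_type_widths(void)
{
    /* Sanity check: long is not expected to be narrower than int. */
    assert(TYPE_BITS(long) >= TYPE_BITS(int));

    printf("char = %u bits\n", TYPE_BITS(char));
    printf("int  = %u bits\n", TYPE_BITS(int));
    printf("long = %u bits\n", TYPE_BITS(long));
}
```

Building this once with each compiler version, with and without -mint8, would show directly whether any type size differs between the builds being compared.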
Eric
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (12 preceding siblings ...)
2004-11-11 16:30 ` ericw at evcohs dot com
@ 2004-11-11 17:19 ` schlie at comcast dot net
2004-11-11 20:29 ` schlie at comcast dot net
` (17 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-11 17:19 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-11-11 17:19 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs
3.3.1, constant trees not being computed.
> From: joseph at codesourcery dot com <gcc-bugzilla@gcc.gnu.org>
> ------- Additional Comments From joseph at codesourcery dot com 2004-11-11
> 16:22 -------
> Subject: Re: 3.4.3 ~6x+ performance regression vs
> 3.3.1, constant trees not being computed.
>
> Have you actually tried compiling code identical to that you test but with
> 8388608L in place of (1L << 23) before making claims about what is done
> with constant expressions?
>
> Your example may suggest a regression, provided no type sizes changed for
> your target between the versions compared, but you really shouldn't report
> conjectures about the cause of a bug without clear evidence to
> substantiate them, which in this case would involve substituting the value
> of the constant expression in the testcase.
Good point, will do with both 0x800000L and (1 << 23).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (13 preceding siblings ...)
2004-11-11 17:19 ` schlie at comcast dot net
@ 2004-11-11 20:29 ` schlie at comcast dot net
2004-11-16 23:58 ` dmixm at marine dot febras dot ru
` (16 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-11 20:29 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-11-11 20:28 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs
3.3.1, constant trees not being computed.
> From: joseph at codesourcery dot com <gcc-bugzilla@gcc.gnu.org>
> ------- Additional Comments From joseph at codesourcery dot com 2004-11-11
> 16:22 -------
> Subject: Re: 3.4.3 ~6x+ performance regression vs
> 3.3.1, constant trees not being computed.
>
> Have you actually tried compiling code identical to that you test but with
> 8388608L in place of (1L << 23) before making claims about what is done
> with constant expressions?
>
> Your example may suggest a regression, provided no type sizes changed for
> your target between the versions compared, but you really shouldn't report
> conjectures about the cause of a bug without clear evidence to
> substantiate them, which in this case would involve substituting the value
> of the constant expression in the testcase.
You were correct, the problem wasn't that 3.4.3 wasn't computing the
constant expression values, it was that it was oddly transforming constant
values into runtime computed expressions, such that 3.4.3 converted:
(a & 0x800000L) => (((long)a >> 23) & 1), which doesn't quite seem sensible.
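For illustration, the two forms can be compared exhaustively on a host compiler. This is a sketch under stated assumptions: `int16_t` models avr's 16-bit int, `int32_t` models its 32-bit long, the function names are hypothetical, and the shift is done on the unsigned bit pattern so the result does not depend on how an implementation shifts negative signed values.

```c
#include <assert.h>
#include <stdint.h>

/* Mask form: widen to 32 bits, then test bit 23 with a mask. */
int mask_form(int16_t a)
{
    return ((int32_t)a & INT32_C(0x800000)) != 0;
}

/* Shift form: widen to 32 bits, move bit 23 down to bit 0, then mask. */
int shift_form(int16_t a)
{
    return (int)(((uint32_t)(int32_t)a >> 23) & 1u);
}

/* Returns 1 if the two forms agree for every 16-bit input, 0 otherwise. */
int forms_agree(void)
{
    for (int32_t v = INT16_MIN; v <= INT16_MAX; ++v) {
        if (mask_form((int16_t)v) != shift_form((int16_t)v))
            return 0;
    }
    return 1;
}

/* Self-check of the comparison. */
void forms_self_check(void)
{
    assert(forms_agree());
}
```

The two forms compute the same truth value, so the rewrite is a question of cost rather than of logic: on a target without a multi-bit shift instruction, the shift form expands into a loop like the one in the 3.4.3 listing.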
The following are the results for both 3.4.3 and 3.3.1; where 3.4.3 shows a
>100x performance regression, and a ~4x size regression relative to 3.3.1:
----
The source:
/*Compiling: main.c using (for the sake of argument)
avr-gcc -c -mmcu=atmega64 -I. -g -Os -funsigned-char -funsigned-bitfields
-fpack-struct
-fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=main.lst
-I/usr/local/avr/include
-std=gnu99 -funsafe-math-optimizations
-Wp,-M,-MP,-MT,main.o,-MF,.dep/main.o.d main.c
-o main.o
Linking: main.elf (again for the sake of argument)
avr-gcc -mmcu=atmega64 -I. -g -Os -funsigned-char -funsigned-bitfields
-fpack-struct
-fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=main.o
-I/usr/local/avr/include
-std=gnu99 -funsafe-math-optimizations
-Wp,-M,-MP,-MT,main.o,-MF,.dep/main.elf.d main.o
--output main.elf -Wl,-Map=main.map,--cref -lm
File: main.c
*/
int foo0 ( int a ){
if (a & 0x800000L)
return 1;
else
return 2 ;
}
int foo1 ( int a ){
if (a & (1L << 23))
return 1;
else
return 2 ;
}
int foo2 ( long a ){
if (a & 0x800000L)
return 1;
else
return 2 ;
}
int foo3 ( long a ){
if (a & (1L << 23))
return 1;
else
return 2 ;
}
int main( void ){
volatile int a;
a = foo0 ( a );
a = foo1 ( a );
a = foo2 ( a );
a = foo3 ( a );
return 0;
}
----
Listing for 3.4.3
main.elf: file format elf32-avr
Sections:
Idx Name Size VMA LMA File off Algn
0 .data 00000000 00800100 000001c8 0000025c 2**0
CONTENTS, ALLOC, LOAD, DATA
1 .text 000001c8 00000000 00000000 00000094 2**0
CONTENTS, ALLOC, LOAD, READONLY, CODE
2 .bss 00000000 00800100 000001c8 0000025c 2**0
ALLOC
3 .noinit 00000000 00800100 00800100 0000025c 2**0
CONTENTS
4 .eeprom 00000000 00810000 00810000 0000025c 2**0
CONTENTS
5 .stab 000005d0 00000000 00000000 0000025c 2**2
CONTENTS, READONLY, DEBUGGING
6 .stabstr 0000046e 00000000 00000000 0000082c 2**0
CONTENTS, READONLY, DEBUGGING
Disassembly of section .text:
00000000 <__vectors>:
0: 0c 94 46 00 jmp 0x8c
4: 0c 94 61 00 jmp 0xc2
8: 0c 94 61 00 jmp 0xc2
c: 0c 94 61 00 jmp 0xc2
10: 0c 94 61 00 jmp 0xc2
14: 0c 94 61 00 jmp 0xc2
18: 0c 94 61 00 jmp 0xc2
1c: 0c 94 61 00 jmp 0xc2
20: 0c 94 61 00 jmp 0xc2
24: 0c 94 61 00 jmp 0xc2
28: 0c 94 61 00 jmp 0xc2
2c: 0c 94 61 00 jmp 0xc2
30: 0c 94 61 00 jmp 0xc2
34: 0c 94 61 00 jmp 0xc2
38: 0c 94 61 00 jmp 0xc2
3c: 0c 94 61 00 jmp 0xc2
40: 0c 94 61 00 jmp 0xc2
44: 0c 94 61 00 jmp 0xc2
48: 0c 94 61 00 jmp 0xc2
4c: 0c 94 61 00 jmp 0xc2
50: 0c 94 61 00 jmp 0xc2
54: 0c 94 61 00 jmp 0xc2
58: 0c 94 61 00 jmp 0xc2
5c: 0c 94 61 00 jmp 0xc2
60: 0c 94 61 00 jmp 0xc2
64: 0c 94 61 00 jmp 0xc2
68: 0c 94 61 00 jmp 0xc2
6c: 0c 94 61 00 jmp 0xc2
70: 0c 94 61 00 jmp 0xc2
74: 0c 94 61 00 jmp 0xc2
78: 0c 94 61 00 jmp 0xc2
7c: 0c 94 61 00 jmp 0xc2
80: 0c 94 61 00 jmp 0xc2
84: 0c 94 61 00 jmp 0xc2
88: 0c 94 61 00 jmp 0xc2
0000008c <__ctors_end>:
8c: 11 24 eor r1, r1
8e: 1f be out 0x3f, r1 ; 63
90: cf ef ldi r28, 0xFF ; 255
92: d0 e1 ldi r29, 0x10 ; 16
94: de bf out 0x3e, r29 ; 62
96: cd bf out 0x3d, r28 ; 61
00000098 <__do_copy_data>:
98: 11 e0 ldi r17, 0x01 ; 1
9a: a0 e0 ldi r26, 0x00 ; 0
9c: b1 e0 ldi r27, 0x01 ; 1
9e: e8 ec ldi r30, 0xC8 ; 200
a0: f1 e0 ldi r31, 0x01 ; 1
a2: 02 c0 rjmp .+4 ; 0xa8
000000a4 <.do_copy_data_loop>:
a4: 05 90 lpm r0, Z+
a6: 0d 92 st X+, r0
000000a8 <.do_copy_data_start>:
a8: a0 30 cpi r26, 0x00 ; 0
aa: b1 07 cpc r27, r17
ac: d9 f7 brne .-10 ; 0xa4
000000ae <__do_clear_bss>:
ae: 11 e0 ldi r17, 0x01 ; 1
b0: a0 e0 ldi r26, 0x00 ; 0
b2: b1 e0 ldi r27, 0x01 ; 1
b4: 01 c0 rjmp .+2 ; 0xb8
000000b6 <.do_clear_bss_loop>:
b6: 1d 92 st X+, r1
000000b8 <.do_clear_bss_start>:
b8: a0 30 cpi r26, 0x00 ; 0
ba: b1 07 cpc r27, r17
bc: e1 f7 brne .-8 ; 0xb6
be: 0c 94 b7 00 jmp 0x16e
000000c2 <__bad_interrupt>:
c2: 0c 94 00 00 jmp 0x0
000000c6 <foo0>:
*/
int foo0 ( int a ){
if (a & 0x800000L)
c6: aa 27 eor r26, r26
c8: 97 fd sbrc r25, 7
ca: a0 95 com r26
cc: ba 2f mov r27, r26
ce: 27 e1 ldi r18, 0x17 ; 23
d0: b6 95 lsr r27
d2: a7 95 ror r26
d4: 97 95 ror r25
d6: 87 95 ror r24
d8: 2a 95 dec r18
da: d1 f7 brne .-12 ; 0xd0
dc: 81 70 andi r24, 0x01 ; 1
de: 90 70 andi r25, 0x00 ; 0
e0: 89 2b or r24, r25
e2: 19 f0 breq .+6 ; 0xea
return 1;
e4: 81 e0 ldi r24, 0x01 ; 1
e6: 90 e0 ldi r25, 0x00 ; 0
e8: 08 95 ret
else
return 2 ;
ea: 82 e0 ldi r24, 0x02 ; 2
ec: 90 e0 ldi r25, 0x00 ; 0
}
ee: 08 95 ret
f0: 08 95 ret
000000f2 <foo1>:
int foo1 ( int a ){
if (a & (1L << 23))
f2: aa 27 eor r26, r26
f4: 97 fd sbrc r25, 7
f6: a0 95 com r26
f8: ba 2f mov r27, r26
fa: 37 e1 ldi r19, 0x17 ; 23
fc: b6 95 lsr r27
fe: a7 95 ror r26
100: 97 95 ror r25
102: 87 95 ror r24
104: 3a 95 dec r19
106: d1 f7 brne .-12 ; 0xfc
108: 81 70 andi r24, 0x01 ; 1
10a: 90 70 andi r25, 0x00 ; 0
10c: 89 2b or r24, r25
10e: 19 f0 breq .+6 ; 0x116
return 1;
110: 81 e0 ldi r24, 0x01 ; 1
112: 90 e0 ldi r25, 0x00 ; 0
114: 08 95 ret
else
return 2 ;
116: 82 e0 ldi r24, 0x02 ; 2
118: 90 e0 ldi r25, 0x00 ; 0
}
11a: 08 95 ret
11c: 08 95 ret
0000011e <foo2>:
int foo2 ( long a ){
11e: dc 01 movw r26, r24
120: cb 01 movw r24, r22
if (a & 0x800000L)
122: 47 e1 ldi r20, 0x17 ; 23
124: b6 95 lsr r27
126: a7 95 ror r26
128: 97 95 ror r25
12a: 87 95 ror r24
12c: 4a 95 dec r20
12e: d1 f7 brne .-12 ; 0x124
130: 81 70 andi r24, 0x01 ; 1
132: 90 70 andi r25, 0x00 ; 0
134: 89 2b or r24, r25
136: 19 f0 breq .+6 ; 0x13e
return 1;
138: 81 e0 ldi r24, 0x01 ; 1
13a: 90 e0 ldi r25, 0x00 ; 0
13c: 08 95 ret
else
return 2 ;
13e: 82 e0 ldi r24, 0x02 ; 2
140: 90 e0 ldi r25, 0x00 ; 0
}
142: 08 95 ret
144: 08 95 ret
00000146 <foo3>:
int foo3 ( long a ){
146: dc 01 movw r26, r24
148: cb 01 movw r24, r22
if (a & (1L << 23))
14a: 57 e1 ldi r21, 0x17 ; 23
14c: b6 95 lsr r27
14e: a7 95 ror r26
150: 97 95 ror r25
152: 87 95 ror r24
154: 5a 95 dec r21
156: d1 f7 brne .-12 ; 0x14c
158: 81 70 andi r24, 0x01 ; 1
15a: 90 70 andi r25, 0x00 ; 0
15c: 89 2b or r24, r25
15e: 19 f0 breq .+6 ; 0x166
return 1;
160: 81 e0 ldi r24, 0x01 ; 1
162: 90 e0 ldi r25, 0x00 ; 0
164: 08 95 ret
else
return 2 ;
166: 82 e0 ldi r24, 0x02 ; 2
168: 90 e0 ldi r25, 0x00 ; 0
}
16a: 08 95 ret
16c: 08 95 ret
0000016e <main>:
int main( void ){
16e: cd ef ldi r28, 0xFD ; 253
170: d0 e1 ldi r29, 0x10 ; 16
172: de bf out 0x3e, r29 ; 62
174: cd bf out 0x3d, r28 ; 61
volatile int a;
a = foo0 ( a );
176: 89 81 ldd r24, Y+1 ; 0x01
178: 9a 81 ldd r25, Y+2 ; 0x02
17a: 0e 94 63 00 call 0xc6
17e: 89 83 std Y+1, r24 ; 0x01
180: 9a 83 std Y+2, r25 ; 0x02
a = foo1 ( a );
182: 89 81 ldd r24, Y+1 ; 0x01
184: 9a 81 ldd r25, Y+2 ; 0x02
186: 0e 94 79 00 call 0xf2
18a: 89 83 std Y+1, r24 ; 0x01
18c: 9a 83 std Y+2, r25 ; 0x02
a = foo2 ( a );
18e: 89 81 ldd r24, Y+1 ; 0x01
190: 9a 81 ldd r25, Y+2 ; 0x02
192: aa 27 eor r26, r26
194: 97 fd sbrc r25, 7
196: a0 95 com r26
198: ba 2f mov r27, r26
19a: bc 01 movw r22, r24
19c: cd 01 movw r24, r26
19e: 0e 94 8f 00 call 0x11e
1a2: 89 83 std Y+1, r24 ; 0x01
1a4: 9a 83 std Y+2, r25 ; 0x02
a = foo3 ( a );
1a6: 89 81 ldd r24, Y+1 ; 0x01
1a8: 9a 81 ldd r25, Y+2 ; 0x02
1aa: aa 27 eor r26, r26
1ac: 97 fd sbrc r25, 7
1ae: a0 95 com r26
1b0: ba 2f mov r27, r26
1b2: bc 01 movw r22, r24
1b4: cd 01 movw r24, r26
1b6: 0e 94 a3 00 call 0x146
1ba: 89 83 std Y+1, r24 ; 0x01
1bc: 9a 83 std Y+2, r25 ; 0x02
return 0;
}
1be: 80 e0 ldi r24, 0x00 ; 0
1c0: 90 e0 ldi r25, 0x00 ; 0
1c2: 0c 94 e3 00 jmp 0x1c6
000001c6 <_exit>:
1c6: ff cf rjmp .-2 ; 0x1c6
---------
The listing with avr-gcc (GCC) 3.3.1:
1 .file "main.c"
2 .arch atmega64
3 __SREG__ = 0x3f
4 __SP_H__ = 0x3e
5 __SP_L__ = 0x3d
6 __tmp_reg__ = 0
7 __zero_reg__ = 1
8 .global __do_copy_data
9 .global __do_clear_bss
12 .text
13 .Ltext0:
38 .global foo0
40 foo0:
1:main.c **** /*Compiling: main.c using (for the sake of argument)
2:main.c ****
3:main.c **** avr-gcc -c -mmcu=atmega64 -I. -g -Os
-funsigned-char -funsigned-bitfields
4:main.c **** -fpack-struct
5:main.c **** -fshort-enums -Wall -Wstrict-prototypes
-Wa,-adhlns=main.lst
6:main.c **** -I/usr/local/avr/include
7:main.c **** -std=gnu99 -funsafe-math-optimizations
8:main.c **** -Wp,-M,-MP,-MT,main.o,-MF,.dep/main.o.d main.c
9:main.c **** -o main.o
10:main.c ****
11:main.c **** Linking: main.elf (again for the sake of argument)
12:main.c **** avr-gcc -mmcu=atmega64 -I. -g -Os -funsigned-char
-funsigned-bitfields
13:main.c **** -fpack-struct
14:main.c **** -fshort-enums -Wall -Wstrict-prototypes
-Wa,-adhlns=main.o
15:main.c **** -I/usr/local/avr/include
16:main.c **** -std=gnu99 -funsafe-math-optimizations
17:main.c **** -Wp,-M,-MP,-MT,main.o,-MF,.dep/main.elf.d main.o
18:main.c **** --output main.elf -Wl,-Map=main.map,--cref -lm
19:main.c ****
20:main.c **** File: main.c
21:main.c ****
22:main.c **** */
23:main.c ****
24:main.c **** int foo0 ( int a ){
42 .LM1:
43 /* prologue: frame size=0 */
44 /* prologue end (size=0) */
25:main.c ****
26:main.c **** if (a & 0x800000L)
46 .LM2:
47 0000 AA27 clr r26
48 0002 97FD sbrc r25,7
49 0004 A095 com r26
50 0006 BA2F mov r27,r26
51 0008 A7FF sbrs r26,7
52 000a 03C0 rjmp .L2
27:main.c **** return 1;
54 .LM3:
55 000c 81E0 ldi r24,lo8(1)
56 000e 90E0 ldi r25,hi8(1)
28:main.c **** else
29:main.c **** return 2 ;
30:main.c ****
31:main.c **** }
58 .LM4:
59 0010 0895 ret
60 .L2:
62 .LM5:
63 0012 82E0 ldi r24,lo8(2)
64 0014 90E0 ldi r25,hi8(2)
66 .LM6:
67 0016 0895 ret
68 /* epilogue: frame size=0 */
69 0018 0895 ret
70 /* epilogue end (size=1) */
71 /* function foo0 size 13 (12) */
73 .Lscope0:
77 .global foo1
79 foo1:
32:main.c ****
33:main.c **** int foo1 ( int a ){
81 .LM7:
82 /* prologue: frame size=0 */
83 /* prologue end (size=0) */
34:main.c ****
35:main.c **** if (a & (1L << 23))
85 .LM8:
86 001a AA27 clr r26
87 001c 97FD sbrc r25,7
88 001e A095 com r26
89 0020 BA2F mov r27,r26
90 0022 A7FF sbrs r26,7
91 0024 03C0 rjmp .L5
36:main.c **** return 1;
93 .LM9:
94 0026 81E0 ldi r24,lo8(1)
95 0028 90E0 ldi r25,hi8(1)
37:main.c **** else
38:main.c **** return 2 ;
39:main.c ****
40:main.c **** }
97 .LM10:
98 002a 0895 ret
99 .L5:
101 .LM11:
102 002c 82E0 ldi r24,lo8(2)
103 002e 90E0 ldi r25,hi8(2)
105 .LM12:
106 0030 0895 ret
107 /* epilogue: frame size=0 */
108 0032 0895 ret
109 /* epilogue end (size=1) */
110 /* function foo1 size 13 (12) */
112 .Lscope1:
116 .global foo2
118 foo2:
41:main.c ****
42:main.c **** int foo2 ( long a ){
120 .LM13:
121 /* prologue: frame size=0 */
122 /* prologue end (size=0) */
123 0034 DC01 movw r26,r24
124 0036 CB01 movw r24,r22
43:main.c ****
44:main.c **** if (a & 0x800000L)
126 .LM14:
127 0038 A7FF sbrs r26,7
128 003a 03C0 rjmp .L8
45:main.c **** return 1;
130 .LM15:
131 003c 81E0 ldi r24,lo8(1)
132 003e 90E0 ldi r25,hi8(1)
46:main.c **** else
47:main.c **** return 2 ;
48:main.c ****
49:main.c **** }
134 .LM16:
135 0040 0895 ret
136 .L8:
138 .LM17:
139 0042 82E0 ldi r24,lo8(2)
140 0044 90E0 ldi r25,hi8(2)
142 .LM18:
143 0046 0895 ret
144 /* epilogue: frame size=0 */
145 0048 0895 ret
146 /* epilogue end (size=1) */
147 /* function foo2 size 11 (10) */
149 .Lscope2:
153 .global foo3
155 foo3:
50:main.c ****
51:main.c **** int foo3 ( long a ){
157 .LM19:
158 /* prologue: frame size=0 */
159 /* prologue end (size=0) */
160 004a DC01 movw r26,r24
161 004c CB01 movw r24,r22
52:main.c ****
53:main.c **** if (a & (1L << 23))
163 .LM20:
164 004e A7FF sbrs r26,7
165 0050 03C0 rjmp .L11
54:main.c **** return 1;
167 .LM21:
168 0052 81E0 ldi r24,lo8(1)
169 0054 90E0 ldi r25,hi8(1)
55:main.c **** else
56:main.c **** return 2 ;
57:main.c ****
58:main.c **** }
171 .LM22:
172 0056 0895 ret
173 .L11:
175 .LM23:
176 0058 82E0 ldi r24,lo8(2)
177 005a 90E0 ldi r25,hi8(2)
179 .LM24:
180 005c 0895 ret
181 /* epilogue: frame size=0 */
182 005e 0895 ret
183 /* epilogue end (size=1) */
184 /* function foo3 size 11 (10) */
186 .Lscope3:
189 .global main
191 main:
59:main.c ****
60:main.c **** int main( void ){
193 .LM25:
194 /* prologue: frame size=2 */
195 0060 C0E0 ldi r28,lo8(__stack - 2)
196 0062 D0E0 ldi r29,hi8(__stack - 2)
197 0064 DEBF out __SP_H__,r29
198 0066 CDBF out __SP_L__,r28
199 /* prologue end (size=4) */
61:main.c ****
62:main.c **** volatile int a;
63:main.c ****
64:main.c **** a = foo0 ( a );
201 .LM26:
202 .LBB2:
203 0068 8981 ldd r24,Y+1
204 006a 9A81 ldd r25,Y+2
205 006c 0E94 0000 call foo0
206 0070 8983 std Y+1,r24
207 0072 9A83 std Y+2,r25
65:main.c **** a = foo1 ( a );
209 .LM27:
210 0074 8981 ldd r24,Y+1
211 0076 9A81 ldd r25,Y+2
212 0078 0E94 0000 call foo1
213 007c 8983 std Y+1,r24
214 007e 9A83 std Y+2,r25
66:main.c **** a = foo2 ( a );
216 .LM28:
217 0080 8981 ldd r24,Y+1
218 0082 9A81 ldd r25,Y+2
219 0084 AA27 clr r26
220 0086 97FD sbrc r25,7
221 0088 A095 com r26
222 008a BA2F mov r27,r26
223 008c BC01 movw r22,r24
224 008e CD01 movw r24,r26
225 0090 0E94 0000 call foo2
226 0094 8983 std Y+1,r24
227 0096 9A83 std Y+2,r25
67:main.c **** a = foo3 ( a );
229 .LM29:
230 0098 8981 ldd r24,Y+1
231 009a 9A81 ldd r25,Y+2
232 009c AA27 clr r26
233 009e 97FD sbrc r25,7
234 00a0 A095 com r26
235 00a2 BA2F mov r27,r26
236 00a4 BC01 movw r22,r24
237 00a6 CD01 movw r24,r26
238 00a8 0E94 0000 call foo3
239 00ac 8983 std Y+1,r24
240 00ae 9A83 std Y+2,r25
68:main.c ****
69:main.c **** return 0;
70:main.c **** }
242 .LM30:
243 .LBE2:
244 00b0 80E0 ldi r24,lo8(0)
245 00b2 90E0 ldi r25,hi8(0)
246 /* epilogue: frame size=2 */
247 00b4 0C94 0000 jmp exit
248 /* epilogue end (size=2) */
249 /* function main size 44 (38) */
254 .Lscope4:
256 .text
258 Letext:
259 /* File "main.c": code 92 = 0x005c ( 82),
prologues 4, epilogues 6 */
DEFINED SYMBOLS
*ABS*:00000000 main.c
*ABS*:0000003f __SREG__
*ABS*:0000003e __SP_H__
*ABS*:0000003d __SP_L__
*ABS*:00000000 __tmp_reg__
*ABS*:00000001 __zero_reg__
/tmp/ccS6dcXA.s:40 .text:00000000 foo0
/tmp/ccS6dcXA.s:79 .text:0000001a foo1
/tmp/ccS6dcXA.s:118 .text:00000034 foo2
/tmp/ccS6dcXA.s:155 .text:0000004a foo3
/tmp/ccS6dcXA.s:191 .text:00000060 main
/tmp/ccS6dcXA.s:258 .text:000000b8 Letext
UNDEFINED SYMBOLS
__do_copy_data
__do_clear_bss
__stack
exit
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (14 preceding siblings ...)
2004-11-11 20:29 ` schlie at comcast dot net
@ 2004-11-16 23:58 ` dmixm at marine dot febras dot ru
2004-11-17 7:22 ` schlie at comcast dot net
` (15 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: dmixm at marine dot febras dot ru @ 2004-11-16 23:58 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dmixm at marine dot febras dot ru 2004-11-16 23:58 -------
In March 2004 Richard Sandiford proposed a patch to eliminate this
problem. See: http://gcc.gnu.org/ml/gcc/2004-03/msg01456.html The patch
modifies the function do_jump (in dojump.c). The change is now present on
the 4.0 branch, but was not included in 3.4.x. I tried applying the patch
to 3.4.3. It worked only after changing the line
if (TREE_CODE (TREE_OPERAND (exp, 0)) == RSHIFT_EXPR
...
to
tree arg_nops = TREE_OPERAND (exp, 0); /* and below the same subst. */
STRIP_NOPS (arg_nops);
if (TREE_CODE (arg_nops) == RSHIFT_EXPR
...
and only for function foo_i.
foo_i() before patch:
...
mov r24,r25
ldi r25,6
1: lsr r24
dec r25
brne 1b
sbrs r24,0
rjmp .L2
foo_i() after patch:
...
sbrs r25,6
rjmp .L8
But foo_ll (a shift loop with count 62!) and foo_l are unchanged: they
still test the bit by shifting the left argument.
Source file:
~~~~~~~~~~~
int foo_ll (long long x)
{
return (x & 0x4000000000000000LL) ? 1 : 3;
}
int foo_l (long x)
{
return (x & 0x40000000) ? 5 : 7;
}
int foo_i (int x)
{
return (x & 0x4000) ? 9 : 11;
}
int foo_c (char x)
{
return (x & 0x40) ? 13 : 15;
}
P.S. The code for foo_c was and remains good, thanks to the work done by
`gcc/combine.c'.
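As a rough illustration of what the before/after listings for foo_i amount to, the sketch below writes the same bit-14 test three ways: the original mask, the shift-and-mask form, and a test of bit 6 of the high byte, which is what `sbrs r25,6` checks directly. It is not taken from the patch. The helper names are made up for this example, and uint16_t is an assumption standing in for the 16-bit AVR int so that the code behaves the same on any host.

```c
#include <stdint.h>

/* Original source form: mask with a power-of-two constant. */
static int test_mask(uint16_t x)
{
    return (x & 0x4000u) ? 9 : 11;
}

/* Shift-and-mask form: cheap with a barrel shifter, but a
   shift loop on a target that shifts one bit at a time. */
static int test_shift(uint16_t x)
{
    return ((x >> 14) & 1u) ? 9 : 11;
}

/* Byte-level form: bit 14 of x is bit 6 of its high byte. */
static int test_high_byte(uint16_t x)
{
    uint8_t high = (uint8_t)(x >> 8);
    return (high & 0x40u) ? 9 : 11;
}

/* Returns 1 if all three forms agree for every 16-bit input
   (hypothetical helper, for checking only). */
static int forms_agree(void)
{
    for (uint32_t v = 0; v <= 0xFFFFu; v++) {
        uint16_t x = (uint16_t)v;
        if (test_mask(x) != test_shift(x) ||
            test_mask(x) != test_high_byte(x))
            return 0;
    }
    return 1;
}
```

Since the three forms are interchangeable, the choice between them is purely a cost question for the target, which is why reversing the shift form back into a single-bit test helps on AVR.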
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (15 preceding siblings ...)
2004-11-16 23:58 ` dmixm at marine dot febras dot ru
@ 2004-11-17 7:22 ` schlie at comcast dot net
2004-12-09 12:51 ` [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, " giovannibajo at libero dot it
` (14 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-11-17 7:22 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-11-17 07:21 -------
Subject: Re: 3.4.3 ~6x+ performance regression vs
3.3.1
> From: dmixm at marine dot febras dot ru <gcc-bugzilla@gcc.gnu.org>
> But foo_ll (shift loop with count 62!) and foo_l have remained on old -
> through shift of the left argument.
It would seem in general that if GCC, in addition to transforming:
((size)x & (size)<pow-of-2-value>) => (size)(x >> <log2-value>)
also transformed all RSHIFT expressions of the form:
(size)(x >> <value>)
=>
(size-lsw)((subreg:(size-lsw) x:size lsw) >> (<value>-(word-size*lsw)))
[where lsw is the least significant pow-of-2 word which remains significant,
i.e. lsw = 0 implies all words remain significant, lsw = 1 implies the
precision of the result may be reduced by one word, etc.]
it would enable, for a hypothetical 8-bit word-wide machine:
(4-word-type)(X & 0x04000000)
=>
(4-word-type)(X >> 26)
=>
(1-word-type)((subreg:1-word-type X:4-word-type 3) >> 2)
Or for a 16-bit word wide machine: (or analogously for even wider machines)
(2-word-type)(X & 0x04000000)
=>
(2-word-type)(X >> 26)
=>
(1-word-type)((subreg:1-word-type X:2-word-type 1) >> 10)
This both enables the resulting demoted-precision value to be expressed
more optimally as an argument in subsequent expressions, and enables more
optimal code generation on less-than-word-wide target machines.
Furthermore, the subreg representation of demoted expression values could
also enable further optimization of sign/zero extended values, by allowing
the detection of cases where only sign or zero extended bits remain
significant:
(4-word-type)((char)X & (long)0x04000000)
=>
(4-word-type)((long<-char)X >> 26)
=>
(1-word-type)((subreg:char ((long<-char)x):long 3) >> 2)
=>
(1-word-type)((subreg:char (sign x):char 3) >> 2)
=>
(1-word-type)(sign x)
Or something like that possibly.
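The word-selection arithmetic above can be checked on a host with a small sketch. This is only an illustration of the idea, not GCC code: the function names are hypothetical, the subreg selection is modelled with an ordinary shift-and-truncate, and an explicit `& 1` is kept so that each form yields exactly the tested bit.

```c
#include <stdint.h>

/* Full-width form: shift the whole 32-bit value, keep one bit. */
static unsigned bit26_wide(uint32_t x)
{
    return (unsigned)((x >> 26) & 1u);
}

/* 8-bit words: 26 = 3*8 + 2, so select byte 3 and shift by 2. */
static unsigned bit26_byte(uint32_t x)
{
    uint8_t sub = (uint8_t)(x >> 24);   /* models (subreg ... 3) */
    return (unsigned)((sub >> 2) & 1u);
}

/* 16-bit words: 26 = 1*16 + 10, so select word 1 and shift by 10. */
static unsigned bit26_half(uint32_t x)
{
    uint16_t sub = (uint16_t)(x >> 16); /* models (subreg ... 1) */
    return (unsigned)((sub >> 10) & 1u);
}

/* Returns 1 if the demoted forms match the full-width form for x. */
static int demotion_agrees(uint32_t x)
{
    return bit26_wide(x) == bit26_byte(x) &&
           bit26_wide(x) == bit26_half(x);
}
```

The demoted forms touch only one byte or one 16-bit word of the operand, so on a narrow machine the remaining words never need to be loaded or shifted.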
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (16 preceding siblings ...)
2004-11-17 7:22 ` schlie at comcast dot net
@ 2004-12-09 12:51 ` giovannibajo at libero dot it
2004-12-09 15:00 ` roger at eyesopen dot com
` (13 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: giovannibajo at libero dot it @ 2004-12-09 12:51 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-12-09 12:51 -------
Proposed (partial) patch:
http://gcc.gnu.org/ml/gcc-patches/2004-12/msg00655.html
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |giovannibajo at libero dot
| |it
AssignedTo|unassigned at gcc dot gnu |sayle at gcc dot gnu dot org
|dot org |
Severity|enhancement |critical
Status|NEW |ASSIGNED
Keywords| |patch
Known to fail| |3.4.0 4.0.0
Known to work| |3.3.1
Summary|3.4.3 ~6x+ performance |[3.4/4.0 Regression] ~6x+
|regression vs 3.3.1, |performance regression,
|constant trees not being |constant trees not being
|computed. |computed.
Target Milestone|--- |4.0.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (17 preceding siblings ...)
2004-12-09 12:51 ` [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, " giovannibajo at libero dot it
@ 2004-12-09 15:00 ` roger at eyesopen dot com
2004-12-09 15:23 ` schlie at comcast dot net
` (12 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: roger at eyesopen dot com @ 2004-12-09 15:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From roger at eyesopen dot com 2004-12-09 14:59 -------
The patch is a "partial" fix as there will still be a performance regression for
the code generated vs. gcc 3.3.1. The reason is that 3.3.1 generated
incorrect code for the test program in this PR.
int foo(int a) { return (a & (1L<<23)) ? 1 : 2; }
is supposed to sign-extend "a" from a 16-bit integer to a 32-bit long, to match
the (long) constant operand. This sign-extension may end up setting bit 23, so the
result of the function is dependent upon "a", i.e. the best we can do is:
int foo(int a) { return (a < 0) ? 1 : 2; }
Unfortunately, we don't quite get that efficient for avr-elf, instead still
producing the sign-extension and an AND by constant. This fixes the regression
aspects of this patch, but we still have a missed optimization. The code we
generate is:
clr r26
sbrc r25,7
com r26
mov r27,r26
sbrs r26,7
rjmp .L2
ldi r24,lo8(1)
ldi r25,hi8(1)
ret
.L2:
ldi r24,lo8(2)
ldi r25,hi8(2)
ret
Perhaps once the patch is committed, we should close this PR, and open a
separate "enhancement" request PR to catch this missed AND(SIGN_EXTEND ..))
opportunity on AVR.
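The sign-extension argument can be reproduced on any host with a minimal sketch, assuming int16_t as a stand-in for the 16-bit AVR int and int32_t for the 32-bit AVR long. The function names are invented for this example; it simply compares the masked form against the sign test over every 16-bit input.

```c
#include <stdint.h>

/* Models foo() with AVR widths: the int operand is widened to
   long (sign-extended) before the AND with 1L << 23. */
static int foo_masked(int16_t a)
{
    int32_t widened = (int32_t)a;   /* sign-extends to 32 bits */
    return (widened & (INT32_C(1) << 23)) ? 1 : 2;
}

/* The simplified form: bit 23 of the widened value is a copy
   of the sign bit, so only the sign of a matters. */
static int foo_sign(int16_t a)
{
    return (a < 0) ? 1 : 2;
}

/* Counts inputs on which the two forms disagree
   (hypothetical helper, for checking only). */
static long count_mismatches(void)
{
    long mismatches = 0;
    for (int32_t v = INT16_MIN; v <= INT16_MAX; v++) {
        if (foo_masked((int16_t)v) != foo_sign((int16_t)v))
            mismatches++;
    }
    return mismatches;
}
```

A constant result of 2, as in the 3.3.1 listing from the original report, would disagree with both forms for every negative input.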
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (18 preceding siblings ...)
2004-12-09 15:00 ` roger at eyesopen dot com
@ 2004-12-09 15:23 ` schlie at comcast dot net
2004-12-09 15:52 ` schlie at comcast dot net
` (11 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-12-09 15:23 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-12-09 15:23 -------
Subject: Re: [3.4/4.0 Regression] ~6x+ performance
regression, constant trees not being computed.
A few thoughts:
- I believe avr's back end does know how to convert:
((char)x & <pow2-const>) => bit-test x <log2-const>
which I believe it had relied upon previously.
- might it be possible to recognize, possibly through some sub-reg
mechanism as avr's word-size is 1 (byte), that any right-shift by
a word-size multiple may be subtracted off the logically shifted value
via higher-order subword selection (effectively demoting the operand),
correspondingly reducing the const-shift value by N*word-bit-size?
i.e.
(((long)x & (1 << 28)) == 0) => (((sub-word (long)x 3) >> 4) == 0)
=> (((byte)x >> 4) == 0)
(then possibly => (((byte)x & 0x10) == 0) which I believe avr's
back-end knows how to transform into a [bit-test x 4] )
Might this be possible, as it would seem generally useful as well?
Thanks again, -paul-
> From: roger at eyesopen dot com <gcc-bugzilla@gcc.gnu.org>
> ------- Additional Comments From roger at eyesopen dot com 2004-12-09 14:59
> -------
> The patch is a "partial" fix as there will still be a performance regression
> for
> the code generated vs. gcc 3.3.1. The reason being that 3.3.1 generated
> incorrect code for test program in this PR.
>
> int foo(int a) { return (a & (1L<<23)) ? 1 : 2; }
>
> is supposed to sign-extend "a" from a 16-bit integer to a 32-bit long, to
> match
> the (long) constant operand. This sign-extension may end setting bit 23, so
> the
> result of the function is dependent upon "a". i.e. the best we can do is:
>
> int foo(int a) { return (a < 0) ? 1 : 2; }
>
> Unfortunately, we don't quite get that efficient for avr-elf, instead still
> producing the sign-extension and an AND by constant. This fixes the
> regression
> aspects of this patch, but we still have a missed optimization. The code we
> generate is:
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (19 preceding siblings ...)
2004-12-09 15:23 ` schlie at comcast dot net
@ 2004-12-09 15:52 ` schlie at comcast dot net
2004-12-11 1:49 ` cvs-commit at gcc dot gnu dot org
` (10 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-12-09 15:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-12-09 15:52 -------
Subject: Re: [3.4/4.0 Regression] ~6x+ performance
regression, constant trees not being computed.
Sorry, I lost sight of the fact that only a single bit needs to remain
significant in the resulting transform:
(((long)x & (1 << 28)) == 0)
=>
((((byte)(sub-word (long)x 3) >> 4) & 1) == 0)
=>
((((byte)x' >> 4) & 1) == 0) :: (((byte)x' & 0x10) == 0)
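A small sketch of why the `& 1` matters, with hypothetical function names and an ordinary shift-and-truncate standing in for the sub-word selection: without the mask, the shifted byte still carries bits 29..31, so a value such as 0x20000000 is misclassified.

```c
#include <stdint.h>

/* Reference: is bit 28 of a 32-bit value clear? */
static int bit28_clear(uint32_t x)
{
    return (x & (UINT32_C(1) << 28)) == 0;
}

/* Corrected byte form: select byte 3, shift by 4, keep one bit. */
static int bit28_clear_byte(uint32_t x)
{
    uint8_t top = (uint8_t)(x >> 24);
    return ((top >> 4) & 1u) == 0;
}

/* Equivalent mask form on the same byte. */
static int bit28_clear_mask(uint32_t x)
{
    uint8_t top = (uint8_t)(x >> 24);
    return (top & 0x10u) == 0;
}

/* Form without the "& 1": bits 29..31 also influence the result. */
static int bit28_clear_unmasked(uint32_t x)
{
    uint8_t top = (uint8_t)(x >> 24);
    return (top >> 4) == 0;
}
```

For x = 0x20000000 the first three functions report bit 28 as clear, while the unmasked form reports it as set.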
> From: schlie at comcast dot net <gcc-bugzilla@gcc.gnu.org>
> i.e.
>
> (((long)x & (1 << 28)) == 0) => (((sub-word (long)x 3) >> 4) == 0)
> => (((byte)x >> 4) == 0)
>
> (then possibly => (((byte)x & 0x10) == 0) which I believe avr's
> back-end knows how to transform into a [bit-test x 5] )
>
> Might this be possible, as it would seem generally useful as well?
>
> Thanks again, -paul-
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (20 preceding siblings ...)
2004-12-09 15:52 ` schlie at comcast dot net
@ 2004-12-11 1:49 ` cvs-commit at gcc dot gnu dot org
2004-12-14 1:47 ` cvs-commit at gcc dot gnu dot org
` (9 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: cvs-commit at gcc dot gnu dot org @ 2004-12-11 1:49 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From cvs-commit at gcc dot gnu dot org 2004-12-11 01:49 -------
Subject: Bug 18424
CVSROOT: /cvs/gcc
Module name: gcc
Changes by: sayle@gcc.gnu.org 2004-12-11 01:49:06
Modified files:
gcc : ChangeLog dojump.c
Log message:
PR target/18002
PR middle-end/18424
* dojump.c (do_jump): When attempting to reverse the effects of
fold_single_bit_test, we need to STRIP_NOPS and narrowing type
conversions, and handle BIT_XOR_EXPR that's used to invert the
sense of the single bit test.
Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.6778&r2=2.6779
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/dojump.c.diff?cvsroot=gcc&r1=1.32&r2=1.33
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (21 preceding siblings ...)
2004-12-11 1:49 ` cvs-commit at gcc dot gnu dot org
@ 2004-12-14 1:47 ` cvs-commit at gcc dot gnu dot org
2004-12-14 2:11 ` pinskia at gcc dot gnu dot org
` (8 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: cvs-commit at gcc dot gnu dot org @ 2004-12-14 1:47 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From cvs-commit at gcc dot gnu dot org 2004-12-14 01:47 -------
Subject: Bug 18424
CVSROOT: /cvs/gcc
Module name: gcc
Branch: gcc-3_4-branch
Changes by: sayle@gcc.gnu.org 2004-12-14 01:47:35
Modified files:
gcc : ChangeLog dojump.c Makefile.in
Log message:
PR target/18002
PR middle-end/18424
Backport from mainline
2004-03-20 Richard Sandiford <rsandifo@redhat.com>
* Makefile.in (dojump.o): Depend on $(GGC_H) and dojump.h.
(GTFILES): Add $(srcdir)/dojump.h.
(gt-dojump.h): New dependency.
* dojump.c (and_reg, and_test, shift_test): New static variables.
(prefer_and_bit_test): New function.
(do_jump): Use it to choose between (X & (1 << C)) and (X >> C) & 1.
2004-03-21 Andrew Pinski <pinskia@gcc.gnu.org>
* dojump.c (prefer_and_bit_test): Fix which part of
the and_test is replaced.
2004-12-10 Roger Sayle <roger@eyesopen.com>
* dojump.c (do_jump): When attempting to reverse the effects of
fold_single_bit_test, we need to STRIP_NOPS and narrowing type
conversions, and handle BIT_XOR_EXPR that's used to invert the
sense of the single bit test.
Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=2.2326.2.730&r2=2.2326.2.731
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/dojump.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.9.4.1&r2=1.9.4.2
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/Makefile.in.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.1223.2.20&r2=1.1223.2.21
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (22 preceding siblings ...)
2004-12-14 1:47 ` cvs-commit at gcc dot gnu dot org
@ 2004-12-14 2:11 ` pinskia at gcc dot gnu dot org
2004-12-14 2:13 ` schlie at comcast dot net
` (7 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-14 2:11 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-14 02:11 -------
Fixed also.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
Target Milestone|4.0.0 |3.4.4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (23 preceding siblings ...)
2004-12-14 2:11 ` pinskia at gcc dot gnu dot org
@ 2004-12-14 2:13 ` schlie at comcast dot net
2004-12-14 5:04 ` ericw at evcohs dot com
` (6 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-12-14 2:13 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-12-14 02:13 -------
Thank you all; I would like to try to verify this on 4.0 as well,
once we can figure out how to get the avr target to build reliably.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (24 preceding siblings ...)
2004-12-14 2:13 ` schlie at comcast dot net
@ 2004-12-14 5:04 ` ericw at evcohs dot com
2004-12-14 12:33 ` schlie at comcast dot net
` (5 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: ericw at evcohs dot com @ 2004-12-14 5:04 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From ericw at evcohs dot com 2004-12-14 05:03 -------
Subject: Re: [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
On 14 Dec 2004 at 2:13, schlie at comcast dot net wrote:
>
> ------- Additional Comments From schlie at comcast dot net 2004-12-14 02:13 -------
>
> Thank you all; and would like to try to verfiy on 4.0 as well
> once we can figure out now to get the avr target to reliably build.
>
>
AFAIK, the avr target is supposed to build now for HEAD. I haven't tried a snapshot recently,
though...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
^ permalink raw reply [flat|nested] 33+ messages in thread
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
2004-11-11 2:35 [Bug c/18424] New: 3.4.3 ~6x+ performance regression vs 3.3.1, constant trees not being computed schlie at comcast dot net
` (25 preceding siblings ...)
2004-12-14 5:04 ` ericw at evcohs dot com
@ 2004-12-14 12:33 ` schlie at comcast dot net
2004-12-14 14:18 ` ericw at evcohs dot com
` (4 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: schlie at comcast dot net @ 2004-12-14 12:33 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-12-14 12:33 -------
Subject: Re: [3.4/4.0 Regression] ~6x+ performance
regression, constant trees not being computed.
Nope, unfortunately not as of yesterday, since reload.c was tweaked last
week.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
From: ericw at evcohs dot com @ 2004-12-14 14:18 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From ericw at evcohs dot com 2004-12-14 14:18 -------
Subject: Re: [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
On 14 Dec 2004 at 12:33, schlie at comcast dot net wrote:
>
> ------- Additional Comments From schlie at comcast dot net 2004-12-14 12:33 -------
> Subject: Re: [3.4/4.0 Regression] ~6x+ performance
> regression, constant trees not being computed.
>
> Nope, unfortunately not as of yesterday, since reload.c was tweaked last
> week.
Please file a separate bug report about this ASAP.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
From: giovannibajo at libero dot it @ 2004-12-14 23:20 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-12-14 23:20 -------
Subject: Re: [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
ericw at evcohs dot com <gcc-bugzilla@gcc.gnu.org> wrote:
>> Nope, unfortunately not as of yesterday, since reload.c was tweaked
>> last week.
>
> Please file a separate bug report about this ASAP.
It was already opened: PR 18887
Giovanni Bajo
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
From: schlie at comcast dot net @ 2004-12-21 7:59 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-12-21 07:59 -------
Problems with the 4.0 avr test results (some good, some bad, some odd):
000000c6 <main>:
int main (void){
c6: c8 ef ldi r28, 0xF8 ; 248
c8: d0 e1 ldi r29, 0x10 ; 16
ca: de bf out 0x3e, r29 ; 62
cc: cd bf out 0x3d, r28 ; 61
volatile char c;
volatile int i;
volatile long l;
/* char tests */
c = (c & (1 << 4));
ce: 89 81 ldd r24, Y+1 ; 0x01
d0: 80 71 andi r24, 0x10 ; 16 ; good.
d2: 89 83 std Y+1, r24 ; 0x01
if (c & (1 << 4))
d4: 89 81 ldd r24, Y+1 ; 0x01
d6: 84 ff sbrs r24, 4 ; good single bit test & branch
d8: 03 c0 rjmp .+6 ; 0xe0 <main+0x1a>
c = 1;
da: 81 e0 ldi r24, 0x01 ; 1
dc: 89 83 std Y+1, r24 ; 0x01
de: 01 c0 rjmp .+2 ; 0xe2 <main+0x1c>
else
c = 0;
e0: 19 82 std Y+1, r1 ; 0x01
c = (c & (1 << 4)) ? 1 : 0;
e2: 89 81 ldd r24, Y+1 ; 0x01
e4: 99 27 eor r25, r25
e6: 44 e0 ldi r20, 0x04 ; 4 ; bad: uses a shift loop, although the test is the same as above.
e8: 96 95 lsr r25
ea: 87 95 ror r24
ec: 4a 95 dec r20
ee: e1 f7 brne .-8 ; 0xe8 <main+0x22>
f0: 81 70 andi r24, 0x01 ; 1
f2: 89 83 std Y+1, r24 ; 0x01
c = sizeof(char);
f4: 81 e0 ldi r24, 0x01 ; 1
f6: 89 83 std Y+1, r24 ; 0x01
/* int tests */
i = i & (1 << 10);
f8: 8a 81 ldd r24, Y+2 ; 0x02
fa: 9b 81 ldd r25, Y+3 ; 0x03
fc: 80 70 andi r24, 0x00 ; 0 ; ok, but nicer if it recognized that only the high byte is significant.
fe: 94 70 andi r25, 0x04 ; 4
100: 8a 83 std Y+2, r24 ; 0x02
102: 9b 83 std Y+3, r25 ; 0x03
if (i & (1 << 10))
104: 8a 81 ldd r24, Y+2 ; 0x02
106: 9b 81 ldd r25, Y+3 ; 0x03
108: 9c 01 movw r18, r24 ; shouldn't be moving the operand? rest as above.
10a: 20 70 andi r18, 0x00 ; 0
10c: 34 70 andi r19, 0x04 ; 4
10e: 92 ff sbrs r25, 2
110: 05 c0 rjmp .+10 ; 0x11c <main+0x56>
i = 1;
112: 81 e0 ldi r24, 0x01 ; 1
114: 90 e0 ldi r25, 0x00 ; 0 ; r1 = 0 already?
116: 8a 83 std Y+2, r24 ; 0x02
118: 9b 83 std Y+3, r25 ; 0x03
11a: 02 c0 rjmp .+4 ; 0x120 <main+0x5a>
else
i = 0;
11c: 2a 83 std Y+2, r18 ; 0x02
11e: 3b 83 std Y+3, r19 ; 0x03 ; wrong, r19 = andi r19, 0x04 ?
i = (i & (1 << 10)) ? 1 : 0;
120: 8a 81 ldd r24, Y+2 ; 0x02
122: 9b 81 ldd r25, Y+3 ; 0x03
124: 89 2f mov r24, r25 ; nice, shifts by 8 first.
126: 99 27 eor r25, r25
128: 86 95 lsr r24
12a: 86 95 lsr r24
12c: 81 70 andi r24, 0x01 ; 1
12e: 90 70 andi r25, 0x00 ; 0 ; but then forgets it already set r25 to 0?
130: 8a 83 std Y+2, r24 ; 0x02
132: 9b 83 std Y+3, r25 ; 0x03
i = sizeof(int);
134: 82 e0 ldi r24, 0x02 ; 2
136: 90 e0 ldi r25, 0x00 ; 0
138: 8a 83 std Y+2, r24 ; 0x02
13a: 9b 83 std Y+3, r25 ; 0x03
/* long tests */
l = (l & ((long)1 << 26));
13c: 8c 81 ldd r24, Y+4 ; 0x04
13e: 9d 81 ldd r25, Y+5 ; 0x05
140: ae 81 ldd r26, Y+6 ; 0x06
142: bf 81 ldd r27, Y+7 ; 0x07
144: 80 70 andi r24, 0x00 ; 0 ; ok.
146: 90 70 andi r25, 0x00 ; 0
148: a0 70 andi r26, 0x00 ; 0
14a: b4 70 andi r27, 0x04 ; 4
14c: 8c 83 std Y+4, r24 ; 0x04
14e: 9d 83 std Y+5, r25 ; 0x05
150: ae 83 std Y+6, r26 ; 0x06
152: bf 83 std Y+7, r27 ; 0x07
if (l & ((long)1 << 26))
154: 8c 81 ldd r24, Y+4 ; 0x04
156: 9d 81 ldd r25, Y+5 ; 0x05
158: ae 81 ldd r26, Y+6 ; 0x06
15a: bf 81 ldd r27, Y+7 ; 0x07
15c: 9c 01 movw r18, r24 ; again unnecessarily moving things?
15e: ad 01 movw r20, r26
160: 20 70 andi r18, 0x00 ; 0 ; very odd, both &'s
162: 30 70 andi r19, 0x00 ; 0
164: 40 70 andi r20, 0x00 ; 0
166: 54 70 andi r21, 0x04 ; 4
168: b2 ff sbrs r27, 2 ; and tests the significant bit, didn't need to do above.
16a: 09 c0 rjmp .+18 ; 0x17e <main+0xb8>
l = 1;
16c: 81 e0 ldi r24, 0x01 ; 1
16e: 90 e0 ldi r25, 0x00 ; 0 ; again r1 = 0 already?
170: a0 e0 ldi r26, 0x00 ; 0
172: b0 e0 ldi r27, 0x00 ; 0
174: 8c 83 std Y+4, r24 ; 0x04
176: 9d 83 std Y+5, r25 ; 0x05
178: ae 83 std Y+6, r26 ; 0x06
17a: bf 83 std Y+7, r27 ; 0x07
17c: 04 c0 rjmp .+8 ; 0x186 <main+0xc0>
else
l = 0;
17e: 2c 83 std Y+4, r18 ; 0x04
180: 3d 83 std Y+5, r19 ; 0x05
182: 4e 83 std Y+6, r20 ; 0x06
184: 5f 83 std Y+7, r21 ; 0x07 ; wrong r21 = andi r21, 0x04
l = (l & ((long)1 << 26)) ? 1 : 0;
186: 8c 81 ldd r24, Y+4 ; 0x04
188: 9d 81 ldd r25, Y+5 ; 0x05
18a: ae 81 ldd r26, Y+6 ; 0x06
18c: bf 81 ldd r27, Y+7 ; 0x07
18e: 2a e1 ldi r18, 0x1A ; 26
190: b6 95 lsr r27 ; not good, big shift loop, should & or just test most sig. byte.
192: a7 95 ror r26
194: 97 95 ror r25
196: 87 95 ror r24
198: 2a 95 dec r18
19a: d1 f7 brne .-12 ; 0x190 <main+0xca>
19c: aa 27 eor r26, r26
19e: 97 fd sbrc r25, 7
1a0: a0 95 com r26
1a2: ba 2f mov r27, r26
1a4: 81 70 andi r24, 0x01 ; 1
1a6: 90 70 andi r25, 0x00 ; 0
1a8: a0 70 andi r26, 0x00 ; 0
1aa: b0 70 andi r27, 0x00 ; 0
1ac: 8c 83 std Y+4, r24 ; 0x04
1ae: 9d 83 std Y+5, r25 ; 0x05
1b0: ae 83 std Y+6, r26 ; 0x06
1b2: bf 83 std Y+7, r27 ; 0x07
l = sizeof(long);
1a8: 84 e0 ldi r24, 0x04 ; 4
1aa: 90 e0 ldi r25, 0x00 ; 0
1ac: a0 e0 ldi r26, 0x00 ; 0
1ae: b0 e0 ldi r27, 0x00 ; 0
1b0: 8c 83 std Y+4, r24 ; 0x04
1b2: 9d 83 std Y+5, r25 ; 0x05
1b4: ae 83 std Y+6, r26 ; 0x06
1b6: bf 83 std Y+7, r27 ; 0x07
return 0;
}
1b8: 80 e0 ldi r24, 0x00 ; 0
1ba: 90 e0 ldi r25, 0x00 ; 0
1bc: 0c 94 e0 00 jmp 0x1c0 <_exit>
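For anyone who wants to reproduce the comparison on another target, here is a minimal sketch of the long test from the listing above, pulled out into three standalone functions. The function names, the BIT macro and check_equivalence() are made up for illustration and are not part of the original test file. The shift-and-mask form is only what the generated shift loop appears to compute; all three forms should return the same value, so ideally a compiler would emit the same single-bit test for each of them.

```c
#include <assert.h>
#include <stddef.h>

/* Bit index used by the long tests in the listing (hypothetical macro name). */
#define BIT 26

/* Form 1: if/else on the masked value. */
long test_branch(long l)
{
    if (l & ((long)1 << BIT))
        return 1;
    else
        return 0;
}

/* Form 2: the ternary form from the listing. */
long test_ternary(long l)
{
    return (l & ((long)1 << BIT)) ? 1 : 0;
}

/* Form 3: shift right, then mask the low bit. The cast to unsigned long
   keeps the shift well defined for negative inputs. */
long test_shift(long l)
{
    return (long)(((unsigned long)l >> BIT) & 1UL);
}

/* Hypothetical helper: asserts that the three forms agree on sample inputs. */
void check_equivalence(void)
{
    static const long samples[] = {
        0L, 1L, (long)1 << 25, (long)1 << 26, (long)1 << 27,
        0x7FFFFFFFL, -1L, ~((long)1 << 26)
    };
    size_t n = sizeof samples / sizeof samples[0];

    for (size_t k = 0; k < n; k++) {
        long v = samples[k];
        assert(test_branch(v) == test_ternary(v));
        assert(test_ternary(v) == test_shift(v));
    }
}
```

Compiling each function separately with -Os and disassembling the result should make it easier to see which form triggers the shift loop, without the volatile loads and stores getting in the way.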
--
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|FIXED |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
From: schlie at comcast dot net @ 2004-12-21 8:02 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From schlie at comcast dot net 2004-12-21 08:02 -------
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424
* [Bug middle-end/18424] [3.4/4.0 Regression] ~6x+ performance regression, constant trees not being computed.
From: pinskia at gcc dot gnu dot org @ 2004-12-21 16:02 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-21 16:02 -------
No, the original problem was fixed; please open a new bug about the new problem. I would not be
surprised if the new problem is not a regression.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18424