public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR
@ 2006-10-23 11:39 gcc-bugzilla at gcc dot gnu dot org
2006-10-24 7:32 ` [Bug target/29560] " r dot leitgeb at x-pin dot com
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: gcc-bugzilla at gcc dot gnu dot org @ 2006-10-23 11:39 UTC (permalink / raw)
To: gcc-bugs
For setting individual bits of a register, the construct
unsigned char BitLocation = Whatever;
REG |= (1<<BitLocation);
is commonly used. avr-gcc fails to realize that the target register
is only 8 bits wide and performs an unnecessary 16-Bit shift of 1.
Even if this code snippet is replaced by the following:
unsigned char BitLocation = Whatever;
const unsigned char one = 1;
REG |= (one << BitLocation);
The inefficient code will be generated, even with the -Os flag.
I suppose this is because the << operator is not defined for 8-bit
wide data types.
Environment:
System: Darwin variable.local 7.9.0 Darwin Kernel Version 7.9.0: Wed Mar 30
20:11:17 PST 2005; root:xnu/xnu-517.12.7.obj~1/RELEASE_PPC Power Macintosh
powerpc
host: powerpc-apple-darwin7.9.0
build: powerpc-apple-darwin7.9.0
target: avr-unknown-none
configured with: ../configure --target=avr --prefix=/usr/local/atmel
--enable-languages=c --disable-libssp --enable-__cxa_atexit
--enable-clocale=gnu --disable-nls : (reconfigured) ../configure --target=avr
--prefix=/usr/local/avr --enable-languages=c --disable-libssp
--enable-__cxa_atexit --enable-clocale=gnu --disable-nls
How-To-Repeat:
Use the following test program, compile with -Os -S to see assembler
code generated by avr-gcc:
#include <avr/io.h>
void setpin(unsigned char pin, unsigned char val) {
if (val == 1) PORTC |= (1<<pin);
else PORTC &= ~(1<<pin);
}
void setpin1(unsigned char pin, unsigned char val) {
const unsigned char one = 1;
if (val == 1) PORTC |= (one<<pin);
else PORTC &= ~(one<<pin);
}
int main(void) {
setpin(3, 1);
setpin1(3, 1);
return 0;
}
------- Comment #1 from r dot leitgeb at x-pin dot com 2006-10-23 11:39 -------
Fix:
Define shift operators for 8 bit wide data types, use if possible.
I'm not a compiler expert but I know that the multiply operator
detects the data types involved, so it should be doable with the
shift operators as well.
--
Summary: Poor optimization for character shifts on Atmel AVR
Product: gcc
Version: 4.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: r dot leitgeb at x-pin dot com
GCC build triplet: powerpc-apple-darwin7.9.0
GCC host triplet: powerpc-apple-darwin7.9.0
GCC target triplet: avr-unknown-none
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] Poor optimization for character shifts on Atmel AVR
2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
@ 2006-10-24 7:32 ` r dot leitgeb at x-pin dot com
2009-08-24 17:18 ` [Bug target/29560] [avr] Poor optimization for byte shifts eric dot weddington at atmel dot com
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: r dot leitgeb at x-pin dot com @ 2006-10-24 7:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from r dot leitgeb at x-pin dot com 2006-10-24 07:31 -------
Here's an excerpt of the assembly code obtained with -Os -S
with some comments from me:
----------------------------------------------------
setpin:
/* prologue: frame size=0 */
/* prologue end (size=0) */
mov r20,r24
clr r21
cpi r22,lo8(1) <--- The compare is done byte wide
brne .L2
in r18,53-0x20
ldi r24,lo8(1) <--- Here we load 1 sixteen bit wide
ldi r25,hi8(1)
rjmp 2f
1: lsl r24 <--- Here we shift 1 sixteen bit wide
rol r25
2: dec r20
brpl 1b
or r18,r24 <--- r25 is not used. Why would we compute it ?
rjmp .L6
.L2:
in r18,53-0x20
----------------------------------------------------
Cheers
Rudi
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
2006-10-24 7:32 ` [Bug target/29560] " r dot leitgeb at x-pin dot com
@ 2009-08-24 17:18 ` eric dot weddington at atmel dot com
2009-12-31 3:16 ` hutchinsonandy at gcc dot gnu dot org
2010-02-18 13:28 ` Rudolf dot Leitgeb at gmx dot at
3 siblings, 0 replies; 11+ messages in thread
From: eric dot weddington at atmel dot com @ 2009-08-24 17:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from eric dot weddington at atmel dot com 2009-08-24 17:18 -------
Confirmed on 4.3.2.
--
eric dot weddington at atmel dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
GCC build triplet|powerpc-apple-darwin7.9.0 |
GCC host triplet|powerpc-apple-darwin7.9.0 |
Keywords| |missed-optimization
Known to fail| |4.1.1 4.3.2
Last reconfirmed|0000-00-00 00:00:00 |2009-08-24 17:18:30
date| |
Summary|Poor optimization for |[avr] Poor optimization for
|character shifts on Atmel |byte shifts
|AVR |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
2006-10-24 7:32 ` [Bug target/29560] " r dot leitgeb at x-pin dot com
2009-08-24 17:18 ` [Bug target/29560] [avr] Poor optimization for byte shifts eric dot weddington at atmel dot com
@ 2009-12-31 3:16 ` hutchinsonandy at gcc dot gnu dot org
2010-02-18 13:28 ` Rudolf dot Leitgeb at gmx dot at
3 siblings, 0 replies; 11+ messages in thread
From: hutchinsonandy at gcc dot gnu dot org @ 2009-12-31 3:16 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from hutchinsonandy at gcc dot gnu dot org 2009-12-31 03:16 -------
The same occurs in 4.5
I think the bug is invalid
The expression 1<< pin will be promoted. This produces a defined result if
pin>7 and <15
The expression can not then be lower to 8 bit shift - since a shift by >7 is
undefined.
If you use pin &7, the shift is indeed 8 bit shift. Though it still loads a int
(HIMODE) value of 1 that should have been reduced to QIMODE.
setpinx:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
mov r25,r24
andi r25,lo8(7)
ldi r18,lo8(1)
ldi r19,hi8(1)
mov r24,r18
rjmp 2f
1: lsl r24
2: dec r25
brpl 1b
cpi r22,lo8(1)
brne .L7
in r25,53-0x20
or r25,r24
out 53-0x20,r25
ret
.L7:
in r25,53-0x20
com r24
and r24,r25
out 53-0x20,r24
ret
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
` (2 preceding siblings ...)
2009-12-31 3:16 ` hutchinsonandy at gcc dot gnu dot org
@ 2010-02-18 13:28 ` Rudolf dot Leitgeb at gmx dot at
3 siblings, 0 replies; 11+ messages in thread
From: Rudolf dot Leitgeb at gmx dot at @ 2010-02-18 13:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from Rudolf dot Leitgeb at gmx dot at 2010-02-18 13:28 -------
Right away: I am NOT an expert in compiler or optimizer design, so please bear
with me.
I understand, that (((unsigned char)1) << ((unsigned char)something)) may need
more than 8 bits for representation and that gcc's default int size on the
ATmega platform is 16 bits. But the result is assigned to an 8 bit value. I
take it that there is no way to back propagate this potential optimization from
the assignment to the shifting step. If you look in my assembly code, r25 is
computed with great effort yet never transferred anywhere.
The trick with &7 is nice but introduces another unnecessary AND operation. And
it is only correct if the input numebr is guaranteed to be between 0 and 7. Am
I correct that this whole optimization issue comes from the fact that ATmega is
an 8 bit architecture yet gcc's int on that platform is 16 bit?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
[not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
` (4 preceding siblings ...)
2011-08-10 8:59 ` gjl at gcc dot gnu.org
@ 2011-08-10 9:20 ` gjl at gcc dot gnu.org
5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-08-10 9:20 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
Georg-Johann Lay <gjl at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
Known to fail|4.7.0 |
--- Comment #11 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-08-10 09:19:49 UTC ---
Closed as FIXED for 4.7.0 trunk with patch from comment #10.
The test case from attachment 24816 compiles with -Os -dp to:
shift1:
rjmp 2f ; 22 *ashlqi3/1 [length = 5]
1: lsl r24
2: dec r22
brpl 1b
ret ; 27 return [length = 1]
shift2:
mov r25,r24 ; 19 *movqi/1 [length = 1]
rjmp 2f ; 15 *ashlqi3/1 [length = 5]
1: lsl r25
2: dec r22
brpl 1b
sts y,r25 ; 16 *movqi/3 [length = 2]
ret ; 22 return [length = 1]
setpin:
in r25,44-0x20 ; 11 *movqi/4 [length = 1]
ldi r18,lo8(1) ; 13 *movhi/4 [length = 2]
ldi r19,hi8(1)
mov r0,r24 ; 49 *ashlqi3/1 [length = 5]
rjmp 2f
1: lsl r18
2: dec r0
brpl 1b
cpi r22,lo8(1) ; 7 *cmpqi/3 [length = 1]
brne .L4 ; 8 branch [length = 1]
or r25,r18 ; 15 iorqi3/1 [length = 1]
out 44-0x20,r25 ; 17 *movqi/3 [length = 1]
ret ; 46 return [length = 1]
.L4:
com r18 ; 27 one_cmplqi2 [length = 1]
and r18,r25 ; 28 andqi3/1 [length = 1]
out 44-0x20,r18 ; 30 *movqi/3 [length = 1]
ret ; 48 return [length = 1]
Notice the *ashlqi3 insns.
The fix uses combine patterns and an RTL peephole to transform
16-bit shift to 8-bit shift if applicable. There will be
situations where optimization opportunities are not noticed,
e.g. if a move occurs after the shift so that this move will
get a REG_UNUSED note but not the shift loop itself.
The fix just replaces 16-bit loop by 8-bit loop; there is no
back-propagation of the information.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
[not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
` (3 preceding siblings ...)
2011-07-25 12:49 ` gjl at gcc dot gnu.org
@ 2011-08-10 8:59 ` gjl at gcc dot gnu.org
2011-08-10 9:20 ` gjl at gcc dot gnu.org
5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-08-10 8:59 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
--- Comment #10 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-08-10 08:58:07 UTC ---
Author: gjl
Date: Wed Aug 10 08:58:03 2011
New Revision: 177616
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=177616
Log:
PR target/29560
* config/avr/avr.md (*ashlhiqi3): New insn-and-split.
(*ashl<extend_prefix>qihiqi3): New insn-and-splits.
(*ashl<extend_prefix>qihiqi3.mem): New insn-and-splits.
Add peephole2 to map ashlhi3 to ashlqi3 if high part of
shift target is unused.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/avr/avr.md
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
[not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2011-07-23 14:55 ` gjl at gcc dot gnu.org
@ 2011-07-25 12:49 ` gjl at gcc dot gnu.org
2011-08-10 8:59 ` gjl at gcc dot gnu.org
2011-08-10 9:20 ` gjl at gcc dot gnu.org
5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-07-25 12:49 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
--- Comment #9 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-07-25 12:48:29 UTC ---
Created attachment 24827
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24827
Fix PR29560 by adding peephole2 pattern.
PR target/29560
* config/avr/avr.md: Add peephole2 to map ashlhi3 to ashlqi3 if
high part of shift target is unused.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
[not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
2011-06-21 18:25 ` gjl at gcc dot gnu.org
2011-07-23 14:51 ` gjl at gcc dot gnu.org
@ 2011-07-23 14:55 ` gjl at gcc dot gnu.org
2011-07-25 12:49 ` gjl at gcc dot gnu.org
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-07-23 14:55 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
Georg-Johann Lay <gjl at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |NEW
Resolution|INVALID |
Target Milestone|--- |4.7.0
--- Comment #8 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-07-23 14:53:52 UTC ---
Set to NEW as explained in previous post.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
[not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
2011-06-21 18:25 ` gjl at gcc dot gnu.org
@ 2011-07-23 14:51 ` gjl at gcc dot gnu.org
2011-07-23 14:55 ` gjl at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-07-23 14:51 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
--- Comment #7 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-07-23 14:50:58 UTC ---
Created attachment 24816
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24816
Test case for 16-bit shifts that use low part only
Reopened this optimization issue.
For the attached test case, there are sometimes REG_UNUSED notes for the high
part like with avr-gcc-4.6.1 -Os -dP
shift1:
; (insn 8 4 17 (set (reg:HI 24 r24 [50])
; (ashift:HI (reg:HI 24 r24 [ x ])
; (reg/v:QI 22 r22 [orig:47 s ] [47]))) foo.c:3 68 {ashlhi3}
; (expr_list:REG_DEAD (reg/v:QI 22 r22 [orig:47 s ] [47])
; (expr_list:REG_UNUSED (reg:QI 25 r25)
; (nil))))
rjmp 2f ; 8 ashlhi3/1 [length = 6]
1: lsl r24
rol r25
2: dec r22
brpl 1b
So that there is a way to map ashlhi3 to ashlqi3, i.e. 16-bit shift to a 8-bit
shift.
Unfortunately, these notes are not always present like in shift2:
shift2:
; (insn 15 4 8 (set (reg:HI 18 r18)
; (reg:HI 24 r24 [ x ])) foo.c:10 10 {*movhi}
; (nil))
movw r18,r24 ; 15 *movhi/1 [length = 1]
; (insn 8 15 9 (set (reg:HI 18 r18)
; (ashift:HI (reg:HI 18 r18)
; (reg/v:QI 22 r22 [orig:46 s ] [46]))) foo.c:10 68 {ashlhi3}
; (expr_list:REG_DEAD (reg/v:QI 22 r22 [orig:46 s ] [46])
; (nil)))
rjmp 2f ; 8 ashlhi3/1 [length = 6]
1: lsl r18
rol r19
2: dec r22
brpl 1b
The notes are not present in pass .split2 so a split won't help. The notes
appear to be available in .peephole2 so that could be a fix. Moreover, the
notes won't be back-propagated so that an optimization will cover at most one
insn: the shift insn.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug target/29560] [avr] Poor optimization for byte shifts
[not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
@ 2011-06-21 18:25 ` gjl at gcc dot gnu.org
2011-07-23 14:51 ` gjl at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-06-21 18:25 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
Georg-Johann Lay <gjl at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |INVALID
--- Comment #6 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-06-21 18:24:44 UTC ---
Closing this PR as invalid.
As Andy already explained, for shift offsets >7 and <15 there has to be a
defined result because int for avr-gcc is 16 bits, and this is actually *not* a
byte shift.
With alternative code that tested for these offsets at runtime, jumped around
and loaded 0 if appropriate, you were not better of than with the current code
-- in the contrary.
Note that gcc deflates you function calls to just one instruction in main.
So maybe what you really want here is to make these functions static inline so
that you do no more see the code in the functions. Presumably you always call
this functions with compile time constants.
If you really need to quench out the last tick and call with non-constants, you
could invent inline assembler with an instruction sequence that is similar to
(1 << (R24 & 7)), perhaps together with __builtin_constant_p magic:
LDI R24, 1
SBRC R24, 1
LDI R24, 4
SBRC R24, 0
LSL R24
SBRC R24, 2
SWAP R24
Note that with -mint8, int is 8 bits. That is still present as option, but no
more supported and perhaps non-functional.
--
Just for reference; here is a source without external #include:
#define PORTC (*((unsigned char volatile*) 53))
void setpin(unsigned char pin, unsigned char val)
{
if (val == 1) PORTC |= (1<<pin);
else PORTC &= ~(1<<pin);
}
void setpin1(unsigned char pin, unsigned char val)
{
const unsigned char one = 1;
if (val == 1) PORTC |= (one<<(pin));
else PORTC &= ~(one<<pin);
}
int main(void)
{
setpin(3, 1);
setpin1(3, 1);
return 0;
}
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2011-08-10 9:20 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
2006-10-24 7:32 ` [Bug target/29560] " r dot leitgeb at x-pin dot com
2009-08-24 17:18 ` [Bug target/29560] [avr] Poor optimization for byte shifts eric dot weddington at atmel dot com
2009-12-31 3:16 ` hutchinsonandy at gcc dot gnu dot org
2010-02-18 13:28 ` Rudolf dot Leitgeb at gmx dot at
[not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
2011-06-21 18:25 ` gjl at gcc dot gnu.org
2011-07-23 14:51 ` gjl at gcc dot gnu.org
2011-07-23 14:55 ` gjl at gcc dot gnu.org
2011-07-25 12:49 ` gjl at gcc dot gnu.org
2011-08-10 8:59 ` gjl at gcc dot gnu.org
2011-08-10 9:20 ` gjl at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).