[Bug target/29560] New: Poor optimization for character shifts on Atmel AVR

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug target/29560]  New: Poor optimization for character shifts on Atmel AVR
@ 2006-10-23 11:39 gcc-bugzilla at gcc dot gnu dot org
  2006-10-24  7:32 ` [Bug target/29560] " r dot leitgeb at x-pin dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: gcc-bugzilla at gcc dot gnu dot org @ 2006-10-23 11:39 UTC (permalink / raw)
  To: gcc-bugs


        For setting individual bits of a register, the construct
        unsigned char BitLocation = Whatever;
        REG |= (1<<BitLocation); 
        is commonly used. avr-gcc fails to realize that the target register
        is only 8 bits wide and performs an unnecessary 16-Bit shift of 1.
        Even if this code snippet is replaced by the following:
        unsigned char BitLocation = Whatever;
        const unsigned char one = 1;
        REG |= (one << BitLocation);
        The inefficient code will be generated, even with the -Os flag.
        I suppose this is because the << operator is not defined for 8-bit
        wide data types.

Environment:
System: Darwin variable.local 7.9.0 Darwin Kernel Version 7.9.0: Wed Mar 30
20:11:17 PST 2005; root:xnu/xnu-517.12.7.obj~1/RELEASE_PPC Power Macintosh
powerpc



host: powerpc-apple-darwin7.9.0
build: powerpc-apple-darwin7.9.0
target: avr-unknown-none
configured with: ../configure --target=avr --prefix=/usr/local/atmel
--enable-languages=c --disable-libssp --enable-__cxa_atexit
--enable-clocale=gnu --disable-nls : (reconfigured) ../configure --target=avr
--prefix=/usr/local/avr --enable-languages=c --disable-libssp
--enable-__cxa_atexit --enable-clocale=gnu --disable-nls

How-To-Repeat:
        Use the following test program, compile with -Os -S to see assembler 
        code generated by avr-gcc:
        #include <avr/io.h>

        void setpin(unsigned char pin, unsigned char val) {
                if (val == 1) PORTC |= (1<<pin);
                else PORTC &= ~(1<<pin);
        }

        void setpin1(unsigned char pin, unsigned char val) {
                const unsigned char one = 1;
                if (val == 1) PORTC |= (one<<pin);
                else PORTC &= ~(one<<pin);
        }

        int main(void) {
                setpin(3, 1);
                setpin1(3, 1);
                return 0;
        }


------- Comment #1 from r dot leitgeb at x-pin dot com  2006-10-23 11:39 -------
Fix:
        Define shift operators for 8 bit wide data types, use if possible.
        I'm not a compiler expert but I know that the multiply operator
        detects the data types involved, so it should be doable with the
        shift operators as well.


-- 
           Summary: Poor optimization for character shifts on Atmel AVR
           Product: gcc
           Version: 4.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: r dot leitgeb at x-pin dot com
 GCC build triplet: powerpc-apple-darwin7.9.0
  GCC host triplet: powerpc-apple-darwin7.9.0
GCC target triplet: avr-unknown-none


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] Poor optimization for character shifts on Atmel AVR
  2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
@ 2006-10-24  7:32 ` r dot leitgeb at x-pin dot com
  2009-08-24 17:18 ` [Bug target/29560] [avr] Poor optimization for byte shifts eric dot weddington at atmel dot com
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: r dot leitgeb at x-pin dot com @ 2006-10-24  7:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from r dot leitgeb at x-pin dot com  2006-10-24 07:31 -------
Here's an excerpt of the assembly code obtained with -Os -S
with some comments from me:

----------------------------------------------------
setpin:
/* prologue: frame size=0 */
/* prologue end (size=0) */
        mov r20,r24
        clr r21
        cpi r22,lo8(1)    <--- The compare is done byte wide
        brne .L2
        in r18,53-0x20
        ldi r24,lo8(1)    <--- Here we load 1 sixteen bit wide
        ldi r25,hi8(1)
        rjmp 2f
1:      lsl r24           <--- Here we shift 1 sixteen bit wide
        rol r25
2:      dec r20
        brpl 1b
        or r18,r24        <--- r25 is not used. Why would we compute it ?
        rjmp .L6
.L2:
        in r18,53-0x20
----------------------------------------------------

Cheers

Rudi


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
  2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
  2006-10-24  7:32 ` [Bug target/29560] " r dot leitgeb at x-pin dot com
@ 2009-08-24 17:18 ` eric dot weddington at atmel dot com
  2009-12-31  3:16 ` hutchinsonandy at gcc dot gnu dot org
  2010-02-18 13:28 ` Rudolf dot Leitgeb at gmx dot at
  3 siblings, 0 replies; 11+ messages in thread
From: eric dot weddington at atmel dot com @ 2009-08-24 17:18 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from eric dot weddington at atmel dot com  2009-08-24 17:18 -------
Confirmed on 4.3.2.


-- 

eric dot weddington at atmel dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
  GCC build triplet|powerpc-apple-darwin7.9.0   |
   GCC host triplet|powerpc-apple-darwin7.9.0   |
           Keywords|                            |missed-optimization
      Known to fail|                            |4.1.1 4.3.2
   Last reconfirmed|0000-00-00 00:00:00         |2009-08-24 17:18:30
               date|                            |
            Summary|Poor optimization for       |[avr] Poor optimization for
                   |character shifts on Atmel   |byte shifts
                   |AVR                         |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
  2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
  2006-10-24  7:32 ` [Bug target/29560] " r dot leitgeb at x-pin dot com
  2009-08-24 17:18 ` [Bug target/29560] [avr] Poor optimization for byte shifts eric dot weddington at atmel dot com
@ 2009-12-31  3:16 ` hutchinsonandy at gcc dot gnu dot org
  2010-02-18 13:28 ` Rudolf dot Leitgeb at gmx dot at
  3 siblings, 0 replies; 11+ messages in thread
From: hutchinsonandy at gcc dot gnu dot org @ 2009-12-31  3:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from hutchinsonandy at gcc dot gnu dot org  2009-12-31 03:16 -------
The same occurs in 4.5

I think the bug  is invalid

The expression 1<< pin will be promoted. This produces a defined result if
pin>7 and <15
The expression can not then be lower to 8 bit shift - since a shift by >7 is
undefined.

If you use pin &7, the shift is indeed 8 bit shift. Though it still loads a int
(HIMODE) value of 1 that should have been reduced to QIMODE.

        setpinx:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
        mov r25,r24
        andi r25,lo8(7)
        ldi r18,lo8(1)
        ldi r19,hi8(1)
        mov r24,r18
        rjmp 2f
1:      lsl r24
2:      dec r25
        brpl 1b
        cpi r22,lo8(1)
        brne .L7
        in r25,53-0x20
        or r25,r24
        out 53-0x20,r25
        ret
.L7:
        in r25,53-0x20
        com r24
        and r24,r25
        out 53-0x20,r24
        ret


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
  2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2009-12-31  3:16 ` hutchinsonandy at gcc dot gnu dot org
@ 2010-02-18 13:28 ` Rudolf dot Leitgeb at gmx dot at
  3 siblings, 0 replies; 11+ messages in thread
From: Rudolf dot Leitgeb at gmx dot at @ 2010-02-18 13:28 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #5 from Rudolf dot Leitgeb at gmx dot at  2010-02-18 13:28 -------
Right away: I am NOT an expert in compiler or optimizer design, so please bear
with me.

I understand, that (((unsigned char)1) << ((unsigned char)something)) may need
more than 8 bits for representation and that gcc's default int size on the
ATmega platform is 16 bits. But the result is assigned to an 8 bit value. I
take it that there is no way to back propagate this potential optimization from
the assignment to the shifting step. If you look in my assembly code, r25 is
computed with great effort yet never transferred anywhere.

The trick with &7 is nice but introduces another unnecessary AND operation. And
it is only correct if the input numebr is guaranteed to be between 0 and 7. Am
I correct that this whole optimization issue comes from the fact that ATmega is
an 8 bit architecture yet gcc's int on that platform is 16 bit?

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
       [not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2011-08-10  8:59 ` gjl at gcc dot gnu.org
@ 2011-08-10  9:20 ` gjl at gcc dot gnu.org
  5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-08-10  9:20 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
      Known to fail|4.7.0                       |

--- Comment #11 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-08-10 09:19:49 UTC ---
Closed as FIXED for 4.7.0 trunk with patch from comment #10.

The test case from attachment 24816 compiles with -Os -dp to:

shift1:
    rjmp 2f     ;  22    *ashlqi3/1    [length = 5]
1:    lsl r24
2:    dec r22
    brpl 1b
    ret     ;  27    return    [length = 1]

shift2:
    mov r25,r24     ;  19    *movqi/1    [length = 1]
    rjmp 2f     ;  15    *ashlqi3/1    [length = 5]
1:    lsl r25
2:    dec r22
    brpl 1b
    sts y,r25     ;  16    *movqi/3    [length = 2]
    ret     ;  22    return    [length = 1]

setpin:
    in r25,44-0x20     ;  11    *movqi/4    [length = 1]
    ldi r18,lo8(1)     ;  13    *movhi/4    [length = 2]
    ldi r19,hi8(1)
    mov r0,r24     ;  49    *ashlqi3/1    [length = 5]
    rjmp 2f
1:    lsl r18
2:    dec r0
    brpl 1b
    cpi r22,lo8(1)     ;  7    *cmpqi/3    [length = 1]
    brne .L4     ;  8    branch    [length = 1]
    or r25,r18     ;  15    iorqi3/1    [length = 1]
    out 44-0x20,r25     ;  17    *movqi/3    [length = 1]
    ret     ;  46    return    [length = 1]
.L4:
    com r18     ;  27    one_cmplqi2    [length = 1]
    and r18,r25     ;  28    andqi3/1    [length = 1]
    out 44-0x20,r18     ;  30    *movqi/3    [length = 1]
    ret     ;  48    return    [length = 1]


Notice the *ashlqi3 insns.

The fix uses combine patterns and an RTL peephole to transform
16-bit shift to 8-bit shift if applicable.  There will be
situations where optimization opportunities are not noticed,
e.g. if a move occurs after the shift so that this move will
get a REG_UNUSED note but not the shift loop itself.

The fix just replaces 16-bit loop by 8-bit loop; there is no
back-propagation of the information.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
       [not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2011-07-25 12:49 ` gjl at gcc dot gnu.org
@ 2011-08-10  8:59 ` gjl at gcc dot gnu.org
  2011-08-10  9:20 ` gjl at gcc dot gnu.org
  5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-08-10  8:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560

--- Comment #10 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-08-10 08:58:07 UTC ---
Author: gjl
Date: Wed Aug 10 08:58:03 2011
New Revision: 177616

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=177616
Log:

    PR target/29560
    * config/avr/avr.md (*ashlhiqi3): New insn-and-split.
    (*ashl<extend_prefix>qihiqi3): New insn-and-splits.
    (*ashl<extend_prefix>qihiqi3.mem): New insn-and-splits.
    Add peephole2 to map ashlhi3 to ashlqi3 if high part of
    shift target is unused.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr.md


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
       [not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2011-07-23 14:55 ` gjl at gcc dot gnu.org
@ 2011-07-25 12:49 ` gjl at gcc dot gnu.org
  2011-08-10  8:59 ` gjl at gcc dot gnu.org
  2011-08-10  9:20 ` gjl at gcc dot gnu.org
  5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-07-25 12:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560

--- Comment #9 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-07-25 12:48:29 UTC ---
Created attachment 24827
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24827
Fix PR29560 by adding peephole2 pattern.


    PR target/29560
    * config/avr/avr.md: Add peephole2 to map ashlhi3 to ashlqi3 if
    high part of shift target is unused.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
       [not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
  2011-06-21 18:25 ` gjl at gcc dot gnu.org
  2011-07-23 14:51 ` gjl at gcc dot gnu.org
@ 2011-07-23 14:55 ` gjl at gcc dot gnu.org
  2011-07-25 12:49 ` gjl at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-07-23 14:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |NEW
         Resolution|INVALID                     |
   Target Milestone|---                         |4.7.0

--- Comment #8 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-07-23 14:53:52 UTC ---
Set to NEW as explained in previous post.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
       [not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
  2011-06-21 18:25 ` gjl at gcc dot gnu.org
@ 2011-07-23 14:51 ` gjl at gcc dot gnu.org
  2011-07-23 14:55 ` gjl at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-07-23 14:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560

--- Comment #7 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-07-23 14:50:58 UTC ---
Created attachment 24816
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24816
Test case for 16-bit shifts that use low part only

Reopened this optimization issue.

For the attached test case, there are sometimes REG_UNUSED notes for the high
part like with avr-gcc-4.6.1 -Os -dP

shift1:
 ; (insn 8 4 17 (set (reg:HI 24 r24 [50])
 ;         (ashift:HI (reg:HI 24 r24 [ x ])
 ;             (reg/v:QI 22 r22 [orig:47 s ] [47]))) foo.c:3 68 {ashlhi3}
 ;      (expr_list:REG_DEAD (reg/v:QI 22 r22 [orig:47 s ] [47])
 ;         (expr_list:REG_UNUSED (reg:QI 25 r25)
 ;             (nil))))
    rjmp 2f     ;  8    ashlhi3/1    [length = 6]
1:    lsl r24
    rol r25
2:    dec r22
    brpl 1b

So that there is a way to map ashlhi3 to ashlqi3, i.e. 16-bit shift to a 8-bit
shift.

Unfortunately, these notes are not always present like in shift2:

shift2:
 ; (insn 15 4 8 (set (reg:HI 18 r18)
 ;         (reg:HI 24 r24 [ x ])) foo.c:10 10 {*movhi}
 ;      (nil))
    movw r18,r24     ;  15    *movhi/1    [length = 1]
 ; (insn 8 15 9 (set (reg:HI 18 r18)
 ;         (ashift:HI (reg:HI 18 r18)
 ;             (reg/v:QI 22 r22 [orig:46 s ] [46]))) foo.c:10 68 {ashlhi3}
 ;      (expr_list:REG_DEAD (reg/v:QI 22 r22 [orig:46 s ] [46])
 ;         (nil)))
    rjmp 2f     ;  8    ashlhi3/1    [length = 6]
1:    lsl r18
    rol r19
2:    dec r22
    brpl 1b

The notes are not present in pass .split2 so a split won't help. The notes
appear to be available in .peephole2 so that could be a fix. Moreover, the
notes won't be back-propagated so that an optimization will cover at most one
insn: the shift insn.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/29560] [avr] Poor optimization for byte shifts
       [not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
@ 2011-06-21 18:25 ` gjl at gcc dot gnu.org
  2011-07-23 14:51 ` gjl at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: gjl at gcc dot gnu.org @ 2011-06-21 18:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560

Georg-Johann Lay <gjl at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

--- Comment #6 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-06-21 18:24:44 UTC ---
Closing this PR as invalid.

As Andy already explained, for shift offsets >7 and <15 there has to be a
defined result because int for avr-gcc is 16 bits, and this is actually *not* a
byte shift.

With alternative code that tested for these offsets at runtime, jumped around
and loaded 0 if appropriate, you were not better of than with the current code
-- in the contrary.

Note that gcc deflates you function calls to just one instruction in main.

So maybe what you really want here is to make these functions static inline so
that you do no more see the code in the functions.  Presumably you always call
this functions with compile time constants.

If you really need to quench out the last tick and call with non-constants, you
could invent inline assembler with an instruction sequence that is similar to
(1 << (R24 & 7)), perhaps together with __builtin_constant_p magic:

   LDI  R24, 1
   SBRC R24, 1
   LDI  R24, 4
   SBRC R24, 0
   LSL  R24
   SBRC R24, 2
   SWAP R24

Note that with -mint8, int is 8 bits.  That is still present as option, but no
more supported and perhaps non-functional.

--

Just for reference; here is a source without external #include:

#define PORTC (*((unsigned char volatile*) 53))

void setpin(unsigned char pin, unsigned char val)
{
    if (val == 1) PORTC |= (1<<pin);
    else PORTC &= ~(1<<pin);
}

void setpin1(unsigned char pin, unsigned char val)
{
    const unsigned char one = 1;
    if (val == 1) PORTC |= (one<<(pin));
    else PORTC &= ~(one<<pin);
}

int main(void)
{
    setpin(3, 1);
    setpin1(3, 1);
    return 0;
}


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-08-10  9:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-10-23 11:39 [Bug target/29560] New: Poor optimization for character shifts on Atmel AVR gcc-bugzilla at gcc dot gnu dot org
2006-10-24  7:32 ` [Bug target/29560] " r dot leitgeb at x-pin dot com
2009-08-24 17:18 ` [Bug target/29560] [avr] Poor optimization for byte shifts eric dot weddington at atmel dot com
2009-12-31  3:16 ` hutchinsonandy at gcc dot gnu dot org
2010-02-18 13:28 ` Rudolf dot Leitgeb at gmx dot at
     [not found] <bug-29560-4@http.gcc.gnu.org/bugzilla/>
2011-06-21 18:25 ` gjl at gcc dot gnu.org
2011-07-23 14:51 ` gjl at gcc dot gnu.org
2011-07-23 14:55 ` gjl at gcc dot gnu.org
2011-07-25 12:49 ` gjl at gcc dot gnu.org
2011-08-10  8:59 ` gjl at gcc dot gnu.org
2011-08-10  9:20 ` gjl at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).