public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/100622] New: Conversion to smaller unsigned type in loop
@ 2021-05-16  8:37 tkoenig at gcc dot gnu.org
  2021-05-17  8:46 ` [Bug rtl-optimization/100622] " tkoenig at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2021-05-16  8:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100622

            Bug ID: 100622
           Summary: Conversion to smaller unsigned type in loop
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

Consider

unsigned int foo(unsigned int *a, int n)
{
  int i;
  unsigned int res = 0;
  for (i=0; i<n; i++)
    res += a[i];

  return res;
}

unsigned int foo2 (unsigned int *a, int n)
{
  int i;
  unsigned long res = 0;
  for (i=0; i<n; i++)
    res += a[i];

  return res;
}

Given modular 2^n arithmetic, these two functions are identical in effect.

On POWER with a reasonably recent trunk with -O1, this gets compiled to
(using -O1 in order to avoid loop unrolling for better visibility)

$ gcc -O1 -c add.c && objdump --disassemble add.o

add.o:     file format elf64-powerpcle


Disassembly of section .text:

0000000000000000 <foo>:
   0:   00 00 04 2c     cmpwi   r4,0
   4:   30 00 81 40     ble     34 <foo+0x34>
   8:   20 00 89 78     clrldi  r9,r4,32
   c:   fc ff 43 39     addi    r10,r3,-4
  10:   00 00 60 38     li      r3,0
  14:   a6 03 29 7d     mtctr   r9
  18:   04 00 0a 85     lwzu    r8,4(r10)
  1c:   14 1a 68 7c     add     r3,r8,r3
  20:   20 00 63 78     clrldi  r3,r3,32
  24:   ff ff 29 39     addi    r9,r9,-1
  28:   20 00 29 79     clrldi  r9,r9,32
  2c:   ec ff 00 42     bdnz    18 <foo+0x18>
  30:   20 00 80 4e     blr
  34:   00 00 60 38     li      r3,0
  38:   20 00 80 4e     blr
        ...

0000000000000048 <foo2>:
  48:   00 00 04 2c     cmpwi   r4,0
  4c:   30 00 81 40     ble     7c <foo2+0x34>
  50:   20 00 89 78     clrldi  r9,r4,32
  54:   fc ff 43 39     addi    r10,r3,-4
  58:   00 00 60 38     li      r3,0
  5c:   a6 03 29 7d     mtctr   r9
  60:   04 00 0a 85     lwzu    r8,4(r10)
  64:   14 42 63 7c     add     r3,r3,r8
  68:   ff ff 29 39     addi    r9,r9,-1
  6c:   20 00 29 79     clrldi  r9,r9,32
  70:   f0 ff 00 42     bdnz    60 <foo2+0x18>
  74:   20 00 63 78     clrldi  r3,r3,32
  78:   20 00 80 4e     blr
  7c:   00 00 60 38     li      r3,0
  80:   f4 ff ff 4b     b       74 <foo2+0x2c>

so there is an extra instruction to mask the result of the
addition in foo.  This should not be needed.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-09-17  6:49 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-16  8:37 [Bug rtl-optimization/100622] New: Conversion to smaller unsigned type in loop tkoenig at gcc dot gnu.org
2021-05-17  8:46 ` [Bug rtl-optimization/100622] " tkoenig at gcc dot gnu.org
2021-05-17  9:05 ` tkoenig at gcc dot gnu.org
2021-05-17 12:41 ` rguenth at gcc dot gnu.org
2021-05-17 13:39 ` tkoenig at gcc dot gnu.org
2021-05-17 22:02 ` segher at gcc dot gnu.org
2021-06-08  6:25 ` guojiufu at gcc dot gnu.org
2021-06-08 12:08 ` segher at gcc dot gnu.org
2021-09-17  6:49 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).