public inbox for gcc-bugs@sourceware.org
* [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
@ 2015-08-20  8:29 christophe.leroy@c-s.fr
  2015-08-20 11:47 ` [Bug regression/67288] " rguenth at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: christophe.leroy@c-s.fr @ 2015-08-20  8:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288

            Bug ID: 67288
           Summary: [4.9 regression] non optimal simple function (useless
                    additional shift/remove/shift/add)
           Product: gcc
           Version: 4.9.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: regression
          Assignee: unassigned at gcc dot gnu.org
          Reporter: christophe.leroy@c-s.fr
  Target Milestone: ---

The following function (from the Linux kernel, compiled with -O2) produced
good assembly with GCC 4.8.3. With GCC 4.9.3 the output contains several
unnecessary instructions:

/* L1_CACHE_BYTES = 16 */
/* L1_CACHE_SHIFT = 4 */

#define mb()   __asm__ __volatile__ ("sync" : : : "memory")

static inline void dcbf(void *addr)
{
        __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
}

void flush_dcache_range(unsigned long start, unsigned long stop)
{
        void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
        unsigned int size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
        unsigned int i;

        for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
                dcbf(addr);
        if (i)
                mb();
}
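
The snippet above does not compile on its own because the two cache macros
are only given as comments. A self-contained variant might look like the
sketch below. The macro values are taken from those comments; the
`flushed_lines` counter, the `char *` pointer (instead of GNU `void *`
arithmetic) and the non-PowerPC fallback are assumptions added purely so the
loop can be exercised on any host, and they are not part of the kernel code.

```c
#include <assert.h>

#define L1_CACHE_BYTES 16
#define L1_CACHE_SHIFT 4

/* Hypothetical counter (not in the report): number of cache lines touched. */
static unsigned int flushed_lines;

#if defined(__powerpc__) || defined(__PPC__)
#define mb()   __asm__ __volatile__ ("sync" : : : "memory")

static inline void dcbf(void *addr)
{
        __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
        flushed_lines++;
}
#else
/* Assumed fallback for other hosts: compiler barrier only, no cache op. */
#define mb()   __asm__ __volatile__ ("" : : : "memory")

static inline void dcbf(void *addr)
{
        (void)addr;
        flushed_lines++;
}
#endif

void flush_dcache_range(unsigned long start, unsigned long stop)
{
        /* Round start down to a cache-line boundary. */
        char *addr = (char *)(start & ~(L1_CACHE_BYTES - 1));
        /* Byte span rounded up, kept in 32 bits as in the original. */
        unsigned int size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
        unsigned int i;

        for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
                dcbf(addr);
        if (i)
                mb();
}
```

To inspect the code generation discussed here, this would be built for a
32-bit PowerPC target with -O2; on other hosts it only serves to check the
trip count of the loop.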

Result with GCC 4.9.3: (15 insns)

c000d970 <flush_dcache_range>:
c000d970:       54 63 00 36     rlwinm  r3,r3,0,0,27
c000d974:       38 84 00 0f     addi    r4,r4,15
c000d978:       7c 83 20 50     subf    r4,r3,r4
c000d97c:       54 89 e1 3f     rlwinm. r9,r4,28,4,31
c000d980:       4d 82 00 20     beqlr   
c000d984:       55 24 20 36     rlwinm  r4,r9,4,0,27
c000d988:       39 24 ff f0     addi    r9,r4,-16
c000d98c:       55 29 e1 3e     rlwinm  r9,r9,28,4,31
c000d990:       39 29 00 01     addi    r9,r9,1
c000d994:       7d 29 03 a6     mtctr   r9
c000d998:       7c 00 18 ac     dcbf    0,r3
c000d99c:       38 63 00 10     addi    r3,r3,16
c000d9a0:       42 00 ff f8     bdnz    c000d998 <flush_dcache_range+0x28>
c000d9a4:       7c 00 04 ac     sync    
c000d9a8:       4e 80 00 20     blr

The following four instructions are redundant (shift left by 4 bits, subtract
16, shift right by 4 bits, add 1):
c000d984:       55 24 20 36     rlwinm  r4,r9,4,0,27
c000d988:       39 24 ff f0     addi    r9,r4,-16
c000d98c:       55 29 e1 3e     rlwinm  r9,r9,28,4,31
c000d990:       39 29 00 01     addi    r9,r9,1
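
As a rough check of that claim, the four instructions can be modelled in C
on a 32-bit value. This is only a sketch: the function names are made up for
illustration, and the sample values are arbitrary. For every count that can
actually reach this point (nonzero, because of the earlier beqlr, and below
2^28, because it is a 32-bit value shifted right by 4), the sequence returns
its input unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Models the four extra instructions on a 32-bit register. */
uint32_t redundant_count(uint32_t r9)
{
        uint32_t r4 = r9 << 4;  /* rlwinm r4,r9,4,0,27   (shift left 4)  */
        r9 = r4 - 16;           /* addi   r9,r4,-16                      */
        r9 = r9 >> 4;           /* rlwinm r9,r9,28,4,31  (shift right 4) */
        return r9 + 1;          /* addi   r9,r9,1                        */
}

/* Returns 1 if the sequence is the identity on a few in-range samples. */
int identity_holds_on_samples(void)
{
        static const uint32_t samples[] = {
                1u, 2u, 15u, 16u, 0xFFFFu, 0x0FFFFFFEu, 0x0FFFFFFFu
        };
        unsigned int k;

        for (k = 0; k < sizeof samples / sizeof samples[0]; k++)
                if (redundant_count(samples[k]) != samples[k])
                        return 0;
        return 1;
}
```

In that range the left shift cannot overflow, so the value is 16 * (n - 1)
after the subtraction and n again after the final add.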



Result with GCC 4.8.3 was optimal: (11 insns)

c000d894 <flush_dcache_range>:
c000d894:       54 63 00 36     rlwinm  r3,r3,0,0,27
c000d898:       38 84 00 0f     addi    r4,r4,15
c000d89c:       7d 23 20 50     subf    r9,r3,r4
c000d8a0:       55 29 e1 3f     rlwinm. r9,r9,28,4,31
c000d8a4:       4d 82 00 20     beqlr   
c000d8a8:       7d 29 03 a6     mtctr   r9
c000d8ac:       7c 00 18 ac     dcbf    0,r3
c000d8b0:       38 63 00 10     addi    r3,r3,16
c000d8b4:       42 00 ff f8     bdnz    c000d8ac <flush_dcache_range+0x18>
c000d8b8:       7c 00 04 ac     sync    
c000d8bc:       4e 80 00 20     blr



* [Bug regression/67288] [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
  2015-08-20  8:29 [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add) christophe.leroy@c-s.fr
@ 2015-08-20 11:47 ` rguenth at gcc dot gnu.org
  2015-08-22 12:43 ` segher at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-08-20 11:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |powerpc*
   Target Milestone|---                         |4.9.4



* [Bug regression/67288] [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
  2015-08-20  8:29 [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add) christophe.leroy@c-s.fr
  2015-08-20 11:47 ` [Bug regression/67288] " rguenth at gcc dot gnu.org
@ 2015-08-22 12:43 ` segher at gcc dot gnu.org
  2015-08-24 19:49 ` segher at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: segher at gcc dot gnu.org @ 2015-08-22 12:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #1 from Segher Boessenkool <segher at gcc dot gnu.org> ---
It does the correct transform, which is needed here in general; it
doesn't notice it already knows 0 < r9 < 0x10000000, which would
simplify a lot of code away.

I think there are dups of this report already.  I cannot confirm
this PR because I cannot compile this (it's incomplete code).

By the way, I blame this on bad interplay between ivopts and the RTL loop
optimisers; although it might well be a bug in the RTL doloop
transform, ivopts should deal with doloop too.
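
The range condition mentioned above can be illustrated with a small C model
of the recomputation. This is a sketch under the assumption that the extra
instructions compute ((n << 4) - 16 >> 4) + 1 in 32-bit arithmetic, as the
disassembly suggests; the function names are hypothetical. Outside
0 < n < 0x10000000 the expression is not the identity, which is why it
cannot be dropped without the range information.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the shift/subtract/shift/add sequence, on 32 bits. */
uint32_t recomputed_count(uint32_t n)
{
        return (((n << 4) - 16u) >> 4) + 1u;
}

/* Hypothetical helper: 1 if the recomputation could be dropped for n. */
int is_identity_for(uint32_t n)
{
        return recomputed_count(n) == n;
}
```

For example, n = 0 wraps around to 0x10000000, and n = 0x10000001 loses its
top bit in the left shift and comes back as 1.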



* [Bug regression/67288] [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
  2015-08-20  8:29 [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add) christophe.leroy@c-s.fr
  2015-08-20 11:47 ` [Bug regression/67288] " rguenth at gcc dot gnu.org
  2015-08-22 12:43 ` segher at gcc dot gnu.org
@ 2015-08-24 19:49 ` segher at gcc dot gnu.org
  2021-05-14  9:47 ` [Bug target/67288] [9/10/11/12 " jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: segher at gcc dot gnu.org @ 2015-08-24 19:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-08-24
     Ever confirmed|0                           |1



* [Bug target/67288] [9/10/11/12 regression] non optimal simple function (useless additional shift/remove/shift/add)
  2015-08-20  8:29 [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add) christophe.leroy@c-s.fr
                   ` (2 preceding siblings ...)
  2015-08-24 19:49 ` segher at gcc dot gnu.org
@ 2021-05-14  9:47 ` jakub at gcc dot gnu.org
  2021-06-01  8:06 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-14  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.5                         |9.4

--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 8 branch is being closed.


* [Bug target/67288] [9/10/11/12 regression] non optimal simple function (useless additional shift/remove/shift/add)
  2015-08-20  8:29 [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add) christophe.leroy@c-s.fr
                   ` (3 preceding siblings ...)
  2021-05-14  9:47 ` [Bug target/67288] [9/10/11/12 " jakub at gcc dot gnu.org
@ 2021-06-01  8:06 ` rguenth at gcc dot gnu.org
  2021-07-14  2:36 ` guojiufu at gcc dot gnu.org
  2021-07-14 16:14 ` segher at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-06-01  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.4                         |9.5

--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9.4 is being released, retargeting bugs to GCC 9.5.


* [Bug target/67288] [9/10/11/12 regression] non optimal simple function (useless additional shift/remove/shift/add)
  2015-08-20  8:29 [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add) christophe.leroy@c-s.fr
                   ` (4 preceding siblings ...)
  2021-06-01  8:06 ` rguenth at gcc dot gnu.org
@ 2021-07-14  2:36 ` guojiufu at gcc dot gnu.org
  2021-07-14 16:14 ` segher at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: guojiufu at gcc dot gnu.org @ 2021-07-14  2:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288

Jiu Fu Guo <guojiufu at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED
                 CC|                            |guojiufu at gcc dot gnu.org

--- Comment #22 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
 (In reply to Segher Boessenkool from comment #4)
> It's not fixed.  On trunk we get:
> 
> ===
> flush_dcache_range:
>         rlwinm 3,3,0,0,27
>         addi 4,4,15
>         subf 4,3,4
>         srwi. 9,4,4
>         beq 0,.L1
>         slwi 9,9,4
>         addi 9,9,-16
>         srwi 9,9,4
>         addi 9,9,1
>         mtctr 9
>         .p2align 4,,15
> .L3:
>         dcbf 0, 3
>         addi 3,3,16
>         bdnz .L3
>         sync
> .L1:
>         blr
> ===
> 
> (-m32, edited a bit).
> 
> The slwi/addi/srwi/addi is unnecessary.

With the latest trunk (which contains
https://gcc.gnu.org/g:8a15faa730f99100f6f3ed12663563356ec5a2c0),
the asm code is:

        .cfi_startproc
        rldicr %r3,%r3,0,59
        addi %r9,%r4,15
        subf %r9,%r3,%r9
        srwi %r9,%r9,4
        cmpwi %cr0,%r9,0
        beqlr %cr0
        rldicl %r9,%r9,0,32
        mtctr %r9
        .p2align 4,,15
.L3:
        dcbf 0, %r3
        addi %r3,%r3,16
        bdnz .L3
        sync
        blr


* [Bug target/67288] [9/10/11/12 regression] non optimal simple function (useless additional shift/remove/shift/add)
  2015-08-20  8:29 [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add) christophe.leroy@c-s.fr
                   ` (5 preceding siblings ...)
  2021-07-14  2:36 ` guojiufu at gcc dot gnu.org
@ 2021-07-14 16:14 ` segher at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: segher at gcc dot gnu.org @ 2021-07-14 16:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288

--- Comment #23 from Segher Boessenkool <segher at gcc dot gnu.org> ---
-m32 is required here.  With -fno-unroll-loops -m32 you now get

flush_dcache_range:
        rlwinm 3,3,0,0,27
        addi 4,4,15
        subf 4,3,4
        srwi. 4,4,4
        beqlr 0
        mtctr 4
        .p2align 4,,15
.L3:
        dcbf 0, 3
        addi 3,3,16
        bdnz .L3
        sync
        blr

So yup, it has been fixed.  Thanks!

Thread overview: 8+ messages
2015-08-20  8:29 [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add) christophe.leroy@c-s.fr
2015-08-20 11:47 ` [Bug regression/67288] " rguenth at gcc dot gnu.org
2015-08-22 12:43 ` segher at gcc dot gnu.org
2015-08-24 19:49 ` segher at gcc dot gnu.org
2021-05-14  9:47 ` [Bug target/67288] [9/10/11/12 " jakub at gcc dot gnu.org
2021-06-01  8:06 ` rguenth at gcc dot gnu.org
2021-07-14  2:36 ` guojiufu at gcc dot gnu.org
2021-07-14 16:14 ` segher at gcc dot gnu.org
