public inbox for gcc-bugs@sourceware.org
* [Bug regression/67288] New: [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
From: christophe.leroy@c-s.fr @ 2015-08-20 8:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
Bug ID: 67288
Summary: [4.9 regression] non optimal simple function (useless
additional shift/remove/shift/add)
Product: gcc
Version: 4.9.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: regression
Assignee: unassigned at gcc dot gnu.org
Reporter: christophe.leroy@c-s.fr
Target Milestone: ---
The following function (from the Linux kernel, compiled with -O2) compiled to
good assembly with GCC 4.8.3. With GCC 4.9.3 the output contains several
unnecessary instructions:
/* L1_CACHE_BYTES = 16 */
/* L1_CACHE_SHIFT = 4 */
#define mb() __asm__ __volatile__ ("sync" : : : "memory")

static inline void dcbf(void *addr)
{
        __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
}

void flush_dcache_range(unsigned long start, unsigned long stop)
{
        void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
        unsigned int size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
        unsigned int i;

        for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
                dcbf(addr);
        if (i)
                mb();
}
Result with GCC 4.9.3: (15 insns)
c000d970 <flush_dcache_range>:
c000d970: 54 63 00 36 rlwinm r3,r3,0,0,27
c000d974: 38 84 00 0f addi r4,r4,15
c000d978: 7c 83 20 50 subf r4,r3,r4
c000d97c: 54 89 e1 3f rlwinm. r9,r4,28,4,31
c000d980: 4d 82 00 20 beqlr
c000d984: 55 24 20 36 rlwinm r4,r9,4,0,27
c000d988: 39 24 ff f0 addi r9,r4,-16
c000d98c: 55 29 e1 3e rlwinm r9,r9,28,4,31
c000d990: 39 29 00 01 addi r9,r9,1
c000d994: 7d 29 03 a6 mtctr r9
c000d998: 7c 00 18 ac dcbf 0,r3
c000d99c: 38 63 00 10 addi r3,r3,16
c000d9a0: 42 00 ff f8 bdnz c000d998 <flush_dcache_range+0x28>
c000d9a4: 7c 00 04 ac sync
c000d9a8: 4e 80 00 20 blr
The following sequence is redundant (shift left 4 bits, subtract 16, shift
right 4 bits, add 1):
c000d984: 55 24 20 36 rlwinm r4,r9,4,0,27
c000d988: 39 24 ff f0 addi r9,r4,-16
c000d98c: 55 29 e1 3e rlwinm r9,r9,28,4,31
c000d990: 39 29 00 01 addi r9,r9,1
Result with GCC 4.8.3 was as expected: (11 insns)
c000d894 <flush_dcache_range>:
c000d894: 54 63 00 36 rlwinm r3,r3,0,0,27
c000d898: 38 84 00 0f addi r4,r4,15
c000d89c: 7d 23 20 50 subf r9,r3,r4
c000d8a0: 55 29 e1 3f rlwinm. r9,r9,28,4,31
c000d8a4: 4d 82 00 20 beqlr
c000d8a8: 7d 29 03 a6 mtctr r9
c000d8ac: 7c 00 18 ac dcbf 0,r3
c000d8b0: 38 63 00 10 addi r3,r3,16
c000d8b4: 42 00 ff f8 bdnz c000d8ac <flush_dcache_range+0x18>
c000d8b8: 7c 00 04 ac sync
c000d8bc: 4e 80 00 20 blr
* [Bug regression/67288] [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
From: rguenth at gcc dot gnu.org @ 2015-08-20 11:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |powerpc*
   Target Milestone|---                         |4.9.4
* [Bug regression/67288] [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
From: segher at gcc dot gnu.org @ 2015-08-22 12:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
Segher Boessenkool <segher at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org
--- Comment #1 from Segher Boessenkool <segher at gcc dot gnu.org> ---
It does the correct transform, which is needed here in general; it
doesn't notice it already knows 0 < r9 < 0x10000000, which would
simplify a lot of code away.
I think there are dups of this report already. I cannot confirm
this PR because I cannot compile this (it's incomplete code).
I blame this on bad interplay between ivopts and the RTL loop
optimisers, btw. It might well be a bug in the RTL doloop
transform, but ivopts should deal with doloop too.
* [Bug regression/67288] [4.9 regression] non optimal simple function (useless additional shift/remove/shift/add)
From: segher at gcc dot gnu.org @ 2015-08-24 19:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
Segher Boessenkool <segher at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-08-24
     Ever confirmed|0                           |1
* [Bug target/67288] [9/10/11/12 regression] non optimal simple function (useless additional shift/remove/shift/add)
From: jakub at gcc dot gnu.org @ 2021-05-14 9:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.5                         |9.4
--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 8 branch is being closed.
* [Bug target/67288] [9/10/11/12 regression] non optimal simple function (useless additional shift/remove/shift/add)
From: rguenth at gcc dot gnu.org @ 2021-06-01 8:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.4                         |9.5
--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
* [Bug target/67288] [9/10/11/12 regression] non optimal simple function (useless additional shift/remove/shift/add)
From: guojiufu at gcc dot gnu.org @ 2021-07-14 2:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
Jiu Fu Guo <guojiufu at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED
                 CC|                            |guojiufu at gcc dot gnu.org
--- Comment #22 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #4)
> It's not fixed. On trunk we get:
>
> ===
> flush_dcache_range:
> rlwinm 3,3,0,0,27
> addi 4,4,15
> subf 4,3,4
> srwi. 9,4,4
> beq 0,.L1
> slwi 9,9,4
> addi 9,9,-16
> srwi 9,9,4
> addi 9,9,1
> mtctr 9
> .p2align 4,,15
> .L3:
> dcbf 0, 3
> addi 3,3,16
> bdnz .L3
> sync
> .L1:
> blr
> ===
>
> (-m32, edited a bit).
>
> The slwi/addi/srwi/addi is unnecessary.
With the latest trunk (which contains
https://gcc.gnu.org/g:8a15faa730f99100f6f3ed12663563356ec5a2c0),
the asm code is:
        .cfi_startproc
        rldicr %r3,%r3,0,59
        addi %r9,%r4,15
        subf %r9,%r3,%r9
        srwi %r9,%r9,4
        cmpwi %cr0,%r9,0
        beqlr %cr0
        rldicl %r9,%r9,0,32
        mtctr %r9
        .p2align 4,,15
.L3:
        dcbf 0, %r3
        addi %r3,%r3,16
        bdnz .L3
        sync
        blr
* [Bug target/67288] [9/10/11/12 regression] non optimal simple function (useless additional shift/remove/shift/add)
From: segher at gcc dot gnu.org @ 2021-07-14 16:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67288
--- Comment #23 from Segher Boessenkool <segher at gcc dot gnu.org> ---
-m32 is required here. With -fno-unroll-loops -m32 you now get:
flush_dcache_range:
        rlwinm 3,3,0,0,27
        addi 4,4,15
        subf 4,3,4
        srwi. 4,4,4
        beqlr 0
        mtctr 4
        .p2align 4,,15
.L3:
        dcbf 0, 3
        addi 3,3,16
        bdnz .L3
        sync
        blr
So yup, it has been fixed. Thanks!