public inbox for gcc-bugs@sourceware.org
* [Bug target/97887] New: Failure to optimize neg plus div to avoid using x87 floating point stack
@ 2020-11-18 9:35 gabravier at gmail dot com
2020-11-18 10:33 ` [Bug target/97887] [10/11 Regression] " rguenth at gcc dot gnu.org
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: gabravier at gmail dot com @ 2020-11-18 9:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
Bug ID: 97887
Summary: Failure to optimize neg plus div to avoid using x87
floating point stack
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
float f(float a)
{
    return -a / a;
}
On x86 -O3, LLVM outputs this:
.LCPI0_0:
        .long 0x80000000 # float -0
        .long 0x80000000 # float -0
        .long 0x80000000 # float -0
        .long 0x80000000 # float -0
f(float):
        movaps xmm1, xmmword ptr [rip + .LCPI0_0] # xmm1 = [-0.0E+0,-0.0E+0,-0.0E+0,-0.0E+0]
        xorps xmm1, xmm0
        divss xmm1, xmm0
        movaps xmm0, xmm1
        ret
GCC outputs this:
f(float):
        movss DWORD PTR [rsp-4], xmm0
        fld DWORD PTR [rsp-4]
        movaps xmm1, xmm0
        fchs
        fstp DWORD PTR [rsp-4]
        movss xmm0, DWORD PTR [rsp-4]
        divss xmm0, xmm1
        ret
I'm *pretty sure* that loading the value into the x87 stack (especially mixed
with SSE instructions) is much slower than using SSE instructions for this.
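
For readers unfamiliar with the xorps idiom in the LLVM output: negating an IEEE-754 single-precision value only requires flipping bit 31, which is why a constant of 0x80000000 is enough. The following is a minimal C sketch of that idea, not code from either compiler; the names negate_via_xor and f_sketch are made up for illustration, and it assumes float is a 32-bit IEEE-754 type.

```c
#include <stdint.h>
#include <string.h>

/* Negate a float by flipping its sign bit (bit 31), which is the
   effect of the xorps with 0x80000000 in the LLVM output.
   Assumes float is a 32-bit IEEE-754 type.  */
static float negate_via_xor(float a)
{
    uint32_t bits;
    memcpy(&bits, &a, sizeof bits);   /* reinterpret without aliasing issues */
    bits ^= UINT32_C(0x80000000);     /* flip only the sign bit */
    memcpy(&a, &bits, sizeof a);
    return a;
}

/* The function from the report, with the negation spelled out
   as an explicit sign-bit flip.  */
static float f_sketch(float a)
{
    return negate_via_xor(a) / a;
}
```

For any finite nonzero input, f_sketch returns -1.0f, the same as the original f; the point is only that the negation never needs to leave the SSE registers.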
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: rguenth at gcc dot gnu.org @ 2020-11-18 10:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vmakarov at gcc dot gnu.org
Target|x86_64 i?86 |x86_64-*-* i?86-*-*
Keywords| |needs-bisection, ra
Target Milestone|--- |10.3
Priority|P3 |P2
Summary|Failure to optimize neg |[10/11 Regression] Failure
|plus div to avoid using x87 |to optimize neg plus div to
|floating point stack |avoid using x87 floating
| |point stack
Known to fail| |10.2.0, 11.0
Known to work| |7.5.0, 9.3.1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Whoo:
********** Local #1: **********
Spilling non-eliminable hard regs: 7
New elimination table:
Can eliminate 16 to 7 (offset=8, prev_offset=0)
Can eliminate 16 to 6 (offset=8, prev_offset=0)
Can eliminate 19 to 7 (offset=0, prev_offset=0)
Can eliminate 19 to 6 (offset=0, prev_offset=0)
0 Non-prefered reload: reject+=600
0 Non input pseudo reload: reject++
1 Matching alt: reject+=2
1 Non-prefered reload: reject+=600
alt=0,overall=1227,losers=4,rld_nregs=2
Staticly defined alt reject+=600
0 Non-prefered reload: reject+=600
0 Non input pseudo reload: reject++
1 Matching alt: reject+=2
1 Non-prefered reload: reject+=600
alt=1,overall=1815,losers=2 -- refuse
Choosing alt 0 in insn 7: (0) =f (1) 0 {*negsf2_i387_1}
Creating newreg=89 from oldreg=86, assigning class FLOAT_REGS to r89
7: {r89:SF=-r89:SF;clobber flags:CC;}
REG_UNUSED flags:CC
Inserting insn reload before:
18: r89:SF=r88:SF
Inserting insn reload after:
19: r86:SF=r89:SF
WTF.
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: rguenth at gcc dot gnu.org @ 2020-11-18 10:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2020-11-18
Ever confirmed|0 |1
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: rguenth at gcc dot gnu.org @ 2020-11-18 10:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |uros at gcc dot gnu.org
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
combine first makes recog pick negsf2_i387_1:
Trying 6 -> 7:
6: r87:V4SF=[`*.LC0']
REG_EQUAL const_vector
7: {r86:SF=-r84:SF;use r87:V4SF;clobber flags:CC;}
REG_DEAD r87:V4SF
REG_UNUSED flags:CC
Successfully matched this instruction:
(parallel [
(set (reg:SF 86)
(neg:SF (reg/v:SF 84 [ a ])))
(use (mem/u/c:V4SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S16 A128]))
(clobber (reg:CC 17 flags))
])
allowing combination of insns 6 and 7
original costs 8 + 4 = 12
replacement cost 4
...
(insn 7 6 8 2 (parallel [
(set (reg:SF 86)
(neg:SF (reg:SF 88)))
(clobber (reg:CC 17 flags))
]) "t.i":3:12 595 {*negsf2_i387_1}
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(insn 8 7 13 2 (set (reg:SF 85)
(div:SF (reg:SF 86)
(reg:SF 88))) "t.i":3:15 965 {*fop_sf_1}
(expr_list:REG_DEAD (reg:SF 88)
(expr_list:REG_DEAD (reg:SF 86)
(nil))))
Of course, fop_sf_1 is already used for the division before combine runs, but
that is probably just a misnamed pattern.
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: ubizjak at gmail dot com @ 2020-11-18 12:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #2)
> combine first makes recog pick negsf2_i387_1:
This should have the following insn constraint:
"TARGET_80387 && !(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)"
to hide it from combine in cases where relevant SSE mode is available.
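
To make the effect of that condition concrete, here is a small C model of it. This is only a sketch: the struct, the enum and the function names are hypothetical stand-ins for GCC's target macros, and the model of SSE_FLOAT_MODE_P (SFmode needs SSE, DFmode needs SSE2) is an assumption about how that macro behaves, not a copy of GCC source.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's target state; not GCC code.  */
struct target_flags {
    bool has_80387;   /* models TARGET_80387 */
    bool sse_math;    /* models TARGET_SSE_MATH, e.g. -mfpmath=sse */
    bool sse;         /* models TARGET_SSE */
    bool sse2;        /* models TARGET_SSE2 */
};

enum fp_mode { MODE_SF, MODE_DF, MODE_XF };

/* Assumed model of SSE_FLOAT_MODE_P: SFmode is an SSE mode when SSE
   is enabled, DFmode when SSE2 is enabled; XFmode never is.  */
static bool sse_float_mode_p(const struct target_flags *t, enum fp_mode m)
{
    return (m == MODE_SF && t->sse) || (m == MODE_DF && t->sse2);
}

/* Models the proposed insn condition:
   TARGET_80387 && !(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)  */
static bool i387_absneg_enabled(const struct target_flags *t, enum fp_mode m)
{
    return t->has_80387 && !(sse_float_mode_p(t, m) && t->sse_math);
}
```

Under this model, a typical x86_64 configuration (all four flags set) leaves the x87 pattern enabled only for XFmode, so combine could no longer match it for the float case in the report.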
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: rguenth at gcc dot gnu.org @ 2020-11-18 13:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #3)
> (In reply to Richard Biener from comment #2)
> > combine first makes recog pick negsf2_i387_1:
>
> This should have the following insn constraint:
>
> "TARGET_80387 && !(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)"
>
> to hide it from combine in cases where relevant SSE mode is available.
Hmm, it is
;; Changing of sign for FP values is doable using integer unit too.
(define_insn "*<code><mode>2_i387_1"
[(set (match_operand:X87MODEF 0 "register_operand" "=f,!r")
(absneg:X87MODEF
(match_operand:X87MODEF 1 "register_operand" "0,0")))
(clobber (reg:CC FLAGS_REG))]
"TARGET_80387"
"#")
that is not guarded in this way?
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: ubizjak at gmail dot com @ 2020-11-18 13:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
> > This should have the following insn constraint:
> >
> > "TARGET_80387 && !(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)"
> >
> > to hide it from combine in cases where relevant SSE mode is available.
>
> Hmm, it is
>
> ;; Changing of sign for FP values is doable using integer unit too.
> (define_insn "*<code><mode>2_i387_1"
> [(set (match_operand:X87MODEF 0 "register_operand" "=f,!r")
> (absneg:X87MODEF
> (match_operand:X87MODEF 1 "register_operand" "0,0")))
> (clobber (reg:CC FLAGS_REG))]
> "TARGET_80387"
> "#")
>
> that is not guarded in this way?
Yes, this is the one.
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: rguenth at gcc dot gnu.org @ 2020-11-18 13:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So likely caused by g:f359611b363490b48a7ce0fd021f7e47d8816eb0
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: ubizjak at gmail dot com @ 2020-11-18 15:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
Uroš Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #7 from Uroš Bizjak <ubizjak at gmail dot com> ---
I'll fix this.
* [Bug target/97887] [10/11 Regression] Failure to optimize neg plus div to avoid using x87 floating point stack
From: rguenth at gcc dot gnu.org @ 2020-11-19 9:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97887
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
Keywords|needs-bisection, ra |
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed by
commit 50134189a434e638861f8bf27d5caab9622811c8
Author: Uros Bizjak <ubizjak@gmail.com>
Date: Thu Nov 19 09:23:46 2020 +0100
i386: Disable *<absneg:code><mode>2_i387_1 for TARGET_SSE_MATH modes
This pattern interferes with *<absneg:code><mode>2_1 when TARGET_SSE_MATH
modes are active. Combine pass is able to remove (use) RTXes and transforms
*<absneg:code><mode>2_1 to *<absneg:code><mode>2_i387_1 where SSE
alternatives are not available.
2020-11-19 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386.md (*<absneg:code><mode>2_i387_1):
Disable for TARGET_SSE_MATH modes.
gcc/testsuite/
* gcc.target/i386/pr97887.c: New test.
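
The contents of the new test are not shown in this thread. As a rough idea of what a regression test for this bug could look like, here is a hypothetical sketch built from the function in the original report; the option string and the scanned mnemonic are assumptions, not the committed file.

```c
/* Hypothetical sketch of a regression test for PR target/97887.
   The options and the scan pattern below are assumptions, not the
   contents of the committed gcc.target/i386/pr97887.c.  */
/* { dg-do compile } */
/* { dg-options "-O2 -msse2 -mfpmath=sse" } */

float
f (float a)
{
  return -a / a;
}

/* With SSE math, the negation should not go through the x87 stack.  */
/* { dg-final { scan-assembler-not "fchs" } } */
```

The idea is simply to compile the reported function with SSE math enabled and check that the x87 sign-change instruction from the bad GCC output no longer appears.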