* [Bug rtl-optimization/67325] Optimize shift (aka subreg) of load to simple load
2015-08-23 6:10 [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load glisse at gcc dot gnu.org
@ 2015-08-23 6:14 ` pinskia at gcc dot gnu.org
2015-08-23 6:41 ` glisse at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-08-23 6:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |rtl-optimization
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
There was some code in combine/simplify-rtx that does this, but maybe it only
handles logical right shifts and not arithmetic right shifts.
* [Bug rtl-optimization/67325] Optimize shift (aka subreg) of load to simple load
From: glisse at gcc dot gnu.org @ 2015-08-23 6:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325
--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
I am seeing the same code (well, with shrq) if I make the types unsigned.
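The original report is not part of this excerpt, so here is a minimal sketch of the kind of function under discussion. It assumes a little-endian target with 64-bit long long and 32-bit int, and a compiler that implements >> on negative signed values as an arithmetic shift (as GCC does). All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Signed variant: load 64 bits, then arithmetic shift right by 32. */
int high_signed(const long long *l)
{
    return (int)(*l >> 32);
}

/* Unsigned variant: load 64 bits, then logical shift right by 32. */
unsigned high_unsigned(const unsigned long long *l)
{
    return (unsigned)(*l >> 32);
}

/* What the "simple load" computes on a little-endian target:
   the 4 bytes stored at offset 4 from the pointer. */
int high_by_load(const long long *l)
{
    int32_t hi;
    memcpy(&hi, (const char *)l + 4, sizeof hi);
    return hi;
}

/* Checks that shift-of-load and the narrow load agree for one value
   (only meaningful under the little-endian assumption above). */
void check_same(long long v)
{
    assert(high_signed(&v) == high_by_load(&v));
}
```

The requested optimization is for the first two functions to compile to a single 32-bit load from offset 4 rather than a 64-bit load followed by a shift.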
* [Bug rtl-optimization/67325] Optimize shift (aka subreg) of load to simple load
From: trippels at gcc dot gnu.org @ 2015-08-23 6:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325
Markus Trippelsdorf <trippels at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-08-23
                 CC|                            |trippels at gcc dot gnu.org
     Ever confirmed|0                           |1
--- Comment #3 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
Apparently never worked with gcc. Clang gets it right.
https://goo.gl/BB4KMZ
* [Bug target/67325] Optimize shift (aka subreg) of load to simple load
From: pinskia at gcc dot gnu.org @ 2015-08-23 7:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |target
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is a target issue:
Trying 6 -> 7:
Failed to match this instruction:
(parallel [
        (set (reg:DI 65)
            (sign_extend:DI (mem:SI (plus:DI (reg/v/f:DI 63 [ l ])
                        (const_int 4 [0x4])) [2 *l_1(D)+4 S4 A32])))
        (clobber (reg:CC 17 flags))
    ])
Successfully matched this instruction:
(set (reg:DI 65)
    (sign_extend:DI (mem:SI (plus:DI (reg/v/f:DI 63 [ l ])
                (const_int 4 [0x4])) [2 *l_1(D)+4 S4 A32])))
rejecting combination of insns 6 and 7
original costs 4 + 4 = 8
replacement cost 12
starting the processing of deferred insns
ending the processing of deferred insns
So GCC is able to do it but rejects it because the cost is worse for some
reason.
That is, it would be replacing:
(insn 6 3 7 2 (set (reg:DI 66 [ *l_1(D) ])
        (mem:DI (reg/v/f:DI 63 [ l ]) [2 *l_1(D)+0 S8 A64])) t1.c:2 62 {*movdi_internal_rex64}
     (expr_list:REG_DEAD (reg/v/f:DI 63 [ l ])
        (nil)))
(insn 7 6 13 2 (parallel [
            (set (reg:DI 65)
                (ashiftrt:DI (reg:DI 66 [ *l_1(D) ])
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) t1.c:2 530 {*ashrdi3_1}
     (expr_list:REG_DEAD (reg:DI 66 [ *l_1(D) ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
with the single sign_extend load matched above. Note that the clobber gets in
the way of combining with the next insn (the subreg).
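The rejection in the dump comes down to a cost comparison. A minimal sketch of that kind of check follows, assuming the rule is "accept only if the replacement costs no more than the sum of the insns it replaces". The helper name is hypothetical and this is not GCC's actual code; only the numbers 4, 4 and 12 come from the dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not GCC's implementation: a combination is
   considered profitable when the replacement insn costs no more than
   the total cost of the original insns. */
bool replacement_is_profitable(const int *old_costs, size_t n, int new_cost)
{
    int old_total = 0;
    for (size_t i = 0; i < n; i++)
        old_total += old_costs[i];
    return new_cost <= old_total;
}

/* Numbers from the combine dump: original costs 4 + 4 = 8,
   replacement cost 12, so the combination is rejected. */
void demo_from_dump(void)
{
    const int old_costs[] = { 4, 4 };
    assert(!replacement_is_profitable(old_costs, 2, 12));
}
```

Under this model the pattern matches but the combination is still dropped, which is why the problem lies in the target's cost numbers rather than in combine's ability to match the pattern.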
* [Bug target/67325] Optimize shift (aka subreg) of load to simple load
From: glisse at gcc dot gnu.org @ 2015-08-23 7:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325
--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #4)
> So GCC is able to do it but rejects it because the cost is worse for some
> reason.
Indeed, and -Os produces the expected
        movl    4(%rdi), %eax
(I did not benchmark)
* [Bug target/67325] Optimize shift (aka subreg) of load to simple load
From: cvs-commit at gcc dot gnu.org @ 2024-05-28 22:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:1d6199e5f8c1c08083eeb0279f71333234fe14ad
commit r15-882-g1d6199e5f8c1c08083eeb0279f71333234fe14ad
Author: liuhongt <hongtao.liu@intel.com>
Date: Mon Feb 19 13:57:24 2024 +0800
Reduce cost of MEM (A + imm).
For MEM, rtx_cost iterates over each subrtx and adds up the costs,
so MEM (reg) costs 5 and MEM (reg + 4) costs 9, which is not
accurate for x86. Ideally address_cost should be used, but it
reduces the cost too much. So the current solution is to make a
constant displacement as cheap as possible.
gcc/ChangeLog:
PR target/67325
* config/i386/i386.cc (ix86_rtx_costs): Reduce cost of MEM (A
+ imm) to "cost of MEM (A)" + 1.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr67325.c: New test.
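As a worked illustration of the arithmetic in the commit message, here is a toy model. The constants 5 and 9 and the "+ 1" rule come from the message above; the function and constant names are hypothetical and the code is not taken from ix86_rtx_costs.

```c
#include <assert.h>

/* Costs quoted in the commit message. */
enum { MEM_REG_COST = 5, OLD_MEM_REG_IMM_COST = 9 };

/* Old behaviour as described: summing the subrtx costs makes
   MEM (reg + imm) noticeably more expensive than MEM (reg). */
int old_mem_cost(int has_const_disp)
{
    return has_const_disp ? OLD_MEM_REG_IMM_COST : MEM_REG_COST;
}

/* New rule from the ChangeLog: MEM (A + imm) costs "cost of MEM (A)" + 1. */
int new_mem_cost(int has_const_disp)
{
    return MEM_REG_COST + (has_const_disp ? 1 : 0);
}

/* The constant displacement now adds 1 instead of 4. */
void demo_costs(void)
{
    assert(old_mem_cost(1) - old_mem_cost(0) == 4);
    assert(new_mem_cost(1) - new_mem_cost(0) == 1);
}
```

In this toy model the load from offset 4 goes from cost 9 to cost 6, which narrows the gap that made the combined instruction look more expensive than the load plus shift.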
* [Bug target/67325] Optimize shift (aka subreg) of load to simple load
From: liuhongt at gcc dot gnu.org @ 2024-05-28 23:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325
Hongtao Liu <liuhongt at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED
--- Comment #7 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
Fixed in GCC 15.