public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations
@ 2024-01-04 9:07 denis.campredon at gmail dot com
2024-01-04 9:13 ` [Bug target/113231] " pinskia at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: denis.campredon at gmail dot com @ 2024-01-04 9:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231
Bug ID: 113231
Summary: x86_64 use MMX instructions for simple shift
operations
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: denis.campredon at gmail dot com
Target Milestone: ---
Compiled with -Os, the following functions use MMX instructions for simple
shifts.
------------------
void foo(int *i)
{
  *i *= 2;
}
void bar(int *i)
{
  *i <<= 2;
}
void baz(int *i)
{
  *i >>= 2;
}
------------------
foo(int*):
        movd    xmm0, DWORD PTR [rdi]
        pslld   xmm0, 1
        movd    DWORD PTR [rdi], xmm0
        ret
bar(int*):
        movd    xmm0, DWORD PTR [rdi]
        pslld   xmm0, 2
        movd    DWORD PTR [rdi], xmm0
        ret
baz(int*):
        movd    xmm0, DWORD PTR [rdi]
        psrad   xmm0, 2
        movd    DWORD PTR [rdi], xmm0
        ret
-----------------
I would expect the generated code to use "sar" or "sal" instructions, as -O2
does.
The functions used to generate optimal code with version 9.5.
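For reference, the testcase as a self-contained unit (illustrative only; the
expected results are noted in comments):

```c
/* The three functions from the report, unchanged.  At -O2 each compiles
   to a single read-modify-write instruction on the memory operand
   (e.g. "sal"/"sar" with a DWORD PTR destination), while the -Os output
   shown above round-trips through an SSE register.  */
void foo(int *i) { *i *= 2; }   /* doubles the value (shift left by 1) */
void bar(int *i) { *i <<= 2; }  /* shift left by 2 */
void baz(int *i) { *i >>= 2; }  /* arithmetic shift right by 2 */
```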
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/113231] x86_64 use MMX instructions for simple shift operations
2024-01-04 9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
@ 2024-01-04 9:13 ` pinskia at gcc dot gnu.org
2024-01-04 9:22 ` [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os pinskia at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-04 9:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
xmm0 is an SSE register rather than an MMX one :).
* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
2024-01-04 9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
2024-01-04 9:13 ` [Bug target/113231] " pinskia at gcc dot gnu.org
@ 2024-01-04 9:22 ` pinskia at gcc dot gnu.org
2024-01-04 12:45 ` ubizjak at gmail dot com
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-04 9:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|x86_64 use MMX instructions |x86_64 uses SSE
|for simple shift operations |instructions for `*mem <<=
| |const` at -Os
Last reconfirmed| |2024-01-04
Keywords| |missed-optimization
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> Registers conversion cost: 0
In this case we start off with:
```
(insn 6 3 0 2 (parallel [
            (set (mem:SI (reg/v/f:DI 100 [ iD.2766 ]) [1 *i_4(D)+0 S4 A32])
                (ashift:SI (mem:SI (reg/v/f:DI 100 [ iD.2766 ]) [1 *i_4(D)+0 S4 A32])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) "/app/example.cpp":3:8 911 {*ashlsi3_1}
     (expr_list:REG_DEAD (reg/v/f:DI 100 [ iD.2766 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
```
This has zero register usage, but STV does not take into account the load and
store needed once the operation is moved to an SSE register.
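The size regression is visible in rough encoding byte counts (the values below
are approximate and purely illustrative; they are not taken from GCC's cost
tables, and exact sizes depend on the addressing mode):

```c
/* Approximate x86-64 encoding sizes for the two forms of "*i <<= 2".
   Hypothetical stand-in values for illustration only.  */
static int scalar_shift_bytes(void)
{
    return 3;              /* shl DWORD PTR [rdi], 2   (C1 /4 ib)   */
}

static int sse_shift_bytes(void)
{
    return 4 + 5 + 4;      /* movd xmm0, [rdi]   (~4 bytes)
                              pslld xmm0, 2      (~5 bytes)
                              movd [rdi], xmm0   (~4 bytes)  */
}
```

Even with generous rounding, the SSE sequence is several times larger, which is
exactly what an -Os cost model should penalize.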
* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
2024-01-04 9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
2024-01-04 9:13 ` [Bug target/113231] " pinskia at gcc dot gnu.org
2024-01-04 9:22 ` [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os pinskia at gcc dot gnu.org
@ 2024-01-04 12:45 ` ubizjak at gmail dot com
2024-01-04 16:25 ` roger at nextmovesoftware dot com
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-01-04 12:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231
Uroš Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |roger at nextmovesoftware dot com
--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
CC Roger.
* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
2024-01-04 9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
` (2 preceding siblings ...)
2024-01-04 12:45 ` ubizjak at gmail dot com
@ 2024-01-04 16:25 ` roger at nextmovesoftware dot com
2024-01-07 17:43 ` cvs-commit at gcc dot gnu.org
2024-01-09 8:57 ` roger at nextmovesoftware dot com
5 siblings, 0 replies; 7+ messages in thread
From: roger at nextmovesoftware dot com @ 2024-01-04 16:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |roger at nextmovesoftware dot com
--- Comment #4 from Roger Sayle <roger at nextmovesoftware dot com> ---
I'm testing a patch for more accurate conversion gains/costs in the
scalar-to-vector pass. Adding -mno-stv works around the problem.
* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
2024-01-04 9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
` (3 preceding siblings ...)
2024-01-04 16:25 ` roger at nextmovesoftware dot com
@ 2024-01-07 17:43 ` cvs-commit at gcc dot gnu.org
2024-01-09 8:57 ` roger at nextmovesoftware dot com
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-01-07 17:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231
--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:
https://gcc.gnu.org/g:0a8aba760f62e9d66cc5610ecc276c1f0befc651
commit r14-6985-g0a8aba760f62e9d66cc5610ecc276c1f0befc651
Author: Roger Sayle <roger@nextmovesoftware.com>
Date: Sun Jan 7 17:42:00 2024 +0000
i386: PR target/113231: Improved costs in Scalar-To-Vector (STV) pass.
This patch improves the cost/gain calculation used during the i386 backend's
SImode/DImode scalar-to-vector (STV) conversion pass. The current code
handles loads and stores, but doesn't consider that converting other
scalar operations with a memory destination requires an explicit load
before and an explicit store after the vector equivalent.
To ease the review, the significant change looks like:
    /* For operations on memory operands, include the overhead
       of explicit load and store instructions.  */
    if (MEM_P (dst))
      igain += optimize_insn_for_size_p ()
               ? -COSTS_N_BYTES (8)
               : (m * (ix86_cost->int_load[2]
                       + ix86_cost->int_store[2])
                  - (ix86_cost->sse_load[sse_cost_idx]
                     + ix86_cost->sse_store[sse_cost_idx]));
however the patch itself is complicated by a change in indentation
which leads to a number of lines with only whitespace changes.
For architectures where integer load/store costs are the same as
vector load/store costs, there should be no change without -Os/-Oz.
2024-01-07 Roger Sayle <roger@nextmovesoftware.com>
Uros Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/113231
* config/i386/i386-features.cc (compute_convert_gain): Include
the overhead of explicit load and store (movd) instructions when
converting non-store scalar operations with memory destinations.
Various indentation whitespace fixes.
gcc/testsuite/ChangeLog
PR target/113231
* gcc.target/i386/pr113231.c: New test case.
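The shape of the gain adjustment above can be sketched in plain C (a
simplified stand-in, not the actual GCC source; `adjust_gain` and all cost
parameters here are hypothetical names and values):

```c
/* Sketch of the STV gain adjustment for memory destinations.
   At -Os (optimize_size) a flat byte penalty is applied; otherwise the
   integer load/store cost is weighed against the SSE load/store cost.
   All cost values are hypothetical stand-ins for GCC's cost tables.  */
int adjust_gain(int igain, int dst_is_mem, int optimize_size, int m,
                int int_load, int int_store, int sse_load, int sse_store)
{
    if (dst_is_mem)
        igain += optimize_size
                 ? -8  /* flat penalty, like COSTS_N_BYTES (8) */
                 : (m * (int_load + int_store)
                    - (sse_load + sse_store));
    return igain;
}
```

With equal integer and vector load/store costs the non-size branch contributes
a negative adjustment only when the SSE pair is more expensive, matching the
commit message's note that such targets see no change without -Os/-Oz.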
* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
2024-01-04 9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
` (4 preceding siblings ...)
2024-01-07 17:43 ` cvs-commit at gcc dot gnu.org
@ 2024-01-09 8:57 ` roger at nextmovesoftware dot com
5 siblings, 0 replies; 7+ messages in thread
From: roger at nextmovesoftware dot com @ 2024-01-09 8:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Target Milestone|--- |14.0
Status|ASSIGNED |RESOLVED
--- Comment #6 from Roger Sayle <roger at nextmovesoftware dot com> ---
This should now be fixed on mainline.