public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations
@ 2024-01-04  9:07 denis.campredon at gmail dot com
  2024-01-04  9:13 ` [Bug target/113231] " pinskia at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: denis.campredon at gmail dot com @ 2024-01-04  9:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

            Bug ID: 113231
           Summary: x86_64 use MMX instructions for simple shift
                    operations
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: denis.campredon at gmail dot com
  Target Milestone: ---

Compiled with -Os, the following functions use MMX instructions for simple
shifts.

------------------
void foo(int *i)
{
    *i *= 2;
}

void bar(int *i)
{
    *i <<= 2;
}

void baz(int *i)
{
    *i >>= 2;
}
------------------

foo(int*):
        movd    xmm0, DWORD PTR [rdi]
        pslld   xmm0, 1
        movd    DWORD PTR [rdi], xmm0
        ret
bar(int*):
        movd    xmm0, DWORD PTR [rdi]
        pslld   xmm0, 2
        movd    DWORD PTR [rdi], xmm0
        ret
baz(int*):
        movd    xmm0, DWORD PTR [rdi]
        psrad   xmm0, 2
        movd    DWORD PTR [rdi], xmm0
        ret

-----------------

I would expect the generated code to use "sar" or "sal" instructions, as
-O2 does.
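
For scale, hand-assembling the expected scalar forms (the byte counts below
are my own rough encodings, not taken from any compiler output) shows why
this matters at -Os:

```
foo(int*):
        sal     DWORD PTR [rdi]         ; d1 27    = 2 bytes
        ret
bar(int*):
        sal     DWORD PTR [rdi], 2      ; c1 27 02 = 3 bytes
        ret
baz(int*):
        sar     DWORD PTR [rdi], 2      ; c1 3f 02 = 3 bytes
        ret
```

Each movd/pslld/movd sequence above is roughly 13 bytes (4 + 5 + 4), so the
SSE form costs about 10 extra bytes per function.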

These functions used to generate optimal code with GCC 9.5.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113231] x86_64 use MMX instructions for simple shift operations
  2024-01-04  9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
@ 2024-01-04  9:13 ` pinskia at gcc dot gnu.org
  2024-01-04  9:22 ` [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-04  9:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
xmm0 is an SSE register rather than an MMX one :).


* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
  2024-01-04  9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
  2024-01-04  9:13 ` [Bug target/113231] " pinskia at gcc dot gnu.org
@ 2024-01-04  9:22 ` pinskia at gcc dot gnu.org
  2024-01-04 12:45 ` ubizjak at gmail dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-04  9:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|x86_64 use MMX instructions |x86_64 uses SSE
                   |for simple shift operations |instructions for `*mem <<=
                   |                            |const` at -Os
   Last reconfirmed|                            |2024-01-04
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>  Registers conversion cost: 0

In this case we start off with:
```
(insn 6 3 0 2 (parallel [
            (set (mem:SI (reg/v/f:DI 100 [ iD.2766 ]) [1 *i_4(D)+0 S4 A32])
                (ashift:SI (mem:SI (reg/v/f:DI 100 [ iD.2766 ]) [1 *i_4(D)+0 S4 A32])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) "/app/example.cpp":3:8 911 {*ashlsi3_1}
     (expr_list:REG_DEAD (reg/v/f:DI 100 [ iD.2766 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
```

This has zero register usage, but STV does not take into account the load and
store needed when the operation is moved into an SSE register.


* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
  2024-01-04  9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
  2024-01-04  9:13 ` [Bug target/113231] " pinskia at gcc dot gnu.org
  2024-01-04  9:22 ` [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os pinskia at gcc dot gnu.org
@ 2024-01-04 12:45 ` ubizjak at gmail dot com
  2024-01-04 16:25 ` roger at nextmovesoftware dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2024-01-04 12:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
CC Roger.


* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
  2024-01-04  9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
                   ` (2 preceding siblings ...)
  2024-01-04 12:45 ` ubizjak at gmail dot com
@ 2024-01-04 16:25 ` roger at nextmovesoftware dot com
  2024-01-07 17:43 ` cvs-commit at gcc dot gnu.org
  2024-01-09  8:57 ` roger at nextmovesoftware dot com
  5 siblings, 0 replies; 7+ messages in thread
From: roger at nextmovesoftware dot com @ 2024-01-04 16:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |roger at nextmovesoftware dot com

--- Comment #4 from Roger Sayle <roger at nextmovesoftware dot com> ---
I'm testing a patch for more accurate conversion gains/costs in the
scalar-to-vector pass.  In the meantime, adding -mno-stv works around the
problem.


* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
  2024-01-04  9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
                   ` (3 preceding siblings ...)
  2024-01-04 16:25 ` roger at nextmovesoftware dot com
@ 2024-01-07 17:43 ` cvs-commit at gcc dot gnu.org
  2024-01-09  8:57 ` roger at nextmovesoftware dot com
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-01-07 17:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:0a8aba760f62e9d66cc5610ecc276c1f0befc651

commit r14-6985-g0a8aba760f62e9d66cc5610ecc276c1f0befc651
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Sun Jan 7 17:42:00 2024 +0000

    i386: PR target/113231: Improved costs in Scalar-To-Vector (STV) pass.

    This patch improves the cost/gain calculation used during the i386
    backend's SImode/DImode scalar-to-vector (STV) conversion pass.  The
    current code handles loads and stores, but doesn't consider that
    converting other scalar operations with a memory destination requires
    an explicit load before and an explicit store after the vector
    equivalent.

    To ease the review, the significant change looks like:

             /* For operations on memory operands, include the overhead
                of explicit load and store instructions.  */
             if (MEM_P (dst))
               igain += optimize_insn_for_size_p ()
                        ? -COSTS_N_BYTES (8)
                        : (m * (ix86_cost->int_load[2]
                                + ix86_cost->int_store[2])
                           - (ix86_cost->sse_load[sse_cost_idx]
                              + ix86_cost->sse_store[sse_cost_idx]));

    however the patch itself is complicated by a change in indentation
    which leads to a number of lines with only whitespace changes.
    For architectures where integer load/store costs are the same as
    vector load/store costs, there should be no change without -Os/-Oz.

    2024-01-07  Roger Sayle  <roger@nextmovesoftware.com>
                Uros Bizjak  <ubizjak@gmail.com>

    gcc/ChangeLog
            PR target/113231
            * config/i386/i386-features.cc (compute_convert_gain): Include
            the overhead of explicit load and store (movd) instructions when
            converting non-store scalar operations with memory destinations.
            Various indentation whitespace fixes.

    gcc/testsuite/ChangeLog
            PR target/113231
            * gcc.target/i386/pr113231.c: New test case.


* [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os
  2024-01-04  9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
                   ` (4 preceding siblings ...)
  2024-01-07 17:43 ` cvs-commit at gcc dot gnu.org
@ 2024-01-09  8:57 ` roger at nextmovesoftware dot com
  5 siblings, 0 replies; 7+ messages in thread
From: roger at nextmovesoftware dot com @ 2024-01-09  8:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |14.0
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Roger Sayle <roger at nextmovesoftware dot com> ---
This should now be fixed on mainline.



Thread overview: 7+ messages
2024-01-04  9:07 [Bug rtl-optimization/113231] New: x86_64 use MMX instructions for simple shift operations denis.campredon at gmail dot com
2024-01-04  9:13 ` [Bug target/113231] " pinskia at gcc dot gnu.org
2024-01-04  9:22 ` [Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os pinskia at gcc dot gnu.org
2024-01-04 12:45 ` ubizjak at gmail dot com
2024-01-04 16:25 ` roger at nextmovesoftware dot com
2024-01-07 17:43 ` cvs-commit at gcc dot gnu.org
2024-01-09  8:57 ` roger at nextmovesoftware dot com
