[Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct since r14-2526

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct since r14-2526
Date: Mon, 04 Mar 2024 00:51:49 +0000	[thread overview]
Message-ID: <bug-114187-4-NW8ficrk2v@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-114187-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114187

--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:d35b5b0e0a0727cfdaba5f859e44116c33648639

commit r14-9287-gd35b5b0e0a0727cfdaba5f859e44116c33648639
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Mon Mar 4 00:47:08 2024 +0000

    PR target/114187: Fix ?Fmode SUBREG simplification in simplify_subreg.

    This patch fixes PR target/114187 a typo/missed-optimization in
simplify-rtx
    that's exposed by (my) changes to x86_64's parameter passing.  The context
    is that construction of double word (TImode) values now uses the idiom:

    (ior:TI (ashift:TI (zero_extend:TI (reg:DI x)) (const_int 64 [0x40]))
            (zero_extend:TI (reg:DI y)))

    Extracting the DImode highpart and lowpart halves of this complex
expression
    is supported by simplications in simplify_subreg.  The problem is when the
    doubleword TImode value represents two DFmode fields, there isn't a direct
    simplification to extract the highpart or lowpart SUBREGs, but instead GCC
    uses two steps, extract the DImode {high,low} part and then cast the result
    back to a floating point mode, DFmode.

    The (buggy) code to do this is:

      /* If the outer mode is not integral, try taking a subreg with the
equivalent
         integer outer mode and then bitcasting the result.
         Other simplifications rely on integer to integer subregs and we'd
         potentially miss out on optimizations otherwise.  */
      if (known_gt (GET_MODE_SIZE (innermode),
                    GET_MODE_SIZE (outermode))
          && SCALAR_INT_MODE_P (innermode)
          && !SCALAR_INT_MODE_P (outermode)
          && int_mode_for_size (GET_MODE_BITSIZE (outermode),
                                0).exists (&int_outermode))
        {
          rtx tem = simplify_subreg (int_outermode, op, innermode, byte);
          if (tem)
            return simplify_gen_subreg (outermode, tem, int_outermode, byte);
        }

    The issue/mistake is that the second call, to simplify_subreg, shouldn't
    use "byte" as the final argument; the offset has already been handled by
    the first call, to simplify_subreg, and this second call is just a type
    conversion from an integer mode to floating point (from DImode to DFmode).

    Interestingly, this mistake was already spotted by Richard Sandiford when
    the optimization was originally contributed in January 2023.
    https://gcc.gnu.org/pipermail/gcc-patches/2023-January/610920.html
    >> Richard Sandiford writes:
    >> Also, the final line should pass 0 rather than byte.

    Unfortunately a miscommunication/misunderstanding in a later thread
    https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612898.html
    resulted in this correction being undone.  Using lowpart_subreg should
    avoid/reduce confusion in future.

    2024-03-03  Roger Sayle  <roger@nextmovesoftware.com>

    gcc/ChangeLog
            PR target/114187
            * simplify-rtx.cc (simplify_context::simplify_subreg): Call
            lowpart_subreg to perform type conversion, to avoid confusion
            over the offset to use in the call to simplify_reg_subreg.

    gcc/testsuite/ChangeLog
            PR target/114187
            * g++.target/i386/pr114187.C: New test case.

next prev parent reply	other threads:[~2024-03-04  0:51 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-01  9:16 [Bug rtl-optimization/114187] New: [14 regression] bizarre register dance on x86_64 for pass-by-value struct matteo at mitalia dot net
2024-03-01  9:25 ` [Bug target/114187] " pinskia at gcc dot gnu.org
2024-03-01  9:27 ` pinskia at gcc dot gnu.org
2024-03-01 12:53 ` [Bug target/114187] [14 regression] bizarre register dance on x86_64 for pass-by-value struct since r14-2526 jakub at gcc dot gnu.org
2024-03-01 16:08 ` roger at nextmovesoftware dot com
2024-03-01 19:15 ` roger at nextmovesoftware dot com
2024-03-04  0:51 ` cvs-commit at gcc dot gnu.org [this message]
2024-03-04 13:19 ` rguenth at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-114187-4-NW8ficrk2v@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).