public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Qing Zhao <qing.zhao@oracle.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] fold, simplify-rtx: Punt on non-representable floating point constants [PR104522]
Date: Wed, 13 Apr 2022 10:41:30 +0200	[thread overview]
Message-ID: <CAFiYyc185w6AZTOZ_Tn93qHqzhn+-xVP1-gVZ8xgZ2FT8vKaxg@mail.gmail.com> (raw)
In-Reply-To: <6685073D-CCC0-404E-82D1-0A3009AD3FC0@oracle.com>

On Tue, Feb 15, 2022 at 5:31 PM Qing Zhao via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> > On Feb 15, 2022, at 3:58 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > Hi!
> >
> > For IBM double double I've added in PR95450 and PR99648 verification that
> > when we at the tree/GIMPLE or RTL level interpret target bytes as a REAL_CST
> > or CONST_DOUBLE constant, we try to encode it back to target bytes and
> > verify it is the same.
> > This is because our real.c support isn't able to represent all valid values
> > of IBM double double which has variable precision.
> > In PR104522, it has been noted that we have similar problem with the
> > Intel/Motorola extended XFmode formats, our internal representation isn't
> > able to record pseudo denormals, pseudo infinities, pseudo NaNs and unnormal
> > values.
> > So, the following patch is an attempt to extend that verification to all
> > floats.
> > Unfortunately, it wasn't that straightforward, because the
> > __builtin_clear_padding code exactly for the XFmode long doubles needs to
> > discover what bits are padding and does that by interpreting memory of
> > all 1s.  That is actually a valid supported value, a qNaN with negative
> > sign with all mantissa bits set, but the verification includes also the
> > padding bits (exactly what __builtin_clear_padding wants to figure out)
> > and so fails the comparison check and so we ICE.
> > The patch fixes that case by moving that verification from
> > native_interpret_real to its caller, so that clear_padding_type can
> > call native_interpret_real and avoid that extra check.
> >
> > With this, the only thing that regresses in the testsuite is
> > +FAIL: gcc.target/i386/auto-init-4.c scan-assembler-times long\\t-16843010 5
> > because it decides to use a pattern that has non-zero bits in the padding
> > bits of the long double, so the simplify-rtx.cc change prevents folding
> > a SUBREG into a constant.  We emit (the testcase is -O0 but we emit worse
> > code at all opt levels) something like:
> >        movabsq $-72340172838076674, %rax
> >        movabsq $-72340172838076674, %rdx
> >        movq    %rax, -48(%rbp)
> >        movq    %rdx, -40(%rbp)
> >        fldt    -48(%rbp)
> >        fstpt   -32(%rbp)
> > instead of
> >        fldt    .LC2(%rip)
> >        fstpt   -32(%rbp)
> > ...
> > .LC2:
> >        .long   -16843010
> >        .long   -16843010
> >        .long   65278
> >        .long   0
> > Note, neither of those sequences actually stores the padding bits, fstpt
> > simply doesn't touch them.
> > For vars with clear_padding_real_needs_padding_p types that are allocated
> > to memory at expansion time, I'd say much better would be to do the stores
> > using integral modes rather than XFmode, so do that:
> >        movabsq $-72340172838076674, %rax
> >       movq    %rax, -32(%rbp)
> >       movq    %rax, -24(%rbp)
> > directly.  That is the only way to ensure the padding bits are initialized
> > (or expand __builtin_clear_padding, but then you initialize separately the
> > value bits and padding bits).
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, though as mentioned
> > above, the gcc.target/i386/auto-init-4.c case is unresolved.
>
> Thanks, I will try to fix this testing case in a later patch.

I've looked at this FAIL now and really wonder whether "pattern init" as
implemented makes any sense for non-integral types.  We end up with
initializing a register (SSA name) with

  VIEW_CONVERT_EXPR<long double>(0xfefefefefefefefefefefefefefefefe)

as we go building a TImode constant (we verified we have a TImode SET!)
but then

          /* Pun the LHS to make sure its type has constant size
             unless it is an SSA name where that's already known.  */
          if (TREE_CODE (lhs) != SSA_NAME)
            lhs = build1 (VIEW_CONVERT_EXPR, itype, lhs);
          else
            init = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), init);
...
      expand_assignment (lhs, init, false);

and generally registers do not have any padding.  This weird expansion
then causes us to spill the TImode constant and reload the XFmode value,
which is definitely not optimal here.

One approach to avoid the worse code generation would be to use mode
specific patterns for registers (like using a NaN or a target specific
value that
can be loaded cheaply), another would be to simply fall back to zero
initialization
when we fail to constant fold the initializer like with

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 8b1733e20c4..a4b995f71e4 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3125,7 +3125,11 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
          if (TREE_CODE (lhs) != SSA_NAME)
            lhs = build1 (VIEW_CONVERT_EXPR, itype, lhs);
          else
-           init = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), init);
+           {
+             init = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), init);
+             if (!CONSTANT_CLASS_P (init))
+               init = build_zero_cst (TREE_TYPE (lhs));
+           }
        }
       else
        /* Use zero-init also for variable-length sizes.  */

note that still does not address the issue noted by Jakub that we do not
initialize padding this way - but of course that's because we expand a
special assignment from .DEFERRED_INIT and are not initializing
the storage allocated for the variable on the stack (at -O0) by RTL
expansion.  Ideally .DEFERRED_INIT "expansion" for stack variables
would take place there, simply filling the allocated frame with the
pattern or zero.  That would probably mean that RTL expansion
should scoop up .DEFERRED_INITs for variables it decides not to
expand to pseudos and not leave that to stmt expansion.

I'm going to simply XFAIL this testcase for GCC 12 and opened
PR105259 so we can revisit the above for GCC 13.

Richard.

> Qing
> >
>

  reply	other threads:[~2022-04-13  8:41 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-15  9:58 Jakub Jelinek
2022-02-15 10:55 ` Richard Biener
2022-02-15 16:30 ` Qing Zhao
2022-04-13  8:41   ` Richard Biener [this message]
2022-04-13 15:22     ` Qing Zhao
2022-04-14  6:53       ` Richard Biener
2022-04-19 21:36         ` Qing Zhao
2022-04-20 10:38           ` Richard Biener
2022-04-20 16:02             ` Qing Zhao
2022-04-21  7:09               ` Richard Biener
2022-04-22 15:26                 ` Qing Zhao
2022-04-25  6:27                   ` Richard Biener
2022-04-22 15:00             ` Qing Zhao
2022-04-25  6:24               ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc185w6AZTOZ_Tn93qHqzhn+-xVP1-gVZ8xgZ2FT8vKaxg@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=qing.zhao@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).