public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Richard Biener <rguenther@suse.de>,
	gcc Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [match.pd] PR83750 - CSE erf/erfc pair
Date: Tue, 19 Oct 2021 16:39:55 +0530	[thread overview]
Message-ID: <CAAgBjMkdrPvysUru8-xZvz_TMiHNDxrfS25AYAW6OJfQwUbyOg@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc0i=z=reQ5mQhupY6JfN2Do5=LHCe-iqoczdto62q_Cig@mail.gmail.com>

On Tue, 19 Oct 2021 at 13:02, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Mon, 18 Oct 2021 at 17:23, Richard Biener <rguenther@suse.de> wrote:
> > >
> > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > >
> > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener <rguenther@suse.de> wrote:
> > > > >
> > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener <rguenther@suse.de> wrote:
> > > > > > >
> > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > Hi Richard,
> > > > > > > > As suggested in PR, I have attached WIP patch that adds two patterns
> > > > > > > > to match.pd:
> > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > >
> > > > > > > > This works to remove call to erfc for the following test:
> > > > > > > > double f(double x)
> > > > > > > > {
> > > > > > > >   double g(double, double);
> > > > > > > >
> > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > >   return g(t1, t2);
> > > > > > > > }
> > > > > > > >
> > > > > > > > with .optimized dump shows:
> > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > >
> > > > > > > > However, for the following test:
> > > > > > > > double f(double x)
> > > > > > > > {
> > > > > > > >   double g(double, double);
> > > > > > > >
> > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > >   return t1;
> > > > > > > > }
> > > > > > > >
> > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 -
> > > > > > > > erf(x) to erfc(x) again
> > > > > > > > post canonicalization.
> > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied,
> > > > > > > > but then it tries to
> > > > > > > > resimplify erfc(x), which fails post canonicalization. So we end up
> > > > > > > > with erfc(x) transformed to
> > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal.
> > > > > > > > Could you suggest how to proceed ?
> > > > > > >
> > > > > > > I applied your patch manually and it does the intended
> > > > > > > simplifications so I wonder what I am missing?
> > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when there's
> > > > > > no erf(x) in the source ?
> > > > >
> > > > > I do think it's reasonable to expect erfc to be available when erf
> > > > > is and vice versa but note both are C99 specified functions (either
> > > > > requires -lm).
> > > > OK, thanks. Would it be OK to commit the patch after bootstrap+test ?
> > >
> > > Yes, but I'm confused because you say the patch doesn't work for you?
> > The patch works for me to CSE erf/erfc pair.
> > However when there's only erfc in the source, it canonicalizes erfc(x)
> > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > erfc(x)
> > with -O3 -funsafe-math-optimizations.
> >
> > For,
> > t1 = __builtin_erfc(x),
> >
> > .optimized dump shows:
> >   _2 = __builtin_erf (x_1(D));
> >   t1_3 = 1.0e+0 - _2;
> >
> > and for,
> > double t1 = x + __builtin_erfc(x);
> >
> > .optimized dump shows:
> >   _3 = __builtin_erf (x_2(D));
> >   _7 = x_2(D) + 1.0e+0;
> >   t1_4 = _7 - _3;
> >
> > I assume in both cases, we want erfc in the code-gen instead ?
> > I think the reason uncaonicalization fails is because the pattern 1 -
> > erf(x) to erfc(x)
> > gets applied, but then it fails in resimplifying erfc(x), and we end
> > up with 1 - erf(x) in code-gen.
> >
> > From gimple-match.c, it hits the simplification:
> >
> >                                 gimple_seq *lseq = seq;
> >                                 if (__builtin_expect (!dbg_cnt
> > (match), 0)) goto next_after_fail1172;
> >                                 if (__builtin_expect (dump_file &&
> > (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying pattern
> > %s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__);
> >                                 {
> >                                   res_op->set_op (CFN_BUILT_IN_ERFC, type, 1);
> >                                   res_op->ops[0] = captures[0];
> >                                   res_op->resimplify (lseq, valueize);
> >                                   return true;
> >                                 }
> >
> > But res_op->resimplify returns false, and doesn't end up adding to lseq.
>
> There's nothing to add to lseq since there's also nothing to resimplify.
> The only thing that could happen is that the replacement is not done
> because replace_stmt_with_simplification via maybe_push_res_to_seq
> doesn't pass the builtin_decl_implicit test:
>
>           /* Find the function we want to call.  */
>           tree decl = builtin_decl_implicit (as_builtin_fn (fn));
>           if (!decl)
>             return NULL;
>
> btw, it did work for me since the call was present before and gimplification
> should then mark the function eligible for implicit generation.
>
> > As you suggest, should we instead handle this in fre to transform
> > erfc(x) to 1 - erf(x), only when
> > there's a matching erf(x) in the source ?
>
> Note that's strictly less powerful and we'd have to handle erf(x) -> 1 +erfc (x)
> to handle CSE in
>
>   tem = erfc (x);
>   tem2 = erf (x);
>
> So no, I think the canonicalization is fine unless there's a compelling reason
> for having both erfc and erf.
>
> Can you debug why the reverse transform doesn't work for you?
It seems the issue was that erfc wasn't getting marked with const
attribute, and failed the following test in
maybe_push_res_to_seq:
          /* We can't and should not emit calls to non-const functions.  */
          if (!(flags_from_decl_or_type (decl) & ECF_CONST))
            return NULL;

Passing -fno-math-errno seems to work for the reverse transform:

double f(double x)
{
  double g(double, double);

  double t1 = __builtin_erfc (x);
  return t1;
}

Compiling with -O3 -funsafe-math-optimizations -fno-math-errno:

vrp2 dump shows:
Folding statement: _2 = __builtin_erf (x_1(D));
Not folded
Folding statement: t1_3 = 1.0e+0 - _2;
Applying pattern match.pd:6162, gimple-match.c:68450
gimple_simplified to t1_3 = __builtin_erfc (x_1(D));
Folded into: t1_3 = __builtin_erfc (x_1(D));

and .optimized dump shows:
double f (double x)
{
  double t1;

  <bb 2> [local count: 1073741824]:
  t1_3 = __builtin_erfc (x_1(D)); [tail call]
  return t1_3;
}

Unfortunately, for the test-case involving erf/erfc pair, the reverse
transform seems to undo the CSE:

double f(double x)
{
  double g(double, double);

  double t1 = __builtin_erf (x);
  double t2 = __builtin_erfc (x);
  return g(t1, t2);
}

gimplification turns erfc to 1 - erf:
Applying pattern match.pd:6158, gimple-match.c:44479
gimple_simplified to D.1988 = __builtin_erf (x);
t2 = 1.0e+0 - D.1988;

  t1 = __builtin_erf (x);
  D.1988 = __builtin_erf (x);
  t2 = 1.0e+0 - D.1988;
  D.1987 = g (t1, t2);

fre1 does the CSE:
  t1_2 = __builtin_erf (x_1(D));
  t2_4 = 1.0e+0 - t1_2;
  _7 = g (t1_2, t2_4);

and forwprop4 again converts 1 - erf(x) to erfc(x), "undoing" the CSE:
Applying pattern match.pd:6162, gimple-match.c:68450
gimple_simplified to t2_3 = __builtin_erfc (x_1(D));

  t1_2 = __builtin_erf (x_1(D));
  t2_3 = __builtin_erfc (x_1(D));
  _6 = g (t1_2, t2_3);

and .optimized dump shows:
  t1_2 = __builtin_erf (x_1(D));
  t2_3 = __builtin_erfc (x_1(D));
  _6 = g (t1_2, t2_3); [tail call]

Thanks,
Prathamesh
>
> Richard.
>
> > Thanks,
> > Prathamesh
> > >
> > > Btw, please add the testcase from the PR and also a testcase that shows
> > > the canonicalization is undone.  Maybe you can also double-check that
> > > we handle x + erfc (x) because I see we associate that as
> > > (x + 1) - erf (x) which is then not recognized back to erfc.
> > >
> > > The less surprising (as to preserve the function called in the source)
> > > variant for the PR would be to teach CSE to lookup erf(x) when
> > > visiting erfc(x) and when found synthesize 1 - erf(x).
> > >
> > > That said, a mathematician should chime in on how important it is
> > > to preserve erfc vs. erf (precision or even speed).
> > >
> > > Thanks,
> > > Richard.
> > >
> > > > Thanks,
> > > > Prathamesh
> > > >
> > > > >
> > > > > Richard.
> > > > >
> > > > > > So for the following test:
> > > > > > double f(double x)
> > > > > > {
> > > > > >   t1 = __builtin_erfc(x)
> > > > > >   return t1;
> > > > > > }
> > > > > >
> > > > > > .optimized dump shows:
> > > > > > double f (double x)
> > > > > > {
> > > > > >   double t1;
> > > > > >   double _2;
> > > > > >
> > > > > >   <bb 2> [local count: 1073741824]:
> > > > > >   _2 = __builtin_erf (x_1(D));
> > > > > >   t1_3 = 1.0e+0 - _2;
> > > > > >   return t1_3;
> > > > > > }
> > > > > >
> > > > > > while before patch, it has:
> > > > > >   t1_4 = __builtin_erfc (x_2(D)); [tail call]
> > > > > >   return t1_4;
> > > > > >
> > > > > > Thanks,
> > > > > > Prathamesh
> > > > > >
> > > > > > >
> > > > > > > Richard.
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Prathamesh
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguenther@suse.de>
> > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de>
> > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

  reply	other threads:[~2021-10-19 11:10 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-18 10:05 Prathamesh Kulkarni
2021-10-18 10:48 ` Richard Biener
2021-10-18 11:30   ` Prathamesh Kulkarni
2021-10-18 11:40     ` Richard Biener
2021-10-18 11:45       ` Prathamesh Kulkarni
2021-10-18 11:53         ` Richard Biener
2021-10-19  7:01           ` Prathamesh Kulkarni
2021-10-19  7:32             ` Richard Biener
2021-10-19 11:09               ` Prathamesh Kulkarni [this message]
2021-10-19 11:25                 ` Richard Biener
2021-10-20 11:59                   ` Prathamesh Kulkarni
2021-10-20 12:51                     ` Richard Biener
2021-10-22  7:47                       ` Prathamesh Kulkarni
2021-10-22  9:26                         ` Richard Biener
2021-10-22 10:51                           ` Prathamesh Kulkarni
2021-10-25  9:48                             ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAgBjMkdrPvysUru8-xZvz_TMiHNDxrfS25AYAW6OJfQwUbyOg@mail.gmail.com \
    --to=prathamesh.kulkarni@linaro.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rguenther@suse.de \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).