From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id CF3273858D3C for ; Tue, 19 Oct 2021 11:25:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CF3273858D3C Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id BB8CA219DA; Tue, 19 Oct 2021 11:25:43 +0000 (UTC) Received: from murzim.suse.de (murzim.suse.de [10.160.4.192]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id B5994A3B84; Tue, 19 Oct 2021 11:25:43 +0000 (UTC) Date: Tue, 19 Oct 2021 13:25:43 +0200 (CEST) From: Richard Biener To: Prathamesh Kulkarni cc: Richard Biener , gcc Patches Subject: Re: [match.pd] PR83750 - CSE erf/erfc pair In-Reply-To: Message-ID: <7069701q-n4q5-sr4-5824-53r8133q19s4@fhfr.qr> References: <7nno61so-8qor-n16r-444p-527o2qp169r5@fhfr.qr> <92r13pn-97s9-r18r-6p8-96o3rqqn9or1@fhfr.qr> <41ssp55p-1sp2-r9qs-n4po-poq797or90@fhfr.qr> MIME-Version: 1.0 X-Spam-Status: No, score=-5.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Oct 2021 11:25:47 -0000 On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote: > On Tue, 19 Oct 2021 at 13:02, Richard Biener wrote: > > > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches > > wrote: > > > > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener wrote: > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote: > > > > > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener wrote: > > > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote: > > > > > > > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener wrote: > > > > > > > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote: > > > > > > > > > > > > > > > > > Hi Richard, > > > > > > > > > As suggested in PR, I have attached WIP patch that adds two patterns > > > > > > > > > to match.pd: > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and, > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p(). > > > > > > > > > > > > > > > > > > This works to remove call to erfc for the following test: > > > > > > > > > double f(double x) > > > > > > > > > { > > > > > > > > > double g(double, double); > > > > > > > > > > > > > > > > > > double t1 = __builtin_erf (x); > > > > > > > > > double t2 = __builtin_erfc (x); > > > > > > > > > return g(t1, t2); > > > > > > > > > } > > > > > > > > > > > > > > > > > > with .optimized dump shows: > > > > > > > > > t1_2 = __builtin_erf (x_1(D)); > > > > > > > > > t2_3 = 1.0e+0 - t1_2; > > > > > > > > > > > > > > > > > > However, for the following test: > > > > > > > > > double f(double x) > > > > > > > > > { > > > > > > > > > double g(double, double); > > > > > > > > > > > > > > > > > > double t1 = __builtin_erfc (x); > > > > > > > > > return t1; > > > > > > > > > } > > > > > > > > > > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not transform 1 - > > > > > > > > > erf(x) to erfc(x) again > > > > > > > > > post canonicalization. > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) gets applied, > > > > > > > > > but then it tries to > > > > > > > > > resimplify erfc(x), which fails post canonicalization. So we end up > > > > > > > > > with erfc(x) transformed to > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't ideal. > > > > > > > > > Could you suggest how to proceed ? > > > > > > > > > > > > > > > > I applied your patch manually and it does the intended > > > > > > > > simplifications so I wonder what I am missing? > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even when there's > > > > > > > no erf(x) in the source ? > > > > > > > > > > > > I do think it's reasonable to expect erfc to be available when erf > > > > > > is and vice versa but note both are C99 specified functions (either > > > > > > requires -lm). > > > > > OK, thanks. Would it be OK to commit the patch after bootstrap+test ? > > > > > > > > Yes, but I'm confused because you say the patch doesn't work for you? > > > The patch works for me to CSE erf/erfc pair. > > > However when there's only erfc in the source, it canonicalizes erfc(x) > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to > > > erfc(x) > > > with -O3 -funsafe-math-optimizations. > > > > > > For, > > > t1 = __builtin_erfc(x), > > > > > > .optimized dump shows: > > > _2 = __builtin_erf (x_1(D)); > > > t1_3 = 1.0e+0 - _2; > > > > > > and for, > > > double t1 = x + __builtin_erfc(x); > > > > > > .optimized dump shows: > > > _3 = __builtin_erf (x_2(D)); > > > _7 = x_2(D) + 1.0e+0; > > > t1_4 = _7 - _3; > > > > > > I assume in both cases, we want erfc in the code-gen instead ? > > > I think the reason uncaonicalization fails is because the pattern 1 - > > > erf(x) to erfc(x) > > > gets applied, but then it fails in resimplifying erfc(x), and we end > > > up with 1 - erf(x) in code-gen. > > > > > > From gimple-match.c, it hits the simplification: > > > > > > gimple_seq *lseq = seq; > > > if (__builtin_expect (!dbg_cnt > > > (match), 0)) goto next_after_fail1172; > > > if (__builtin_expect (dump_file && > > > (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying pattern > > > %s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__); > > > { > > > res_op->set_op (CFN_BUILT_IN_ERFC, type, 1); > > > res_op->ops[0] = captures[0]; > > > res_op->resimplify (lseq, valueize); > > > return true; > > > } > > > > > > But res_op->resimplify returns false, and doesn't end up adding to lseq. > > > > There's nothing to add to lseq since there's also nothing to resimplify. > > The only thing that could happen is that the replacement is not done > > because replace_stmt_with_simplification via maybe_push_res_to_seq > > doesn't pass the builtin_decl_implicit test: > > > > /* Find the function we want to call. */ > > tree decl = builtin_decl_implicit (as_builtin_fn (fn)); > > if (!decl) > > return NULL; > > > > btw, it did work for me since the call was present before and gimplification > > should then mark the function eligible for implicit generation. > > > > > As you suggest, should we instead handle this in fre to transform > > > erfc(x) to 1 - erf(x), only when > > > there's a matching erf(x) in the source ? > > > > Note that's strictly less powerful and we'd have to handle erf(x) -> 1 +erfc (x) > > to handle CSE in > > > > tem = erfc (x); > > tem2 = erf (x); > > > > So no, I think the canonicalization is fine unless there's a compelling reason > > for having both erfc and erf. > > > > Can you debug why the reverse transform doesn't work for you? > It seems the issue was that erfc wasn't getting marked with const > attribute, and failed the following test in > maybe_push_res_to_seq: > /* We can't and should not emit calls to non-const functions. */ > if (!(flags_from_decl_or_type (decl) & ECF_CONST)) > return NULL; > > Passing -fno-math-errno seems to work for the reverse transform: > > double f(double x) > { > double g(double, double); > > double t1 = __builtin_erfc (x); > return t1; > } > > Compiling with -O3 -funsafe-math-optimizations -fno-math-errno: > > vrp2 dump shows: > Folding statement: _2 = __builtin_erf (x_1(D)); > Not folded > Folding statement: t1_3 = 1.0e+0 - _2; > Applying pattern match.pd:6162, gimple-match.c:68450 > gimple_simplified to t1_3 = __builtin_erfc (x_1(D)); > Folded into: t1_3 = __builtin_erfc (x_1(D)); > > and .optimized dump shows: > double f (double x) > { > double t1; > > [local count: 1073741824]: > t1_3 = __builtin_erfc (x_1(D)); [tail call] > return t1_3; > } > > Unfortunately, for the test-case involving erf/erfc pair, the reverse > transform seems to undo the CSE: > > double f(double x) > { > double g(double, double); > > double t1 = __builtin_erf (x); > double t2 = __builtin_erfc (x); > return g(t1, t2); > } > > gimplification turns erfc to 1 - erf: > Applying pattern match.pd:6158, gimple-match.c:44479 > gimple_simplified to D.1988 = __builtin_erf (x); > t2 = 1.0e+0 - D.1988; > > t1 = __builtin_erf (x); > D.1988 = __builtin_erf (x); > t2 = 1.0e+0 - D.1988; > D.1987 = g (t1, t2); > > fre1 does the CSE: > t1_2 = __builtin_erf (x_1(D)); > t2_4 = 1.0e+0 - t1_2; > _7 = g (t1_2, t2_4); > > and forwprop4 again converts 1 - erf(x) to erfc(x), "undoing" the CSE: > Applying pattern match.pd:6162, gimple-match.c:68450 > gimple_simplified to t2_3 = __builtin_erfc (x_1(D)); > > t1_2 = __builtin_erf (x_1(D)); > t2_3 = __builtin_erfc (x_1(D)); > _6 = g (t1_2, t2_3); > > and .optimized dump shows: > t1_2 = __builtin_erf (x_1(D)); > t2_3 = __builtin_erfc (x_1(D)); > _6 = g (t1_2, t2_3); [tail call] You probably want an explicit && single_use () check on the 1 - erf() -> erfc transform. > Thanks, > Prathamesh > > > > Richard. > > > > > Thanks, > > > Prathamesh > > > > > > > > Btw, please add the testcase from the PR and also a testcase that shows > > > > the canonicalization is undone. Maybe you can also double-check that > > > > we handle x + erfc (x) because I see we associate that as > > > > (x + 1) - erf (x) which is then not recognized back to erfc. > > > > > > > > The less surprising (as to preserve the function called in the source) > > > > variant for the PR would be to teach CSE to lookup erf(x) when > > > > visiting erfc(x) and when found synthesize 1 - erf(x). > > > > > > > > That said, a mathematician should chime in on how important it is > > > > to preserve erfc vs. erf (precision or even speed). > > > > > > > > Thanks, > > > > Richard. > > > > > > > > > Thanks, > > > > > Prathamesh > > > > > > > > > > > > > > > > > Richard. > > > > > > > > > > > > > So for the following test: > > > > > > > double f(double x) > > > > > > > { > > > > > > > t1 = __builtin_erfc(x) > > > > > > > return t1; > > > > > > > } > > > > > > > > > > > > > > .optimized dump shows: > > > > > > > double f (double x) > > > > > > > { > > > > > > > double t1; > > > > > > > double _2; > > > > > > > > > > > > > > [local count: 1073741824]: > > > > > > > _2 = __builtin_erf (x_1(D)); > > > > > > > t1_3 = 1.0e+0 - _2; > > > > > > > return t1_3; > > > > > > > } > > > > > > > > > > > > > > while before patch, it has: > > > > > > > t1_4 = __builtin_erfc (x_2(D)); [tail call] > > > > > > > return t1_4; > > > > > > > > > > > > > > Thanks, > > > > > > > Prathamesh > > > > > > > > > > > > > > > > > > > > > > > Richard. > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > Prathamesh > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Richard Biener > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg) > > > > > > > > > > > > > > > > > > > -- > > > > > > Richard Biener > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg) > > > > > > > > > > > > > -- > > > > Richard Biener > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg) > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)