From: Richard Biener <richard.guenther@gmail.com>
To: Jackson Woodruff <jackson.woodruff@foss.arm.com>
Cc: Jeff Law <law@redhat.com>,
Wilco Dijkstra <Wilco.Dijkstra@arm.com>,
"kyrylo.tkachov@foss.arm.com" <kyrylo.tkachov@foss.arm.com>,
"Joseph S. Myers" <joseph@codesourcery.com>,
GCC Patches <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>
Subject: Re: [PATCH] Factor out division by squares and remove division around comparisons (1/2)
Date: Wed, 30 Aug 2017 13:26:00 -0000 [thread overview]
Message-ID: <CAFiYyc0Su3E_qu=pucVnPfH2O+9pz+=AJAtGdkgFvgiwLA=7mQ@mail.gmail.com> (raw)
In-Reply-To: <dbeb5c74-3035-71d6-e2d8-49e8b0c46b7f@foss.arm.com>
On Wed, Aug 30, 2017 at 11:46 AM, Jackson Woodruff
<jackson.woodruff@foss.arm.com> wrote:
> On 08/29/2017 01:13 PM, Richard Biener wrote:
>>
>> On Tue, Aug 29, 2017 at 1:35 PM, Jackson Woodruff
>> <jackson.woodruff@foss.arm.com> wrote:
>>>
>>> Hi all,
>>>
>>> Apologies again to those CC'ed, who (again) received this twice.
>>>
>>> Joseph: Yes you are correct. I misread the original thread, now fixed.
>>>
>>> Richard: I've moved the optimizations out of fold-const.c. One has been
>>> replicated in match.pd, and the other (x / C +- y / C -> (x +- y) / C)
>>> I've
>>> deleted as it only introduced a new optimization when running with the
>>> flags
>>> '-O0 -funsafe-math-optimizations'.
>>
>>
>> Hmm, how did you verify that, that it only adds sth with -O0
>> -funsafe-math-optimizations?
>
>
> By checking with various flags, although not exhaustively. I looked for
> reasons for the behavior in match.pd (explained below).
>
> I have also since discovered that the combinations of
> '-funsafe-math-optimizations -frounding-math' (at all levels) and
> '-fno-recriprocal-math -funsafe-math-optimizations' mean this pattern adds
> something.
>
>> Is that because in GIMPLE the reassoc pass should do this transform?
>
> It's because the pattern that changes (X / C) -> X * (1 / C) is gated with
> O1:
>
> (for cst (REAL_CST COMPLEX_CST VECTOR_CST)
> (simplify
> (rdiv @0 cst@1)
> -> (if (optimize)
> -> (if (flag_reciprocal_math
> && !real_zerop (@1))
> (with
> { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @1);
> }
> (if (tem)
> (mult @0 { tem; } )))
> (if (cst != COMPLEX_CST)
> (with { tree inverse = exact_inverse (type, @1); }
> (if (inverse)
> (mult @0 { inverse; } ))))))))
>
>
> I've flagged the two lines that are particularly relevant to this.
So this means we go x / (C * y) -> (x / C) / y -> (x * (1/C)) / y
why's that in any way preferable? I suppose this is again to enable
the recip pass to detect / y (as opposed to / (C * y))? What's the
reason to believe that / y is more "frequent"?
> Removing this pattern, as I would expect, means that the divisions in the
> above optimization (and the one further down) are not removed.
>
> So then there is the question of edge cases. This pattern is (ignoring the
> second case) going to fail when const_binop returns null. Looking through
> that function says that it will fail (for reals) when:
>
> - Either argument is null (not the case)
>
> - The operation is not in the list (PLUS_EXPR,
> MINUS_EXPR, MULT_EXPR, RDIV_EXPR, MIN_EXPR, MAX_EXPR)
> (again not the case)
>
> - We honor Signalling NaNs and one of the operands is a sNaN.
>
> - The operation is a division, and the second argument is zero
> and dividing by 0.0 raises an exception.
>
> - The result is infinity and neither of the operands were infinity
> and flag_trapping_math is set.
>
> - The result isn't exact and flag_rounding_math is set.
>
>
> For (x / ( y * C) -> (x / C) / y), I will add some gating for each of these
> so that the pattern is never executed if one of these would be the case.
Why not transform this directly to (x * (1/C)) / y then (and only then)?
That makes it obvious not two divisions prevail.
That said, I'm questioning this canonicalization. I can come up with a
testcase where it makes things worse:
tem = x / (y * C);
tem2 = z / (y * C);
should generate
rdivtmp = 1 / (y*C);
tem = x *rdivtmp;
tem2= z * rdivtmp;
instead of
rdivtmp = 1/y;
tem = x * 1/C * rdivtmp;
tem2 = z * 1/C * rdivtmp;
> The additional cases where this isn't converted to a multiplication by the
> reciprocal appear to be when -freciprocal-math is used, but we don't have
> -fno-rounding-math, or funsafe-math-optimizations.
>
> For the removal of (x / C +- y / C -> (x +- y) / C), there is a trade-off of
> whether it is worth having an extra pattern for these edge cases. On this
> I'm not sure.
Well, at least leave it in fold-const.c if you can't move it.
>>
>> You added
>>
>> +/* Simplify x / (- y) to -x / y. */
>> +(simplify
>> + (rdiv @0 (negate @1))
>> + (rdiv (negate @0) @1))
>
>
> Sorry, this was unclear, the changes to fold const should be split up from
> the additions to match.pd.
>
> This was one of the changes discussed before. The idea is to canonicalize
> this so that y can be extracted in the cse_reciprocals pass.
As said elsewhere I think cse_reciprocals needs a rewrite to not rely on
arbitrary canonicalization to do its work but consider association
opportunities globally with a focus on CSE.
>
>>
>> for
>>
>> /* (-A) / (-B) -> A / B */
>> if (TREE_CODE (arg0) == NEGATE_EXPR && negate_expr_p (arg1))
>> return fold_build2_loc (loc, RDIV_EXPR, type,
>> TREE_OPERAND (arg0, 0),
>> negate_expr (arg1));
>> if (TREE_CODE (arg1) == NEGATE_EXPR && negate_expr_p (arg0))
>> return fold_build2_loc (loc, RDIV_EXPR, type,
>> negate_expr (arg0),
>> TREE_OPERAND (arg1, 0));
>>
>> presumably? This isn't equivalent. It's more like
>>
>> (simplify
>> (rdiv (negate_expr_p @0) (negate @1))
>> (rdiv (negate @0) @1))
>> (simplify
>> (rdiv (negate @0) (negate_expr_p @1))
>> (rdiv @0 (negate @1)))
>>
>> and you should remove the corresponding fold-const.c code.
>>
>> Please do these changes independently to aid bisecting in case of issues.
>
>
> I'll resubmit two different patches when I can get them separated. It should
> also make it easier to review when it is clearer what belongs where.
Thanks.
>>
>> (if (flag_reciprocal_math)
>> - /* Convert (A/B)/C to A/(B*C) */
>> + /* Convert (A/B)/C to A/(B*C) where neither B nor C are constant. */
>> (simplify
>> (rdiv (rdiv:s @0 @1) @2)
>> - (rdiv @0 (mult @1 @2)))
>> + (if (TREE_CODE (@1) != REAL_CST && TREE_CODE (@1) != REAL_CST)
>> + (rdiv @0 (mult @1 @2))))
>>
>> why? I guess to avoid ping-poning with
>
>
> Yes, although there is a mistake there. It should be:
>
> (if (TREE_CODE (@1) != REAL_CST && TREE_CODE (@2) != REAL_CST)
>
> I'll fix that up.
>
>>
>> + /* Simplify x / (C * y) to (x / C) / y where C is a constant. */
>> + (simplify
>> + (rdiv @0
>> + (mult @1 REAL_CST@2))
>> + (rdiv (rdiv @0 @2) @1))
>>
>> ? If so why not just disallow for @1 == REAL_CST?
>
>
> The ping-ponging issue is there even when only one of the operands is a
> constant (which of course it has to be for this pattern to apply). For
> example, something like x / (y * 3), with '-O0 -ffast-math',
> when the reciprocal isn't computed, ping pongs back and forth.
>
> I'm not sure that it can be fixed without a condition on the pattern already
> there.
>
>
>>
>>> On O1 and up, the pattern that replaces 'x / C' with 'x * (1 / C)'
>>> is enabled and then the pattern is covered by that and
>>> (x * C +- y * C -> C * (x +- y)) (which is already present in match.pd)
>>>
>>> I have also updated the testcase for those optimizations to use 'O1' to
>>> avoid that case.
>>>
>>>
>>> On 08/24/2017 10:06 PM, Jeff Law wrote:
>>>>
>>>>
>>>> On 08/17/2017 03:55 AM, Wilco Dijkstra wrote:
>>>>>
>>>>>
>>>>> Richard Biener wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 15, 2017 at 4:11 PM, Wilco Dijkstra
>>>>>> <Wilco.Dijkstra@arm.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Richard Biener wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We also change the association of
>>>>>>>>>
>>>>>>>>> x / (y * C) -> (x / C) / y
>>>>>>>>>
>>>>>>>>> If C is a constant.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Why's that profitable?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> It enables (x * C1) / (y * C2) -> (x * C1/C2) / y for example.
>>>>>>> Also 1/y is now available to the reciprocal optimization, see
>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71026 for details.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Sure, but on its own it's going to be slower. So this isn't the
>>>>>> correct way to enable those followup transforms.
>>>>>
>>>>>
>>>>>
>>>>> How can it be any slower? It's one division and one multiply in both
>>>>> cases.
>>>>
>>>>
>>>> x / (y * C) -> (x / C) / y
>>>>
>>>> Goes from one division and one multiplication to two divisions. I'm
>>>> guessing that's what Richi is (reasonably) concerned about.
>>>>
>>>> So it may be the case that we need more complex pattern matching here
>>>> for when to perform the first transformation to ultimately enable the
>>>> second.
>>>>
>>>
>>> The only case where we don't remove the division but still execute this
>>> pattern is when run with'-O0 -freciprocal-math'.
>>>
>>> As long as we have 'O1' or greater (and -freciprocal-math), that is
>>> enough
>>> to enable the removal of the reciprocal.
>>
>>
>> I don't see this. I presume you mean this happens in the recip pass?
>
>
> It's one of the match.pd patterns that actually does the extraction since C
> is always constant. I've copied in the pattern above in response to one of
> your previous comments.
>
>> But we don't optimize this when optimizing for size (but the above pattern
>> > still applies) or when targetm.min_divisions_for_recip_mul is too large
>
>
> It was my understanding that this is used to specify the number of divisions
> you have to be able to replace with a multiplication before it is worthwhile
> inserting an extra multiplication.
>
> So for situations like this, I think that this is not quite what is being
> measured, because there isn't an extra multiplication being inserted?
>
>
>>
>> So I still think this is the wrong place to do this and instead the recip
>> pass should be extended.
>>
>>
>>>
>>> Jackson.
>>>
>>>>
>>>> Jeff
>>>>
>>>
>
next prev parent reply other threads:[~2017-08-30 12:46 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-10 14:11 Jackson Woodruff
2017-08-10 15:26 ` Jackson Woodruff
2017-08-11 0:15 ` Joseph Myers
2017-08-15 14:22 ` Richard Biener
2017-08-15 14:43 ` Wilco Dijkstra
2017-08-17 10:20 ` Richard Biener
2017-08-17 10:29 ` Wilco Dijkstra
2017-08-17 10:39 ` Richard Biener
2017-08-17 12:46 ` Joseph Myers
2017-08-24 21:26 ` Jeff Law
2017-08-29 12:13 ` Jackson Woodruff
2017-08-29 13:31 ` Richard Biener
2017-08-30 10:45 ` Jackson Woodruff
2017-08-30 13:26 ` Richard Biener [this message]
2017-09-06 9:55 ` Jackson Woodruff
2017-09-13 18:37 ` Jeff Law
2017-09-13 21:20 ` Wilco Dijkstra
2017-09-15 12:07 ` Richard Biener
2017-09-15 16:50 ` Jeff Law
2017-09-16 21:39 ` Bernhard Reutner-Fischer
2017-10-13 18:18 ` Jeff Law
2017-10-13 18:53 ` Wilco Dijkstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAFiYyc0Su3E_qu=pucVnPfH2O+9pz+=AJAtGdkgFvgiwLA=7mQ@mail.gmail.com' \
--to=richard.guenther@gmail.com \
--cc=Wilco.Dijkstra@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jackson.woodruff@foss.arm.com \
--cc=joseph@codesourcery.com \
--cc=kyrylo.tkachov@foss.arm.com \
--cc=law@redhat.com \
--cc=nd@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).