Hi Richard,

On 14/09/16 21:31, Richard Biener wrote:
> On Fri, Sep 2, 2016 at 10:09 AM, Kugan Vivekanandarajah
> <kugan.vivekanandarajah@linaro.org> wrote:
>> Hi Richard,
>>
>> On 25 August 2016 at 22:24, Richard Biener <richard.guenther@gmail.com> wrote:
>>> On Thu, Aug 11, 2016 at 1:09 AM, kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>> Hi,
>>>>
>>>>
>>>> On 10/08/16 20:28, Richard Biener wrote:
>>>>>
>>>>> On Wed, Aug 10, 2016 at 10:57 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>>>
>>>>>> On Wed, Aug 10, 2016 at 08:51:32AM +1000, kugan wrote:
>>>>>>>
>>>>>>> I see it now. The problem is we are just looking at (-1) being in the
>>>>>>> ops
>>>>>>> list for passing changed to rewrite_expr_tree in the case of
>>>>>>> multiplication
>>>>>>> by negate.  If we have combined (-1), as in the testcase, we will not
>>>>>>> have
>>>>>>> the (-1) and will pass changed=false to rewrite_expr_tree.
>>>>>>>
>>>>>>> We should set changed based on what happens in try_special_add_to_ops.
>>>>>>> Attached patch does this. Bootstrap and regression testing are ongoing.
>>>>>>> Is this OK for trunk if there are no regressions?
>>>>>>
>>>>>>
>>>>>> I think the bug is elsewhere.  In particular in
>>>>>> undistribute_ops_list/zero_one_operation/decrement_power.
>>>>>> All those look problematic in this regard, they change RHS of statements
>>>>>> to something that holds a different value, while keeping the LHS.
>>>>>> So, generally you should instead just add a new stmt next to the old one,
>>>>>> and adjust data structures (replace the old SSA_NAME in some ->op with
>>>>>> the new one).  decrement_power might be a problem here, dunno if all the
>>>>>> builtins are const in all cases that DSE would kill the old one,
>>>>>> Richard, any preferences for that?  reset flow sensitive info + reset
>>>>>> debug
>>>>>> stmt uses, or something different?  Though, replacing the LHS with a new
>>>>>> anonymous SSA_NAME might be needed too, in case it is before SSA_NAME of
>>>>>> a
>>>>>> user var that doesn't yet have any debug stmts.
>>>>>
>>>>>
>>>>> I'd say replacing the LHS is the way to go, with calling the appropriate
>>>>> helper
>>>>> on the old stmt to generate a debug stmt for it / its uses (would need
>>>>> to look it
>>>>> up here).
>>>>>
>>>>
>>>> Here is an attempt to fix it. The problem arises when, in
>>>> undistribute_ops_list, we linearize_expr_tree such that NEGATE_EXPR is added
>>>> as (-1) MULT_EXPR (OP). The real problem starts when we handle this in
>>>> zero_one_operation. Unlike what was done earlier, we now change the stmt
>>>> (with propagate_op_to_single_use or directly) such that the value
>>>> computed by stmt is no longer what it used to be. Because of this, what is
>>>> computed in undistribute_ops_list and rewrite_expr_tree is also changed.
>>>>
>>>> undistribute_ops_list already expects this, but rewrite_expr_tree will not
>>>> if we don't pass changed as an argument.
>>>>
>>>> The way I am fixing this now is, in linearize_expr_tree, I set ops_changed
>>>> to true if we change NEGATE_EXPR to (-1) MULT_EXPR (OP). Then when we call
>>>> zero_one_operation with ops_changed = true, I replace all the LHS in
>>>> zero_one_operation with the new SSA and replace all the uses. I also call
>>>> the rewrite_expr_tree with changed = false in this case.
>>>>
>>>> Does this make sense? Bootstrapped and regression tested for
>>>> x86_64-linux-gnu without any new regressions.
>>>
>>> I don't think this solves the issue.  zero_one_operation associates the
>>> chain starting at the first *def and it will change the intermediate values
>>> of _all_ of the stmts visited until the operation to be removed is found.
>>> Note that this is independent of whether try_special_add_to_ops did anything.
>>>
>>> Even for the regular undistribution cases we get this wrong.
>>>
>>> So we need to back-track in zero_one_operation, replacing each LHS
>>> and in the end the op in the opvector of the main chain.  That's basically
>>> the same as if we'd do a regular re-assoc operation on the sub-chains.
>>> Take their subops, simulate zero_one_operation by
>>> appending the cancelling operation and optimizing the oplist, and then
>>> materializing the associated ops via rewrite_expr_tree.
>>>
>> Here is a draft patch which records the stmt chain when in
>> zero_one_operation and then fixes it when OP is removed. When we
>> update *def, that will update the ops vector. Does this look sane?
>
> Yes.  A few comments below
>
> +  /* PR72835 - Record the stmt chain that has to be updated such that
> +     we dont use the same LHS when the values computed are different.  */
> +  auto_vec<gimple *> stmts_to_fix;
>
> Use auto_vec<gimple *, 64> here so that most of the time we only need
> stack allocation.
Done.
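
As a rough illustration of why the inline capacity helps, here is a toy
container written outside of GCC. It keeps the first N elements in a
fixed-size member array (on the stack when the object is a local) and only
touches the heap once that capacity is exceeded. The name inline_vec and its
members push, pop, size and spilled are made up for this sketch; they are not
the vec.h API, and GCC's real auto_vec differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a vector with inline capacity N (hypothetical, not
// GCC's auto_vec).  N must be at least 1 and T default-constructible.
template <typename T, std::size_t N>
class inline_vec
{
public:
  // Append VALUE; only the (N+1)th and later elements use the heap.
  void push (const T &value)
  {
    if (size_ < N)
      inline_[size_] = value;
    else
      heap_.push_back (value);
    ++size_;
  }

  // Remove and return the last element.
  T pop ()
  {
    assert (size_ > 0);
    --size_;
    if (size_ < N)
      return inline_[size_];
    T value = heap_.back ();
    heap_.pop_back ();
    return value;
  }

  // Read element I, looking in inline storage first.
  const T &operator[] (std::size_t i) const
  {
    assert (i < size_);
    return i < N ? inline_[i] : heap_[i - N];
  }

  std::size_t size () const { return size_; }

  // True once at least one element lives on the heap.
  bool spilled () const { return !heap_.empty (); }

private:
  T inline_[N] = {};      // inline storage, part of the object itself
  std::vector<T> heap_;   // overflow storage, unused for short chains
  std::size_t size_ = 0;
};
```

With N chosen above the typical chain length, spilled () stays false for
almost every call, which is the effect the 64 is meant to have for
stmts_to_fix.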

>           if (stmt_is_power_of_op (stmt, op))
>             {
> +             make_new_ssa_for_all_defs (def, op, stmts_to_fix);
>               if (decrement_power (stmt) == 1)
>                 propagate_op_to_single_use (op, stmt, def);
>
> For the cases where you end up in propagate_op_to_single_use, its argument
> stmt is handled superfluously when making the new SSA names; I suggest
> popping it from the stmts_to_fix vector in that case.  I suggest to break;
> instead of return in all cases and do the make_new_ssa_for_all_defs call at
> the function end instead.
>
Done.
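
For reference, the shape of the change can be modelled outside GCC roughly as
below. This is only a sketch under simplifying assumptions, not the patch
itself: toy_stmt, replace_use, make_fresh_name and remove_one_op are
hypothetical names, every stmt is a two-operand multiply, and the chain is
stored from *def downwards with each stmt's result used only by the stmt
before it. It records the stmts visited while searching for OP, breaks out
of the walk instead of returning, and gives the recorded stmts fresh LHS
names at the end, since their values have changed. The stmt that is
propagated away is not recorded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a GIMPLE assignment "lhs = rhs1 * rhs2".
struct toy_stmt
{
  std::string lhs;
  std::string rhs1;
  std::string rhs2;
};

// Replace a use of OLD_NAME in STMT with NEW_NAME.
static void
replace_use (toy_stmt &stmt, const std::string &old_name,
             const std::string &new_name)
{
  if (stmt.rhs1 == old_name)
    stmt.rhs1 = new_name;
  else if (stmt.rhs2 == old_name)
    stmt.rhs2 = new_name;
}

// Hypothetical analogue of creating a new anonymous SSA name.
static std::string
make_fresh_name ()
{
  static int counter = 0;
  return "_new" + std::to_string (++counter);
}

// Remove one occurrence of OP from CHAIN (chain[0] defines *def, and
// chain[i] is used only by chain[i - 1]).  Returns the new name for *def.
std::string
remove_one_op (std::vector<toy_stmt> &chain, const std::string &op)
{
  std::vector<std::size_t> stmts_to_fix;
  std::size_t i = 0;
  for (; i < chain.size (); ++i)
    {
      if (chain[i].rhs1 == op || chain[i].rhs2 == op)
        break;                      // found it; fix-up happens below
      stmts_to_fix.push_back (i);   // value of this stmt will change
    }
  assert (i < chain.size () && "op must occur in the chain");

  // Propagate the other operand into the single use and drop the stmt.
  std::string other = chain[i].rhs1 == op ? chain[i].rhs2 : chain[i].rhs1;
  std::string dead_lhs = chain[i].lhs;
  chain.erase (chain.begin () + i);
  if (i == 0)
    return other;                   // *def itself was the stmt removed
  replace_use (chain[i - 1], dead_lhs, other);

  // Give every visited stmt a new LHS, innermost first, updating its user.
  for (std::size_t k = stmts_to_fix.size (); k-- > 0;)
    {
      std::size_t idx = stmts_to_fix[k];
      std::string old_lhs = chain[idx].lhs;
      chain[idx].lhs = make_fresh_name ();
      if (idx > 0)
        replace_use (chain[idx - 1], old_lhs, chain[idx].lhs);
    }
  return chain[0].lhs;
}
```

For a chain t3 = t2 * c; t2 = t1 * b; t1 = a * op, removing op leaves two
stmts computing a * b and (a * b) * c under new names, so the old names t2
and t3 are never reused for the changed values.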

> @@ -1253,14 +1305,18 @@ zero_one_operation (tree *def, enum tree_code opcode, tree op)
>               if (gimple_assign_rhs1 (stmt2) == op)
>                 {
>                   tree cst = build_minus_one_cst (TREE_TYPE (op));
> +                 stmts_to_fix.safe_push (stmt2);
> +                 make_new_ssa_for_all_defs (def, op, stmts_to_fix);
>                   propagate_op_to_single_use (cst, stmt2, def);
>                   return;
>
> this safe_push should be unnecessary for the above reason (others are
> conditionally unnecessary).
>
Done.

Bootstrapped and regression tested on x86_64-linux-gnu with no new
regressions. Is this OK?

Thanks,
Kugan

> I thought about simplifying the whole thing: instead of clearing an
> op from the chain, prepend one that does the job by means of visiting
> the chain from reassoc itself. But that doesn't work out for RDIV_EXPR,
> nor does it play well with undistribute handling multiple
> opportunities on the same chain.
>
> Thanks,
> Richard.
>
>
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu with no new regressions.
>>
>> Thanks,
>> Kugan