public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC] Analysis of PR105586 and possible approaches to fix issue
@ 2022-07-26 18:54 Surya Kumari Jangala
  2022-07-27  6:58 ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Surya Kumari Jangala @ 2022-07-26 18:54 UTC (permalink / raw)
  To: gcc

Hi,
I am working on PR105586. This is a -fcompare-debug failure, with the differences starting during sched1 pass. The sequence of two instructions in a basic block (block 4) is flipped with -g.
In addition to this, another difference is that an insn is assigned a different cycle in debug vs non-debug modes.
More specifically, the 2nd instruction in basic block 4 is assigned to cycle 0 w/o -g but to cycle 1 w/ -g. I felt that this could be resulting in the flipping of the later insns in the bb, so I started to investigate the difference in cycle assignment.

In the routine schedule_block(), after scheduling an insn(schedule_insn()), prune_ready_list() is called if the ready list is not empty. This routine goes thru all the insns in the ready list and for each insn it checks if there is a state transition. If there is no state transition, then INSN_TICK(insn) is set to current_cycle+1.

After scheduling the first insn in bb 4, when prune_ready_list() is called, we see that for the debug mode run, there is no state transition for the second insn and hence it's INSN_TICK is updated. For the non-debug run, a state transition exists and the INSN_TICK is not updated. This was resulting in the second insn being scheduled in cycle 1 in the debug mode, and in cycle 0 in the non-debug mode.

It turned out that the initial dfa state of the basic block (‘init_state’ parameter of schedule_block()) was different in debug and non-debug modes. 

After scheduling a basic block, it’s current dfa state is copied to the fall-thru basic block. In other words, the initial dfa state of the fall thru bb is the current state of the bb that was just scheduled. 

Basic block 4 is the fall-thru bb for basic block 3. In non-debug mode, bb 3 has only a NOTE insn and hence scheduling of bb 3 is skipped. Since bb 3 is not scheduled, it’s state is not copied to bb 4. Whereas in debug mode, bb3 has a NOTE insn and a DEBUG insn. So bb 3 is “scheduled” and it’s dfa state is copied to bb4. [The dfa state of bb 3 is obtained from it’s parent bb, ie, bb 2]. Hence the initial dfa state of bb 4 is different in debug and non-debug modes due to the difference in the insns in the predecessor bb (bb 3).

The routine no_real_insns_p() is called to check if scheduling can be skipped for a basic block. This routine checks for NOTE and LABEL insns and it returns ‘true’ if a basic block contains only NOTE/LABEL insns. Hence, any basic block which has only NOTE or LABEL insns is not scheduled.

To fix the issue of insns being assigned different cycles, there are two possible solutions:

1. Modify no_real_insns_p() to treat a DEBUG insn as a non-real insn (similar to NOTE and LABEL). With this change, bb 3 will not be scheduled in the debug mode (as it contains only NOTE and DEBUG insns). If scheduling is skipped, then bb 3’s state is not copied to bb 4 and the initial dfa state of bb 4 will be same in both debug and non-debug modes
2. Copy dfa state of a basic block to it’s fall-thru block only if the basic block contains ‘real’ insns (ie, it should contain at least one insn which is not a LABEL, NOTE or DEBUG). This will prevent copying of dfa state from bb 3 to bb 4 in debug mode.

I decided to take approach 1 and I changed no_real_insns_p() to check for DEBUG insns in addition to NOTE and LABEL insns.

But this resulted in an internal compiler error in the bug's testcase. The reason was that dependency nodes and lists of the insns in a basic block are not freed after the bb is scheduled. The routine sched_analyze() allocates dependency nodes and lists for each insn in an extended basic block only if the insn is NONDEBUG_INSN or DEBUG_INSN. So in debug mode, the scheduler allocated dependencies for bb 3 but in non-debug mode, there are no dependencies allocated. The dependencies are freed after all the blocks in a region are scheduled. But the routine to free the dependencies is called for a bb only if no_real_insns_p() returns true for that bb. With approach 1, no_real_insns_p() returns true for bb 3 and hence the dependencies are not freed.

I added some code to not create dependencies if a bb contains only NOTE, LABEL and DEBUG insns. This makes the test pass but I am hitting an internal compiler error during bootstrapping.

I wish to get some inputs/feedback if approach 1 is the correct way to fix the issue, or should I take approach 2.

Thanks,
Surya

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] Analysis of PR105586 and possible approaches to fix issue
  2022-07-26 18:54 [RFC] Analysis of PR105586 and possible approaches to fix issue Surya Kumari Jangala
@ 2022-07-27  6:58 ` Richard Biener
  2022-07-27 14:10   ` Surya Kumari Jangala
  2022-07-30 20:01   ` Jeff Law
  0 siblings, 2 replies; 4+ messages in thread
From: Richard Biener @ 2022-07-27  6:58 UTC (permalink / raw)
  To: Surya Kumari Jangala; +Cc: GCC Development

On Tue, Jul 26, 2022 at 8:55 PM Surya Kumari Jangala via Gcc
<gcc@gcc.gnu.org> wrote:
>
> Hi,
> I am working on PR105586. This is a -fcompare-debug failure, with the differences starting during sched1 pass. The sequence of two instructions in a basic block (block 4) is flipped with -g.
> In addition to this, another difference is that an insn is assigned a different cycle in debug vs non-debug modes.
> More specifically, the 2nd instruction in basic block 4 is assigned to cycle 0 w/o -g but to cycle 1 w/ -g. I felt that this could be resulting in the flipping of the later insns in the bb, so I started to investigate the difference in cycle assignment.
>
> In the routine schedule_block(), after scheduling an insn(schedule_insn()), prune_ready_list() is called if the ready list is not empty. This routine goes thru all the insns in the ready list and for each insn it checks if there is a state transition. If there is no state transition, then INSN_TICK(insn) is set to current_cycle+1.
>
> After scheduling the first insn in bb 4, when prune_ready_list() is called, we see that for the debug mode run, there is no state transition for the second insn and hence it's INSN_TICK is updated. For the non-debug run, a state transition exists and the INSN_TICK is not updated. This was resulting in the second insn being scheduled in cycle 1 in the debug mode, and in cycle 0 in the non-debug mode.
>
> It turned out that the initial dfa state of the basic block (‘init_state’ parameter of schedule_block()) was different in debug and non-debug modes.
>
> After scheduling a basic block, it’s current dfa state is copied to the fall-thru basic block. In other words, the initial dfa state of the fall thru bb is the current state of the bb that was just scheduled.
>
> Basic block 4 is the fall-thru bb for basic block 3. In non-debug mode, bb 3 has only a NOTE insn and hence scheduling of bb 3 is skipped. Since bb 3 is not scheduled, it’s state is not copied to bb 4. Whereas in debug mode, bb3 has a NOTE insn and a DEBUG insn. So bb 3 is “scheduled” and it’s dfa state is copied to bb4. [The dfa state of bb 3 is obtained from it’s parent bb, ie, bb 2]. Hence the initial dfa state of bb 4 is different in debug and non-debug modes due to the difference in the insns in the predecessor bb (bb 3).
>
> The routine no_real_insns_p() is called to check if scheduling can be skipped for a basic block. This routine checks for NOTE and LABEL insns and it returns ‘true’ if a basic block contains only NOTE/LABEL insns. Hence, any basic block which has only NOTE or LABEL insns is not scheduled.
>
> To fix the issue of insns being assigned different cycles, there are two possible solutions:
>
> 1. Modify no_real_insns_p() to treat a DEBUG insn as a non-real insn (similar to NOTE and LABEL). With this change, bb 3 will not be scheduled in the debug mode (as it contains only NOTE and DEBUG insns). If scheduling is skipped, then bb 3’s state is not copied to bb 4 and the initial dfa state of bb 4 will be same in both debug and non-debug modes
> 2. Copy dfa state of a basic block to it’s fall-thru block only if the basic block contains ‘real’ insns (ie, it should contain at least one insn which is not a LABEL, NOTE or DEBUG). This will prevent copying of dfa state from bb 3 to bb 4 in debug mode.

Do you know why the DFA state is not always copied to the fallthru
destination and then copied further even if the block does not contain
any (real) insns?  It somewhat sounds like premature optimization
breaking things here...

> I decided to take approach 1 and I changed no_real_insns_p() to check for DEBUG insns in addition to NOTE and LABEL insns.
>
> But this resulted in an internal compiler error in the bug's testcase. The reason was that dependency nodes and lists of the insns in a basic block are not freed after the bb is scheduled. The routine sched_analyze() allocates dependency nodes and lists for each insn in an extended basic block only if the insn is NONDEBUG_INSN or DEBUG_INSN. So in debug mode, the scheduler allocated dependencies for bb 3 but in non-debug mode, there are no dependencies allocated. The dependencies are freed after all the blocks in a region are scheduled. But the routine to free the dependencies is called for a bb only if no_real_insns_p() returns true for that bb. With approach 1, no_real_insns_p() returns true for bb 3 and hence the dependencies are not freed.
>
> I added some code to not create dependencies if a bb contains only NOTE, LABEL and DEBUG insns. This makes the test pass but I am hitting an internal compiler error during bootstrapping.
>
> I wish to get some inputs/feedback if approach 1 is the correct way to fix the issue, or should I take approach 2.
>
> Thanks,
> Surya

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] Analysis of PR105586 and possible approaches to fix issue
  2022-07-27  6:58 ` Richard Biener
@ 2022-07-27 14:10   ` Surya Kumari Jangala
  2022-07-30 20:01   ` Jeff Law
  1 sibling, 0 replies; 4+ messages in thread
From: Surya Kumari Jangala @ 2022-07-27 14:10 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Development

Hi Richard,

On 27/07/22 12:28 pm, Richard Biener wrote:
> On Tue, Jul 26, 2022 at 8:55 PM Surya Kumari Jangala via Gcc
> <gcc@gcc.gnu.org> wrote:

>> To fix the issue of insns being assigned different cycles, there are two possible solutions:
>>
>> 1. Modify no_real_insns_p() to treat a DEBUG insn as a non-real insn (similar to NOTE and LABEL). With this change, bb 3 will not be scheduled in the debug mode (as it contains only NOTE and DEBUG insns). If scheduling is skipped, then bb 3’s state is not copied to bb 4 and the initial dfa state of bb 4 will be same in both debug and non-debug modes
>> 2. Copy dfa state of a basic block to it’s fall-thru block only if the basic block contains ‘real’ insns (ie, it should contain at least one insn which is not a LABEL, NOTE or DEBUG). This will prevent copying of dfa state from bb 3 to bb 4 in debug mode.
> 
> Do you know why the DFA state is not always copied to the fallthru
> destination and then copied further even if the block does not contain

I am not sure why the DFA state is not always copied to the fallthru destination. It is not very apparent from the code.

> any (real) insns?  It somewhat sounds like premature optimization
> breaking things here...
> 

Now that you mention it, yes it does seem suboptimal that the DFA state is not always copied. I don’t see any reason why the DFA state shouldn’t be always copied. And I think this is the fix for this bug. '-g' mode is behaving correctly, it is the non-debug mode which is incorrect.

Thanks,
Surya


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] Analysis of PR105586 and possible approaches to fix issue
  2022-07-27  6:58 ` Richard Biener
  2022-07-27 14:10   ` Surya Kumari Jangala
@ 2022-07-30 20:01   ` Jeff Law
  1 sibling, 0 replies; 4+ messages in thread
From: Jeff Law @ 2022-07-30 20:01 UTC (permalink / raw)
  To: gcc



On 7/27/2022 12:58 AM, Richard Biener via Gcc wrote:
> On Tue, Jul 26, 2022 at 8:55 PM Surya Kumari Jangala via Gcc
> <gcc@gcc.gnu.org> wrote:
>> Hi,
>> I am working on PR105586. This is a -fcompare-debug failure, with the differences starting during sched1 pass. The sequence of two instructions in a basic block (block 4) is flipped with -g.
>> In addition to this, another difference is that an insn is assigned a different cycle in debug vs non-debug modes.
>> More specifically, the 2nd instruction in basic block 4 is assigned to cycle 0 w/o -g but to cycle 1 w/ -g. I felt that this could be resulting in the flipping of the later insns in the bb, so I started to investigate the difference in cycle assignment.
>>
>> In the routine schedule_block(), after scheduling an insn(schedule_insn()), prune_ready_list() is called if the ready list is not empty. This routine goes thru all the insns in the ready list and for each insn it checks if there is a state transition. If there is no state transition, then INSN_TICK(insn) is set to current_cycle+1.
>>
>> After scheduling the first insn in bb 4, when prune_ready_list() is called, we see that for the debug mode run, there is no state transition for the second insn and hence it's INSN_TICK is updated. For the non-debug run, a state transition exists and the INSN_TICK is not updated. This was resulting in the second insn being scheduled in cycle 1 in the debug mode, and in cycle 0 in the non-debug mode.
>>
>> It turned out that the initial dfa state of the basic block (‘init_state’ parameter of schedule_block()) was different in debug and non-debug modes.
>>
>> After scheduling a basic block, it’s current dfa state is copied to the fall-thru basic block. In other words, the initial dfa state of the fall thru bb is the current state of the bb that was just scheduled.
>>
>> Basic block 4 is the fall-thru bb for basic block 3. In non-debug mode, bb 3 has only a NOTE insn and hence scheduling of bb 3 is skipped. Since bb 3 is not scheduled, it’s state is not copied to bb 4. Whereas in debug mode, bb3 has a NOTE insn and a DEBUG insn. So bb 3 is “scheduled” and it’s dfa state is copied to bb4. [The dfa state of bb 3 is obtained from it’s parent bb, ie, bb 2]. Hence the initial dfa state of bb 4 is different in debug and non-debug modes due to the difference in the insns in the predecessor bb (bb 3).
>>
>> The routine no_real_insns_p() is called to check if scheduling can be skipped for a basic block. This routine checks for NOTE and LABEL insns and it returns ‘true’ if a basic block contains only NOTE/LABEL insns. Hence, any basic block which has only NOTE or LABEL insns is not scheduled.
>>
>> To fix the issue of insns being assigned different cycles, there are two possible solutions:
>>
>> 1. Modify no_real_insns_p() to treat a DEBUG insn as a non-real insn (similar to NOTE and LABEL). With this change, bb 3 will not be scheduled in the debug mode (as it contains only NOTE and DEBUG insns). If scheduling is skipped, then bb 3’s state is not copied to bb 4 and the initial dfa state of bb 4 will be same in both debug and non-debug modes
>> 2. Copy dfa state of a basic block to it’s fall-thru block only if the basic block contains ‘real’ insns (ie, it should contain at least one insn which is not a LABEL, NOTE or DEBUG). This will prevent copying of dfa state from bb 3 to bb 4 in debug mode.
> Do you know why the DFA state is not always copied to the fallthru
> destination and then copied further even if the block does not contain
> any (real) insns?  It somewhat sounds like premature optimization
> breaking things here...
Yea (premature optimization) and probably just an oversight thinking 
that a block with no real insns could be totally ignored.

jeff


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-07-30 20:01 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-26 18:54 [RFC] Analysis of PR105586 and possible approaches to fix issue Surya Kumari Jangala
2022-07-27  6:58 ` Richard Biener
2022-07-27 14:10   ` Surya Kumari Jangala
2022-07-30 20:01   ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).