public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Dependency confusion in sched-deps
@ 2013-11-07 12:48 Paulo Matos
  2013-12-05  0:39 ` Maxim Kuvyrkov
  0 siblings, 1 reply; 8+ messages in thread
From: Paulo Matos @ 2013-11-07 12:48 UTC (permalink / raw)
  To: gcc

Hello,

I am slightly unsure if the confusion is in the dependencies or it's my confusion.

I have tracked this strange behaviour which only occurs when we need to flush pending instructions due to the pending list becoming too large (gcc 4.8, haven't tried with trunk).

I have two stores: 
85: st zr, [r12] # zr is the zero register
90: st zr, [r18]

While analysing dependencies for `st zr, [r12]`, we notice that pending list is too large in sched_analyze_1 and call flush_pending_lists (deps, insn, false, true).

This in turn causes the last_pending_memory_flush to be set to:
(insn_list:REG_DEP_TRUE 85 (nil))

When insn 90 is analyzed next, it skips the flushing bit since the pending lists had just been flushed and enters the else bit where it does:
	  add_dependence_list (insn, deps->last_pending_memory_flush, 1,
			       REG_DEP_ANTI, true);

This adds the dependency: 90 has an anti-dependency to 85.
I think this should be a true dependency (write after write). It even says so in the list of last_pending_memory_flush, however add_dependence_list function ignored this and uses the dep_type passed: REG_DEP_ANTI.

Is anti the correct dependence? Why?

-- 
Paulo Matos



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Dependency confusion in sched-deps
  2013-11-07 12:48 Dependency confusion in sched-deps Paulo Matos
@ 2013-12-05  0:39 ` Maxim Kuvyrkov
  2013-12-05 15:25   ` Michael Matz
  2013-12-05 19:44   ` shmeel gutl
  0 siblings, 2 replies; 8+ messages in thread
From: Maxim Kuvyrkov @ 2013-12-05  0:39 UTC (permalink / raw)
  To: Paulo Matos; +Cc: gcc

On 8/11/2013, at 1:48 am, Paulo Matos <pmatos@broadcom.com> wrote:

> Hello,
> 
> I am slightly unsure if the confusion is in the dependencies or it's my confusion.
> 
> I have tracked this strange behaviour which only occurs when we need to flush pending instructions due to the pending list becoming too large (gcc 4.8, haven't tried with trunk).
> 
> I have two stores: 
> 85: st zr, [r12] # zr is the zero register
> 90: st zr, [r18]
> 
> While analysing dependencies for `st zr, [r12]`, we notice that pending list is too large in sched_analyze_1 and call flush_pending_lists (deps, insn, false, true).
> 
> This in turn causes the last_pending_memory_flush to be set to:
> (insn_list:REG_DEP_TRUE 85 (nil))
> 
> When insn 90 is analyzed next, it skips the flushing bit since the pending lists had just been flushed and enters the else bit where it does:
> 	  add_dependence_list (insn, deps->last_pending_memory_flush, 1,
> 			       REG_DEP_ANTI, true);
> 
> This adds the dependency: 90 has an anti-dependency to 85.
> I think this should be a true dependency (write after write). It even says so in the list of last_pending_memory_flush, however add_dependence_list function ignored this and uses the dep_type passed: REG_DEP_ANTI.
> 
> Is anti the correct dependence? Why?

Output dependency is the right type (write after write).  Anti dependency is write after read, and true dependency is read after write.

Dependency type plays a role for estimating costs and latencies between instructions (which affects performance), but using wrong or imprecise dependency type does not affect correctness.  Dependency flush is a force-major occurrence during compilation, and developers tend not to spend too much time on coding best possible handling for these [hopefully] rare occurrences.

Anti dependency is a good guess for dependency type between two memory instructions.  In the above particular case it is wrong, and, I imagine, this causes a performance problem for you.  You can add better handling of this situation by remembering whether last_pending_memory_flush is memory read or memory write and then use it to select correct dependency type for insn 90: output, anti or true.

Let me know whether you want to pursue this and I can help with advice and patch review.

Thanks, 

--
Maxim Kuvyrkov
www.kugelworks.com


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Dependency confusion in sched-deps
  2013-12-05  0:39 ` Maxim Kuvyrkov
@ 2013-12-05 15:25   ` Michael Matz
  2013-12-05 23:08     ` Maxim Kuvyrkov
  2013-12-05 19:44   ` shmeel gutl
  1 sibling, 1 reply; 8+ messages in thread
From: Michael Matz @ 2013-12-05 15:25 UTC (permalink / raw)
  To: Maxim Kuvyrkov; +Cc: Paulo Matos, gcc

Hi,

On Thu, 5 Dec 2013, Maxim Kuvyrkov wrote:

> Output dependency is the right type (write after write).  Anti 
> dependency is write after read, and true dependency is read after write.
> 
> Dependency type plays a role for estimating costs and latencies between 
> instructions (which affects performance), but using wrong or imprecise 
> dependency type does not affect correctness.

In the context of GCC and the middle ends memory model this statement is 
not correct.  For some dependency types we're using type based aliasing to 
disambiguate, i.e ignore that dependency, which for others we don't.  In 
particular a read-after-write memory-access dependency can be ignored if 
type info says they can't alias (because a program where both _would_ 
access the same memory would be invalid according to our mem model), but 
for write-after-read or write-after-write we cannot do that disambiguation 
(because the last write overrides the dynamic type of the memory cell even 
if it was incompatible with the one before).


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Dependency confusion in sched-deps
  2013-12-05  0:39 ` Maxim Kuvyrkov
  2013-12-05 15:25   ` Michael Matz
@ 2013-12-05 19:44   ` shmeel gutl
  2013-12-05 23:34     ` Maxim Kuvyrkov
  1 sibling, 1 reply; 8+ messages in thread
From: shmeel gutl @ 2013-12-05 19:44 UTC (permalink / raw)
  To: gcc

On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
> Dependency type plays a role for estimating costs and latencies between instructions (which affects performance), but using wrong or imprecise dependency type does not affect correctness.
On multi-issue architectures it does make a difference. Anti dependence 
permits the two instructions to be issued during the same cycle whereas 
true dependency and output dependency would forbid this.

Or am I misinterpreting your comment?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Dependency confusion in sched-deps
  2013-12-05 15:25   ` Michael Matz
@ 2013-12-05 23:08     ` Maxim Kuvyrkov
  0 siblings, 0 replies; 8+ messages in thread
From: Maxim Kuvyrkov @ 2013-12-05 23:08 UTC (permalink / raw)
  To: Michael Matz; +Cc: Paulo Matos, gcc

On 6/12/2013, at 4:25 am, Michael Matz <matz@suse.de> wrote:

> Hi,
> 
> On Thu, 5 Dec 2013, Maxim Kuvyrkov wrote:
> 
>> Output dependency is the right type (write after write).  Anti 
>> dependency is write after read, and true dependency is read after write.
>> 
>> Dependency type plays a role for estimating costs and latencies between 
>> instructions (which affects performance), but using wrong or imprecise 
>> dependency type does not affect correctness.
> 
> In the context of GCC and the middle ends memory model this statement is 
> not correct.  For some dependency types we're using type based aliasing to 
> disambiguate, i.e ignore that dependency, which for others we don't.  In 
> particular a read-after-write memory-access dependency can be ignored if 
> type info says they can't alias (because a program where both _would_ 
> access the same memory would be invalid according to our mem model), but 
> for write-after-read or write-after-write we cannot do that disambiguation 
> (because the last write overrides the dynamic type of the memory cell even 
> if it was incompatible with the one before).

Yes, this is correct for dependencies between memory locations in the general context of GCC.  [Below clarifications are for Paolo's benefit and anyone else's who wants to find out how GCC scheduling works.]

Scheduler dependency analysis is a user of the aforementioned alias analysis and it simply won't create a dependency between instructions if alias analysis tells it that it is OK to do so.  In the context of scheduler, the dependencies (and their types) are between instructions, not individual registers or memory locations.  The mere fact of two instructions having a dependency of any kind will make the scheduler produce correct code.  The difference between two instructions having true vs anti vs output dependency will manifest itself in how close the 2nd instruction will be issued to the 1st one.

Furthermore, when two instructions have dependencies on several items (e.g., both on register and on memory location), the resulting dependency type is set to the greater of dependency types of all dependent items: true-dependency having most weight, followed by anti-dependency, followed by output-dependency.

Consider instructions

[r1] = r2
r1 = [r2]

The scheduler dependency analysis will find an anti-dependency on r1 and true-dependency on memory locations (assuming [r1] and [r2] may alias).  The resulting dependency between instructions will be true-dependency and the instructions will be scheduled several cycles apart.  However, one might argue that [r1] and [r2] are unlikely to alias and scheduling these instructions back-to-back (downgrading dependency type from true to anti) would produce better code on average.  This is one of countless improvements that could be made to GCC scheduler.

--
Maxim Kuvyrkov
www.kugelworks.com


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Dependency confusion in sched-deps
  2013-12-05 19:44   ` shmeel gutl
@ 2013-12-05 23:34     ` Maxim Kuvyrkov
  2013-12-06  8:44       ` shmeel gutl
  0 siblings, 1 reply; 8+ messages in thread
From: Maxim Kuvyrkov @ 2013-12-05 23:34 UTC (permalink / raw)
  To: shmeel gutl; +Cc: gcc

On 6/12/2013, at 8:44 am, shmeel gutl <shmeelgutl@shmuelhome.mine.nu> wrote:

> On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
>> Dependency type plays a role for estimating costs and latencies between instructions (which affects performance), but using wrong or imprecise dependency type does not affect correctness.
> On multi-issue architectures it does make a difference. Anti dependence permits the two instructions to be issued during the same cycle whereas true dependency and output dependency would forbid this.
> 
> Or am I misinterpreting your comment?

On VLIW-flavoured machines without resource conflict checking -- "yes", it is critical not to use anti dependency where an output or true dependency exist.  This is the case though, only because these machines do not follow sequential semantics for instruction execution (i.e., effects from previous instructions are not necessarily observed by subsequent instructions on the same/close cycles.

On machines with internal resource conflict checking having a wrong type on the dependency should not cause wrong behavior, but "only" suboptimal performance.

Thank you,

--
Maxim Kuvyrkov
www.kugelworks.com


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Dependency confusion in sched-deps
  2013-12-05 23:34     ` Maxim Kuvyrkov
@ 2013-12-06  8:44       ` shmeel gutl
  2013-12-10 21:55         ` Maxim Kuvyrkov
  0 siblings, 1 reply; 8+ messages in thread
From: shmeel gutl @ 2013-12-06  8:44 UTC (permalink / raw)
  To: gcc

On 06-Dec-13 01:34 AM, Maxim Kuvyrkov wrote:
> On 6/12/2013, at 8:44 am, shmeel gutl <shmeelgutl@shmuelhome.mine.nu> wrote:
>
>> On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
>>> Dependency type plays a role for estimating costs and latencies between instructions (which affects performance), but using wrong or imprecise dependency type does not affect correctness.
>> On multi-issue architectures it does make a difference. Anti dependence permits the two instructions to be issued during the same cycle whereas true dependency and output dependency would forbid this.
>>
>> Or am I misinterpreting your comment?
> On VLIW-flavoured machines without resource conflict checking -- "yes", it is critical not to use anti dependency where an output or true dependency exist.  This is the case though, only because these machines do not follow sequential semantics for instruction execution (i.e., effects from previous instructions are not necessarily observed by subsequent instructions on the same/close cycles.
>
> On machines with internal resource conflict checking having a wrong type on the dependency should not cause wrong behavior, but "only" suboptimal performance.
>
> Thank you,
>
> --
> Maxim Kuvyrkov
> www.kugelworks.com
>
Earlier in the thread you wrote
> Output dependency is the right type (write after write).  Anti dependency is write after read, and true dependency is read after write.
Should the code be changed to accommodate vliw machines.. It has been 
there since the module was originally checked into trunk.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Dependency confusion in sched-deps
  2013-12-06  8:44       ` shmeel gutl
@ 2013-12-10 21:55         ` Maxim Kuvyrkov
  0 siblings, 0 replies; 8+ messages in thread
From: Maxim Kuvyrkov @ 2013-12-10 21:55 UTC (permalink / raw)
  To: shmeel gutl; +Cc: gcc

On 6/12/2013, at 9:44 pm, shmeel gutl <shmeelgutl@shmuelhome.mine.nu> wrote:

> On 06-Dec-13 01:34 AM, Maxim Kuvyrkov wrote:
>> On 6/12/2013, at 8:44 am, shmeel gutl <shmeelgutl@shmuelhome.mine.nu> wrote:
>> 
>>> On 05-Dec-13 02:39 AM, Maxim Kuvyrkov wrote:
>>>> Dependency type plays a role for estimating costs and latencies between instructions (which affects performance), but using wrong or imprecise dependency type does not affect correctness.
>>> On multi-issue architectures it does make a difference. Anti dependence permits the two instructions to be issued during the same cycle whereas true dependency and output dependency would forbid this.
>>> 
>>> Or am I misinterpreting your comment?
>> On VLIW-flavoured machines without resource conflict checking -- "yes", it is critical not to use anti dependency where an output or true dependency exist.  This is the case though, only because these machines do not follow sequential semantics for instruction execution (i.e., effects from previous instructions are not necessarily observed by subsequent instructions on the same/close cycles.
>> 
>> On machines with internal resource conflict checking having a wrong type on the dependency should not cause wrong behavior, but "only" suboptimal performance.
>> 
>> 
...
> Earlier in the thread you wrote
>> Output dependency is the right type (write after write).  Anti dependency is write after read, and true dependency is read after write.
> Should the code be changed to accommodate vliw machines.. It has been there since the module was originally checked into trunk.

The usual solution for VLIW machines is to have assembler split VLIW bundles that have internal dependencies and execute them on different cycles.  The idea is for compiler to strive to do its best to produce code without any internal dependencies, but it is up to assembler to do the final check and fix any occasional problems.  [A good assembler has to do this work anyway to accommodate for mistakes in hand-written assembly.]

The scheduler is expected to produces code with no internal dependencies for VLIW machines 99% of the time.  This 99% effectiveness is good enough since scheduler is often not the last pass that touches code, and subsequent transformations can screw up VLIW bundles anyway.

--
Maxim Kuvyrkov
www.kugelworks.com


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-12-10 21:55 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-07 12:48 Dependency confusion in sched-deps Paulo Matos
2013-12-05  0:39 ` Maxim Kuvyrkov
2013-12-05 15:25   ` Michael Matz
2013-12-05 23:08     ` Maxim Kuvyrkov
2013-12-05 19:44   ` shmeel gutl
2013-12-05 23:34     ` Maxim Kuvyrkov
2013-12-06  8:44       ` shmeel gutl
2013-12-10 21:55         ` Maxim Kuvyrkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).