public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: Doubts regarding the _Dependent_ptr keyword
       [not found]                       ` <CAJA7tRbj6eoOj_fr+Nnm1rS=kdc0jPxkb77AoPCFrJEaaPsCSA@mail.gmail.com>
@ 2019-06-25 16:20                         ` Akshat Garg
  2019-07-02  0:29                           ` Akshat Garg
  0 siblings, 1 reply; 32+ messages in thread
From: Akshat Garg @ 2019-06-25 16:20 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Paul McKenney, gcc

On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
ramana.gcc@googlemail.com> wrote:

> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com> wrote:
> >
> > As we have some working front-end code for _Dependent_ptr, What should
> we do next? What I understand, we can start adding the library for
> dependent_ptr and its functions for C corresponding to the ones we created
> as C++ template library. Then, after that, we can move on to generating the
> assembly code part.
> >
>
>
> I think the next step is figuring out how to model the Dependent
> pointer information in the IR and figuring out what optimizations to
> allow or not with that information. At this point , I suspect we need
> a plan on record and have the conversation upstream on the lists.
>
> I think we need to put down a plan on record.
>
> Ramana

[CCing gcc mailing list]

So, shall I start looking over the pointer optimizations only and see what
information we may be needed on the same examples in the IR itself?

- Akshat

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-06-25 16:20                         ` Doubts regarding the _Dependent_ptr keyword Akshat Garg
@ 2019-07-02  0:29                           ` Akshat Garg
  2019-07-02  0:59                             ` Paul E. McKenney
  2019-07-02 10:22                             ` Ramana Radhakrishnan
  0 siblings, 2 replies; 32+ messages in thread
From: Akshat Garg @ 2019-07-02  0:29 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Paul McKenney, gcc

On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:

> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> ramana.gcc@googlemail.com> wrote:
>
>> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com> wrote:
>> >
>> > As we have some working front-end code for _Dependent_ptr, What should
>> we do next? What I understand, we can start adding the library for
>> dependent_ptr and its functions for C corresponding to the ones we created
>> as C++ template library. Then, after that, we can move on to generating the
>> assembly code part.
>> >
>>
>>
>> I think the next step is figuring out how to model the Dependent
>> pointer information in the IR and figuring out what optimizations to
>> allow or not with that information. At this point , I suspect we need
>> a plan on record and have the conversation upstream on the lists.
>>
>> I think we need to put down a plan on record.
>>
>> Ramana
>
> [CCing gcc mailing list]
>
> So, shall I start looking over the pointer optimizations only and see what
> information we may be needed on the same examples in the IR itself?
>
> - Akshat
>
I have coded an example where equality comparison kills dependency from the
document P0190R4 as shown below :

1. struct rcutest rt = {1, 2, 3};
2. void thread0 ()
3. {
4.     rt.a = -42;
5.     rt.b = -43;
6.     rt.c = -44;
7.     rcu_assign_pointer(gp, &rt);
8. }
9.
10. void thread1 ()
11. {
12.    int i = -1;
13.    int j = -1;
14.    _Dependent_ptr struct rcutest *p;
15.
16.    p = rcu_dereference(gp);
17.    j = p->a;
18.   if (p == &rt)
19.        i = p->b;  /*Dependency breaking point*/
20.   else if(p)
21.       i = p->c;
22.   assert(i<0);
23.   assert(j<0);
24. }
The gimple unoptimized code produced for lines 17-24 is shown below

1. if (p_16 == &rt)
2.     goto <bb 3>; [INV]
3.   else
4.    goto <bb 4>; [INV]
5.
6.  <bb 3> :
7.  i_19 = p_16->b;
8.  goto <bb 6>; [INV]
9.
10.  <bb 4> :
11.  if (p_16 != 0B)
12.    goto <bb 5>; [INV]
13.  else
14.    goto <bb 6>; [INV]
15.
16.  <bb 5> :
17.  i_18 = p_16->c;
18.
19.  <bb 6> :
20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
21.  _3 = i_7 < 0;
22.  _4 = (int) _3;
23.  assert (_4);
24.  _5 = j_17 < 0;
25.  _6 = (int) _5;
26.  assert (_6);
27.  return;

The optimized code after -O1 is applied for the same lines is hown below :

1. if (_2 == &rt)
2.    goto <bb 3>; [30.00%]
3. else
4.    goto <bb 4>; [70.00%]
5.
6.  <bb 3> [local count: 322122547]:
7.   i_12 = rt.b;
8.   goto <bb 6>; [100.00%]
9.
10.  <bb 4> [local count: 751619277]:
11.   if (_1 != 0)
12.   goto <bb 5>; [50.00%]
13.   else
14.    goto <bb 6>; [50.00%]
15.
16.  <bb 5> [local count: 375809638]:
17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
18.
19.   <bb 6> [local count: 1073741824]:
20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
21.   _3 = i_7 < 0;
22.   _4 = (int) _3;
23.   assert (_4);
24.  _5 = j_10 < 0;
25.  _6 = (int) _5;
26.   assert (_6);
27.   return;

Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7
in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
breaks the dependency chain. We need to figure out the pass that does that
and put some handling code in there for the _dependent_ptr qualified
pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
any other option alone does not produce the optimized code that breaks the
dependency. But applying -O1, i.e., allowing all the optimizations does so.
As passes are applied in a certain order, we need to figure out upto what
passes, the code remains same and after what pass the dependency does not
holds. So, we need to check the translated code after every pass.

Does this sounds like a workable plan for ? Let me know your thoughts. If
this sounds good then, we can do this for all the optimizations that may
kill the dependencies at somepoint.

-Akshat

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02  0:29                           ` Akshat Garg
@ 2019-07-02  0:59                             ` Paul E. McKenney
  2019-07-02  1:42                               ` nick
  2019-07-02 15:36                               ` Jason Merrill
  2019-07-02 10:22                             ` Ramana Radhakrishnan
  1 sibling, 2 replies; 32+ messages in thread
From: Paul E. McKenney @ 2019-07-02  0:59 UTC (permalink / raw)
  To: Akshat Garg; +Cc: Ramana Radhakrishnan, gcc

On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:
> 
> > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> > ramana.gcc@googlemail.com> wrote:
> >
> >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com> wrote:
> >> >
> >> > As we have some working front-end code for _Dependent_ptr, What should
> >> we do next? What I understand, we can start adding the library for
> >> dependent_ptr and its functions for C corresponding to the ones we created
> >> as C++ template library. Then, after that, we can move on to generating the
> >> assembly code part.
> >> >
> >>
> >>
> >> I think the next step is figuring out how to model the Dependent
> >> pointer information in the IR and figuring out what optimizations to
> >> allow or not with that information. At this point , I suspect we need
> >> a plan on record and have the conversation upstream on the lists.
> >>
> >> I think we need to put down a plan on record.
> >>
> >> Ramana
> >
> > [CCing gcc mailing list]
> >
> > So, shall I start looking over the pointer optimizations only and see what
> > information we may be needed on the same examples in the IR itself?
> >
> > - Akshat
> >
> I have coded an example where equality comparison kills dependency from the
> document P0190R4 as shown below :
> 
> 1. struct rcutest rt = {1, 2, 3};
> 2. void thread0 ()
> 3. {
> 4.     rt.a = -42;
> 5.     rt.b = -43;
> 6.     rt.c = -44;
> 7.     rcu_assign_pointer(gp, &rt);
> 8. }
> 9.
> 10. void thread1 ()
> 11. {
> 12.    int i = -1;
> 13.    int j = -1;
> 14.    _Dependent_ptr struct rcutest *p;
> 15.
> 16.    p = rcu_dereference(gp);
> 17.    j = p->a;
> 18.   if (p == &rt)
> 19.        i = p->b;  /*Dependency breaking point*/
> 20.   else if(p)
> 21.       i = p->c;
> 22.   assert(i<0);
> 23.   assert(j<0);
> 24. }
> The gimple unoptimized code produced for lines 17-24 is shown below
> 
> 1. if (p_16 == &rt)
> 2.     goto <bb 3>; [INV]
> 3.   else
> 4.    goto <bb 4>; [INV]
> 5.
> 6.  <bb 3> :
> 7.  i_19 = p_16->b;
> 8.  goto <bb 6>; [INV]
> 9.
> 10.  <bb 4> :
> 11.  if (p_16 != 0B)
> 12.    goto <bb 5>; [INV]
> 13.  else
> 14.    goto <bb 6>; [INV]
> 15.
> 16.  <bb 5> :
> 17.  i_18 = p_16->c;
> 18.
> 19.  <bb 6> :
> 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> 21.  _3 = i_7 < 0;
> 22.  _4 = (int) _3;
> 23.  assert (_4);
> 24.  _5 = j_17 < 0;
> 25.  _6 = (int) _5;
> 26.  assert (_6);
> 27.  return;
> 
> The optimized code after -O1 is applied for the same lines is hown below :
> 
> 1. if (_2 == &rt)
> 2.    goto <bb 3>; [30.00%]
> 3. else
> 4.    goto <bb 4>; [70.00%]
> 5.
> 6.  <bb 3> [local count: 322122547]:
> 7.   i_12 = rt.b;
> 8.   goto <bb 6>; [100.00%]
> 9.
> 10.  <bb 4> [local count: 751619277]:
> 11.   if (_1 != 0)
> 12.   goto <bb 5>; [50.00%]
> 13.   else
> 14.    goto <bb 6>; [50.00%]
> 15.
> 16.  <bb 5> [local count: 375809638]:
> 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> 18.
> 19.   <bb 6> [local count: 1073741824]:
> 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> 21.   _3 = i_7 < 0;
> 22.   _4 = (int) _3;
> 23.   assert (_4);
> 24.  _5 = j_10 < 0;
> 25.  _6 = (int) _5;
> 26.   assert (_6);
> 27.   return;

Good show on tracing this through!

> Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7
> in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> breaks the dependency chain. We need to figure out the pass that does that
> and put some handling code in there for the _dependent_ptr qualified
> pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
> any other option alone does not produce the optimized code that breaks the
> dependency. But applying -O1, i.e., allowing all the optimizations does so.
> As passes are applied in a certain order, we need to figure out upto what
> passes, the code remains same and after what pass the dependency does not
> holds. So, we need to check the translated code after every pass.
> 
> Does this sounds like a workable plan for ? Let me know your thoughts. If
> this sounds good then, we can do this for all the optimizations that may
> kill the dependencies at somepoint.

I don't know of a better plan.

My usual question...  Is there some way to script the checking of the
translated code at the end of each pass?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02  0:59                             ` Paul E. McKenney
@ 2019-07-02  1:42                               ` nick
  2019-07-02 15:36                               ` Jason Merrill
  1 sibling, 0 replies; 32+ messages in thread
From: nick @ 2019-07-02  1:42 UTC (permalink / raw)
  To: paulmck, Akshat Garg; +Cc: Ramana Radhakrishnan, gcc



On 2019-07-01 8:59 p.m., Paul E. McKenney wrote:
> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>> On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:
>>
>>> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>>> ramana.gcc@googlemail.com> wrote:
>>>
>>>> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com> wrote:
>>>>>
>>>>> As we have some working front-end code for _Dependent_ptr, What should
>>>> we do next? What I understand, we can start adding the library for
>>>> dependent_ptr and its functions for C corresponding to the ones we created
>>>> as C++ template library. Then, after that, we can move on to generating the
>>>> assembly code part.
>>>>>
>>>>
>>>>
>>>> I think the next step is figuring out how to model the Dependent
>>>> pointer information in the IR and figuring out what optimizations to
>>>> allow or not with that information. At this point , I suspect we need
>>>> a plan on record and have the conversation upstream on the lists.
>>>>
>>>> I think we need to put down a plan on record.
>>>>
>>>> Ramana
>>>
>>> [CCing gcc mailing list]
>>>
>>> So, shall I start looking over the pointer optimizations only and see what
>>> information we may be needed on the same examples in the IR itself?
>>>
>>> - Akshat
>>>
>> I have coded an example where equality comparison kills dependency from the
>> document P0190R4 as shown below :
>>
>> 1. struct rcutest rt = {1, 2, 3};
>> 2. void thread0 ()
>> 3. {
>> 4.     rt.a = -42;
>> 5.     rt.b = -43;
>> 6.     rt.c = -44;
>> 7.     rcu_assign_pointer(gp, &rt);
>> 8. }
>> 9.
>> 10. void thread1 ()
>> 11. {
>> 12.    int i = -1;
>> 13.    int j = -1;
>> 14.    _Dependent_ptr struct rcutest *p;
>> 15.
>> 16.    p = rcu_dereference(gp);
>> 17.    j = p->a;
>> 18.   if (p == &rt)
>> 19.        i = p->b;  /*Dependency breaking point*/
>> 20.   else if(p)
>> 21.       i = p->c;
>> 22.   assert(i<0);
>> 23.   assert(j<0);
>> 24. }
>> The gimple unoptimized code produced for lines 17-24 is shown below
>>
>> 1. if (p_16 == &rt)
>> 2.     goto <bb 3>; [INV]
>> 3.   else
>> 4.    goto <bb 4>; [INV]
>> 5.
>> 6.  <bb 3> :
>> 7.  i_19 = p_16->b;
>> 8.  goto <bb 6>; [INV]
>> 9.
>> 10.  <bb 4> :
>> 11.  if (p_16 != 0B)
>> 12.    goto <bb 5>; [INV]
>> 13.  else
>> 14.    goto <bb 6>; [INV]
>> 15.
>> 16.  <bb 5> :
>> 17.  i_18 = p_16->c;
>> 18.
>> 19.  <bb 6> :
>> 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>> 21.  _3 = i_7 < 0;
>> 22.  _4 = (int) _3;
>> 23.  assert (_4);
>> 24.  _5 = j_17 < 0;
>> 25.  _6 = (int) _5;
>> 26.  assert (_6);
>> 27.  return;
>>
>> The optimized code after -O1 is applied for the same lines is hown below :
>>
>> 1. if (_2 == &rt)
>> 2.    goto <bb 3>; [30.00%]
>> 3. else
>> 4.    goto <bb 4>; [70.00%]
>> 5.
>> 6.  <bb 3> [local count: 322122547]:
>> 7.   i_12 = rt.b;
>> 8.   goto <bb 6>; [100.00%]
>> 9.
>> 10.  <bb 4> [local count: 751619277]:
>> 11.   if (_1 != 0)
>> 12.   goto <bb 5>; [50.00%]
>> 13.   else
>> 14.    goto <bb 6>; [50.00%]
>> 15.
>> 16.  <bb 5> [local count: 375809638]:
>> 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>> 18.
>> 19.   <bb 6> [local count: 1073741824]:
>> 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>> 21.   _3 = i_7 < 0;
>> 22.   _4 = (int) _3;
>> 23.   assert (_4);
>> 24.  _5 = j_10 < 0;
>> 25.  _6 = (int) _5;
>> 26.   assert (_6);
>> 27.   return;
> 
> Good show on tracing this through!
> 
>> Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7
>> in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
>> breaks the dependency chain. We need to figure out the pass that does that
>> and put some handling code in there for the _dependent_ptr qualified
>> pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
>> any other option alone does not produce the optimized code that breaks the
>> dependency. But applying -O1, i.e., allowing all the optimizations does so.
>> As passes are applied in a certain order, we need to figure out upto what
>> passes, the code remains same and after what pass the dependency does not
>> holds. So, we need to check the translated code after every pass.
>>
>> Does this sounds like a workable plan for ? Let me know your thoughts. If
>> this sounds good then, we can do this for all the optimizations that may
>> kill the dependencies at somepoint.
> 
> I don't know of a better plan.
> 
> My usual question...  Is there some way to script the checking of the
> translated code at the end of each pass?
> 
> 							Thanx, Paul
> 

I don't off the top of my head where the documentation is but writing a gcc 
tool to inspect passes if one doesn't exist is the best way forward. A gcc
tool would be exposed to those internals but not sure if it's easy to do
that in the time frame due to the effort required by you or Akshat.

Cheers,

Nick

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02  0:29                           ` Akshat Garg
  2019-07-02  0:59                             ` Paul E. McKenney
@ 2019-07-02 10:22                             ` Ramana Radhakrishnan
  2019-07-02 10:39                               ` Akshat Garg
  1 sibling, 1 reply; 32+ messages in thread
From: Ramana Radhakrishnan @ 2019-07-02 10:22 UTC (permalink / raw)
  To: Akshat Garg; +Cc: Paul McKenney, gcc mailing list

On Tue, Jul 2, 2019 at 1:29 AM Akshat Garg <xkspr7@gmail.com> wrote:
>
> On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:
>>
>> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <ramana.gcc@googlemail.com> wrote:
>>>
>> [CCing gcc mailing list]
>>
>> So, shall I start looking over the pointer optimizations only and see what information we may be needed on the same examples in the IR itself?
>>
>> - Akshat
>
> I have coded an example where equality comparison kills dependency from the document P0190R4 as shown below :
>
> 1. struct rcutest rt = {1, 2, 3};
> 2. void thread0 ()
> 3. {
> 4.     rt.a = -42;
> 5.     rt.b = -43;
> 6.     rt.c = -44;
> 7.     rcu_assign_pointer(gp, &rt);
> 8. }
> 9.
> 10. void thread1 ()
> 11. {
> 12.    int i = -1;
> 13.    int j = -1;
> 14.    _Dependent_ptr struct rcutest *p;
> 15.
> 16.    p = rcu_dereference(gp);
> 17.    j = p->a;
> 18.   if (p == &rt)
> 19.        i = p->b;  /*Dependency breaking point*/
> 20.   else if(p)
> 21.       i = p->c;
> 22.   assert(i<0);
> 23.   assert(j<0);
> 24. }
> The gimple unoptimized code produced for lines 17-24 is shown below
>
> 1. if (p_16 == &rt)
> 2.     goto <bb 3>; [INV]
> 3.   else
> 4.    goto <bb 4>; [INV]
> 5.
> 6.  <bb 3> :
> 7.  i_19 = p_16->b;
> 8.  goto <bb 6>; [INV]
> 9.
> 10.  <bb 4> :
> 11.  if (p_16 != 0B)
> 12.    goto <bb 5>; [INV]
> 13.  else
> 14.    goto <bb 6>; [INV]
> 15.
> 16.  <bb 5> :
> 17.  i_18 = p_16->c;
> 18.
> 19.  <bb 6> :
> 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> 21.  _3 = i_7 < 0;
> 22.  _4 = (int) _3;
> 23.  assert (_4);
> 24.  _5 = j_17 < 0;
> 25.  _6 = (int) _5;
> 26.  assert (_6);
> 27.  return;
>
> The optimized code after -O1 is applied for the same lines is hown below :
>
> 1. if (_2 == &rt)
> 2.    goto <bb 3>; [30.00%]
> 3. else
> 4.    goto <bb 4>; [70.00%]
> 5.
> 6.  <bb 3> [local count: 322122547]:
> 7.   i_12 = rt.b;
> 8.   goto <bb 6>; [100.00%]
> 9.
> 10.  <bb 4> [local count: 751619277]:
> 11.   if (_1 != 0)
> 12.   goto <bb 5>; [50.00%]
> 13.   else
> 14.    goto <bb 6>; [50.00%]
> 15.
> 16.  <bb 5> [local count: 375809638]:
> 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> 18.
> 19.   <bb 6> [local count: 1073741824]:
> 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> 21.   _3 = i_7 < 0;
> 22.   _4 = (int) _3;
> 23.   assert (_4);
> 24.  _5 = j_10 < 0;
> 25.  _6 = (int) _5;
> 26.   assert (_6);
> 27.   return;
>
> Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7 in unoptimized code to i_12 = rt.b; in line 7 in optimized code which breaks the dependency chain. We need to figure out the pass that does that and put some handling code in there for the _dependent_ptr qualified pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or any other option alone does not produce the optimized code that breaks the dependency. But applying -O1, i.e., allowing all the optimizations does so. As passes are applied in a certain order, we need to figure out upto what passes, the code remains same and after what pass the dependency does not holds. So, we need to check the translated code after every pass.
>

It's worth figuring out what passes are doing this - however the worry
I have is that every pass now needs to be handling this case with
respect to pointer attributes. Is there some place that you are
storing said information and what is the transitive nature of
assignments with these attributes ?

regards
Ramana


> Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
>
> -Akshat

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 10:22                             ` Ramana Radhakrishnan
@ 2019-07-02 10:39                               ` Akshat Garg
  2019-07-02 11:01                                 ` Ramana Radhakrishnan
  0 siblings, 1 reply; 32+ messages in thread
From: Akshat Garg @ 2019-07-02 10:39 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Paul McKenney, gcc

On Tue, 2 Jul, 2019, 3:52 PM Ramana Radhakrishnan, <
ramana.gcc@googlemail.com> wrote:

> On Tue, Jul 2, 2019 at 1:29 AM Akshat Garg <xkspr7@gmail.com> wrote:
> >
> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:
> >>
> >> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> ramana.gcc@googlemail.com> wrote:
> >>>
> >> [CCing gcc mailing list]
> >>
> >> So, shall I start looking over the pointer optimizations only and see
> what information we may be needed on the same examples in the IR itself?
> >>
> >> - Akshat
> >
> > I have coded an example where equality comparison kills dependency from
> the document P0190R4 as shown below :
> >
> > 1. struct rcutest rt = {1, 2, 3};
> > 2. void thread0 ()
> > 3. {
> > 4.     rt.a = -42;
> > 5.     rt.b = -43;
> > 6.     rt.c = -44;
> > 7.     rcu_assign_pointer(gp, &rt);
> > 8. }
> > 9.
> > 10. void thread1 ()
> > 11. {
> > 12.    int i = -1;
> > 13.    int j = -1;
> > 14.    _Dependent_ptr struct rcutest *p;
> > 15.
> > 16.    p = rcu_dereference(gp);
> > 17.    j = p->a;
> > 18.   if (p == &rt)
> > 19.        i = p->b;  /*Dependency breaking point*/
> > 20.   else if(p)
> > 21.       i = p->c;
> > 22.   assert(i<0);
> > 23.   assert(j<0);
> > 24. }
> > The gimple unoptimized code produced for lines 17-24 is shown below
> >
> > 1. if (p_16 == &rt)
> > 2.     goto <bb 3>; [INV]
> > 3.   else
> > 4.    goto <bb 4>; [INV]
> > 5.
> > 6.  <bb 3> :
> > 7.  i_19 = p_16->b;
> > 8.  goto <bb 6>; [INV]
> > 9.
> > 10.  <bb 4> :
> > 11.  if (p_16 != 0B)
> > 12.    goto <bb 5>; [INV]
> > 13.  else
> > 14.    goto <bb 6>; [INV]
> > 15.
> > 16.  <bb 5> :
> > 17.  i_18 = p_16->c;
> > 18.
> > 19.  <bb 6> :
> > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> > 21.  _3 = i_7 < 0;
> > 22.  _4 = (int) _3;
> > 23.  assert (_4);
> > 24.  _5 = j_17 < 0;
> > 25.  _6 = (int) _5;
> > 26.  assert (_6);
> > 27.  return;
> >
> > The optimized code after -O1 is applied for the same lines is hown below
> :
> >
> > 1. if (_2 == &rt)
> > 2.    goto <bb 3>; [30.00%]
> > 3. else
> > 4.    goto <bb 4>; [70.00%]
> > 5.
> > 6.  <bb 3> [local count: 322122547]:
> > 7.   i_12 = rt.b;
> > 8.   goto <bb 6>; [100.00%]
> > 9.
> > 10.  <bb 4> [local count: 751619277]:
> > 11.   if (_1 != 0)
> > 12.   goto <bb 5>; [50.00%]
> > 13.   else
> > 14.    goto <bb 6>; [50.00%]
> > 15.
> > 16.  <bb 5> [local count: 375809638]:
> > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> > 18.
> > 19.   <bb 6> [local count: 1073741824]:
> > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> > 21.   _3 = i_7 < 0;
> > 22.   _4 = (int) _3;
> > 23.   assert (_4);
> > 24.  _5 = j_10 < 0;
> > 25.  _6 = (int) _5;
> > 26.   assert (_6);
> > 27.   return;
> >
> > Statement 19 in the program gets converted from  i_19 = p_16->b; in line
> 7 in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> breaks the dependency chain. We need to figure out the pass that does that
> and put some handling code in there for the _dependent_ptr qualified
> pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
> any other option alone does not produce the optimized code that breaks the
> dependency. But applying -O1, i.e., allowing all the optimizations does so.
> As passes are applied in a certain order, we need to figure out upto what
> passes, the code remains same and after what pass the dependency does not
> holds. So, we need to check the translated code after every pass.
> >
>
> It's worth figuring out what passes are doing this - however the worry
> I have is that every pass now needs to be handling this case with
> respect to pointer attributes. Is there some place that you are
> storing said information and what is the transitive nature of
> assignments with these attributes ?
>
> regards
> Ramana
>
I don't understand what information to store, can you explain? I was
thinking of putting some code inside the pass, which breaks the dependency
chains, which will avoid this type of conversions when it finds out  that
the pointer is _Dependent_ptr qualified otherwise don't do anything. Can
you let me know what I am missing on?

In transitive nature, are you talking about the dependent pointer being
assigned to some non-dependent pointer and after further assigned to some
other pointer?

>
> > Does this sounds like a workable plan for ? Let me know your thoughts.
> If this sounds good then, we can do this for all the optimizations that may
> kill the dependencies at somepoint.
> >
> > -Akshat
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 10:39                               ` Akshat Garg
@ 2019-07-02 11:01                                 ` Ramana Radhakrishnan
  2019-07-02 12:38                                   ` Paul E. McKenney
  0 siblings, 1 reply; 32+ messages in thread
From: Ramana Radhakrishnan @ 2019-07-02 11:01 UTC (permalink / raw)
  To: Akshat Garg; +Cc: Paul McKenney, gcc mailing list

>>
>> It's worth figuring out what passes are doing this - however the worry
>> I have is that every pass now needs to be handling this case with
>> respect to pointer attributes. Is there some place that you are
>> storing said information and what is the transitive nature of
>> assignments with these attributes ?
>>
>> regards
>> Ramana
>
> I don't understand what information to store, can you explain? I was thinking of putting some code inside the pass, which breaks the dependency chains, which will avoid this type of conversions when it finds out  that the pointer is _Dependent_ptr qualified otherwise don't do anything. Can you let me know what I am missing on?
>

That the pointer has an attribute that it's a dependent pointer . How
do you expect the later passes to detect it has a dependent_ptr
attribute attached to it ?

> In transitive nature, are you talking about the dependent pointer being assigned to some non-dependent pointer and after further assigned to some other pointer?


Yes. What's the expected behaviour as per the document ?

Ramana
>>
>>
>> > Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
>> >
>> > -Akshat

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 11:01                                 ` Ramana Radhakrishnan
@ 2019-07-02 12:38                                   ` Paul E. McKenney
  2019-07-02 13:16                                     ` Ramana Radhakrishnan
  0 siblings, 1 reply; 32+ messages in thread
From: Paul E. McKenney @ 2019-07-02 12:38 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Akshat Garg, gcc mailing list

On Tue, Jul 02, 2019 at 12:01:00PM +0100, Ramana Radhakrishnan wrote:
> >>
> >> It's worth figuring out what passes are doing this - however the worry
> >> I have is that every pass now needs to be handling this case with
> >> respect to pointer attributes. Is there some place that you are
> >> storing said information and what is the transitive nature of
> >> assignments with these attributes ?
> >>
> >> regards
> >> Ramana
> >
> > I don't understand what information to store, can you explain? I was thinking of putting some code inside the pass, which breaks the dependency chains, which will avoid this type of conversions when it finds out  that the pointer is _Dependent_ptr qualified otherwise don't do anything. Can you let me know what I am missing on?
> >
> 
> That the pointer has an attribute that it's a dependent pointer . How
> do you expect the later passes to detect it has a dependent_ptr
> attribute attached to it ?
> 
> > In transitive nature, are you talking about the dependent pointer being assigned to some non-dependent pointer and after further assigned to some other pointer?
> 
> Yes. What's the expected behaviour as per the document ?

Once a user-created non-dependent pointer is assigned to, it is OK to
break the dependency.

Or am I missing the point here?

							Thanx, Paul

> Ramana
> >>
> >>
> >> > Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
> >> >
> >> > -Akshat
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 12:38                                   ` Paul E. McKenney
@ 2019-07-02 13:16                                     ` Ramana Radhakrishnan
  2019-07-02 15:09                                       ` Paul E. McKenney
  0 siblings, 1 reply; 32+ messages in thread
From: Ramana Radhakrishnan @ 2019-07-02 13:16 UTC (permalink / raw)
  To: Paul McKenney; +Cc: Akshat Garg, gcc mailing list

On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote:

>
> Once a user-created non-dependent pointer is assigned to, it is OK to
> break the dependency.

Ok, that's good.
>
> Or am I missing the point here?

I was just trying to make sure we were on the same page. I wonder if
marking this volatile would be sufficient for prototyping. I suspect
we would need another flag somewhere which someone with gimple
knowledge might be able to help us with.

regards
Ramana

>
>                                                         Thanx, Paul
>
> > Ramana
> > >>
> > >>
> > >> > Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
> > >> >
> > >> > -Akshat
> >
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 13:16                                     ` Ramana Radhakrishnan
@ 2019-07-02 15:09                                       ` Paul E. McKenney
  2019-07-02 19:09                                         ` Akshat Garg
  0 siblings, 1 reply; 32+ messages in thread
From: Paul E. McKenney @ 2019-07-02 15:09 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: Akshat Garg, gcc mailing list

On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan wrote:
> On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
> 
> >
> > Once a user-created non-dependent pointer is assigned to, it is OK to
> > break the dependency.
> 
> Ok, that's good.
> >
> > Or am I missing the point here?
> 
> I was just trying to make sure we were on the same page. I wonder if
> marking this volatile would be sufficient for prototyping. I suspect
> we would need another flag somewhere which someone with gimple
> knowledge might be able to help us with.

I expect that marking it as volatile would do the trick.  ;-)

							Thanx, Paul

> regards
> Ramana
> 
> >
> >                                                         Thanx, Paul
> >
> > > Ramana
> > > >>
> > > >>
> > > >> > Does this sounds like a workable plan for ? Let me know your thoughts. If this sounds good then, we can do this for all the optimizations that may kill the dependencies at somepoint.
> > > >> >
> > > >> > -Akshat
> > >
> >
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02  0:59                             ` Paul E. McKenney
  2019-07-02  1:42                               ` nick
@ 2019-07-02 15:36                               ` Jason Merrill
  2019-07-02 17:53                                 ` Richard Biener
                                                   ` (2 more replies)
  1 sibling, 3 replies; 32+ messages in thread
From: Jason Merrill @ 2019-07-02 15:36 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Akshat Garg, Ramana Radhakrishnan, gcc Mailing List

On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
>
> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:
> >
> > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> > > ramana.gcc@googlemail.com> wrote:
> > >
> > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com> wrote:
> > >> >
> > >> > As we have some working front-end code for _Dependent_ptr, What should
> > >> we do next? What I understand, we can start adding the library for
> > >> dependent_ptr and its functions for C corresponding to the ones we created
> > >> as C++ template library. Then, after that, we can move on to generating the
> > >> assembly code part.
> > >> >
> > >>
> > >>
> > >> I think the next step is figuring out how to model the Dependent
> > >> pointer information in the IR and figuring out what optimizations to
> > >> allow or not with that information. At this point , I suspect we need
> > >> a plan on record and have the conversation upstream on the lists.
> > >>
> > >> I think we need to put down a plan on record.
> > >>
> > >> Ramana
> > >
> > > [CCing gcc mailing list]
> > >
> > > So, shall I start looking over the pointer optimizations only and see what
> > > information we may be needed on the same examples in the IR itself?
> > >
> > > - Akshat
> > >
> > I have coded an example where equality comparison kills dependency from the
> > document P0190R4 as shown below :
> >
> > 1. struct rcutest rt = {1, 2, 3};
> > 2. void thread0 ()
> > 3. {
> > 4.     rt.a = -42;
> > 5.     rt.b = -43;
> > 6.     rt.c = -44;
> > 7.     rcu_assign_pointer(gp, &rt);
> > 8. }
> > 9.
> > 10. void thread1 ()
> > 11. {
> > 12.    int i = -1;
> > 13.    int j = -1;
> > 14.    _Dependent_ptr struct rcutest *p;
> > 15.
> > 16.    p = rcu_dereference(gp);
> > 17.    j = p->a;
> > 18.   if (p == &rt)
> > 19.        i = p->b;  /*Dependency breaking point*/
> > 20.   else if(p)
> > 21.       i = p->c;
> > 22.   assert(i<0);
> > 23.   assert(j<0);
> > 24. }
> > The gimple unoptimized code produced for lines 17-24 is shown below
> >
> > 1. if (p_16 == &rt)
> > 2.     goto <bb 3>; [INV]
> > 3.   else
> > 4.    goto <bb 4>; [INV]
> > 5.
> > 6.  <bb 3> :
> > 7.  i_19 = p_16->b;
> > 8.  goto <bb 6>; [INV]
> > 9.
> > 10.  <bb 4> :
> > 11.  if (p_16 != 0B)
> > 12.    goto <bb 5>; [INV]
> > 13.  else
> > 14.    goto <bb 6>; [INV]
> > 15.
> > 16.  <bb 5> :
> > 17.  i_18 = p_16->c;
> > 18.
> > 19.  <bb 6> :
> > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> > 21.  _3 = i_7 < 0;
> > 22.  _4 = (int) _3;
> > 23.  assert (_4);
> > 24.  _5 = j_17 < 0;
> > 25.  _6 = (int) _5;
> > 26.  assert (_6);
> > 27.  return;
> >
> > The optimized code after -O1 is applied for the same lines is hown below :
> >
> > 1. if (_2 == &rt)
> > 2.    goto <bb 3>; [30.00%]
> > 3. else
> > 4.    goto <bb 4>; [70.00%]
> > 5.
> > 6.  <bb 3> [local count: 322122547]:
> > 7.   i_12 = rt.b;
> > 8.   goto <bb 6>; [100.00%]
> > 9.
> > 10.  <bb 4> [local count: 751619277]:
> > 11.   if (_1 != 0)
> > 12.   goto <bb 5>; [50.00%]
> > 13.   else
> > 14.    goto <bb 6>; [50.00%]
> > 15.
> > 16.  <bb 5> [local count: 375809638]:
> > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> > 18.
> > 19.   <bb 6> [local count: 1073741824]:
> > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> > 21.   _3 = i_7 < 0;
> > 22.   _4 = (int) _3;
> > 23.   assert (_4);
> > 24.  _5 = j_10 < 0;
> > 25.  _6 = (int) _5;
> > 26.   assert (_6);
> > 27.   return;
>
> Good show on tracing this through!
>
> > Statement 19 in the program gets converted from  i_19 = p_16->b; in line 7
> > in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> > breaks the dependency chain. We need to figure out the pass that does that
> > and put some handling code in there for the _dependent_ptr qualified
> > pointers. Passing simply -fipa-pure-const, -fguess-branch-probability or
> > any other option alone does not produce the optimized code that breaks the
> > dependency. But applying -O1, i.e., allowing all the optimizations does so.
> > As passes are applied in a certain order, we need to figure out up to what
> > passes, the code remains same and after what pass the dependency does not
> > holds. So, we need to check the translated code after every pass.
> >
> > Does this sounds like a workable plan for ? Let me know your thoughts. If
> > this sounds good then, we can do this for all the optimizations that may
> > kill the dependencies at somepoint.
>
> I don't know of a better plan.
>
> My usual question...  Is there some way to script the checking of the
> translated code at the end of each pass?

The usual way to check the output of an optimization pass is by
dumping the intermediate code at that point and matching the dump
against a regexp, as in the tree-ssa directories in the testsuite.
-fdump-tree-all will dump after all the gimple optimization passes,
and you can look through them until you find the breakage.

Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 15:36                               ` Jason Merrill
@ 2019-07-02 17:53                                 ` Richard Biener
  2019-07-03 15:09                                   ` Paul E. McKenney
  2019-07-02 19:37                                 ` Akshat Garg
  2019-07-17 10:54                                 ` Akshat Garg
  2 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2019-07-02 17:53 UTC (permalink / raw)
  To: gcc, Jason Merrill, Paul E. McKenney
  Cc: Akshat Garg, Ramana Radhakrishnan, gcc Mailing List

On July 2, 2019 5:36:08 PM GMT+02:00, Jason Merrill <jason@redhat.com> wrote:
>On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com>
>wrote:
>>
>> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com>
>wrote:
>> >
>> > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>> > > ramana.gcc@googlemail.com> wrote:
>> > >
>> > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
>wrote:
>> > >> >
>> > >> > As we have some working front-end code for _Dependent_ptr,
>What should
>> > >> we do next? What I understand, we can start adding the library
>for
>> > >> dependent_ptr and its functions for C corresponding to the ones
>we created
>> > >> as C++ template library. Then, after that, we can move on to
>generating the
>> > >> assembly code part.
>> > >> >
>> > >>
>> > >>
>> > >> I think the next step is figuring out how to model the Dependent
>> > >> pointer information in the IR and figuring out what
>optimizations to
>> > >> allow or not with that information. At this point , I suspect we
>need
>> > >> a plan on record and have the conversation upstream on the
>lists.
>> > >>
>> > >> I think we need to put down a plan on record.
>> > >>
>> > >> Ramana
>> > >
>> > > [CCing gcc mailing list]
>> > >
>> > > So, shall I start looking over the pointer optimizations only and
>see what
>> > > information we may be needed on the same examples in the IR
>itself?
>> > >
>> > > - Akshat
>> > >
>> > I have coded an example where equality comparison kills dependency
>from the
>> > document P0190R4 as shown below :
>> >
>> > 1. struct rcutest rt = {1, 2, 3};
>> > 2. void thread0 ()
>> > 3. {
>> > 4.     rt.a = -42;
>> > 5.     rt.b = -43;
>> > 6.     rt.c = -44;
>> > 7.     rcu_assign_pointer(gp, &rt);
>> > 8. }
>> > 9.
>> > 10. void thread1 ()
>> > 11. {
>> > 12.    int i = -1;
>> > 13.    int j = -1;
>> > 14.    _Dependent_ptr struct rcutest *p;
>> > 15.
>> > 16.    p = rcu_dereference(gp);
>> > 17.    j = p->a;
>> > 18.   if (p == &rt)
>> > 19.        i = p->b;  /*Dependency breaking point*/
>> > 20.   else if(p)
>> > 21.       i = p->c;
>> > 22.   assert(i<0);
>> > 23.   assert(j<0);
>> > 24. }
>> > The gimple unoptimized code produced for lines 17-24 is shown below
>> >
>> > 1. if (p_16 == &rt)
>> > 2.     goto <bb 3>; [INV]
>> > 3.   else
>> > 4.    goto <bb 4>; [INV]
>> > 5.
>> > 6.  <bb 3> :
>> > 7.  i_19 = p_16->b;
>> > 8.  goto <bb 6>; [INV]
>> > 9.
>> > 10.  <bb 4> :
>> > 11.  if (p_16 != 0B)
>> > 12.    goto <bb 5>; [INV]
>> > 13.  else
>> > 14.    goto <bb 6>; [INV]
>> > 15.
>> > 16.  <bb 5> :
>> > 17.  i_18 = p_16->c;
>> > 18.
>> > 19.  <bb 6> :
>> > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>> > 21.  _3 = i_7 < 0;
>> > 22.  _4 = (int) _3;
>> > 23.  assert (_4);
>> > 24.  _5 = j_17 < 0;
>> > 25.  _6 = (int) _5;
>> > 26.  assert (_6);
>> > 27.  return;
>> >
>> > The optimized code after -O1 is applied for the same lines is hown
>below :
>> >
>> > 1. if (_2 == &rt)
>> > 2.    goto <bb 3>; [30.00%]
>> > 3. else
>> > 4.    goto <bb 4>; [70.00%]
>> > 5.
>> > 6.  <bb 3> [local count: 322122547]:
>> > 7.   i_12 = rt.b;
>> > 8.   goto <bb 6>; [100.00%]
>> > 9.
>> > 10.  <bb 4> [local count: 751619277]:
>> > 11.   if (_1 != 0)
>> > 12.   goto <bb 5>; [50.00%]
>> > 13.   else
>> > 14.    goto <bb 6>; [50.00%]
>> > 15.
>> > 16.  <bb 5> [local count: 375809638]:
>> > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>> > 18.
>> > 19.   <bb 6> [local count: 1073741824]:
>> > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>> > 21.   _3 = i_7 < 0;
>> > 22.   _4 = (int) _3;
>> > 23.   assert (_4);
>> > 24.  _5 = j_10 < 0;
>> > 25.  _6 = (int) _5;
>> > 26.   assert (_6);
>> > 27.   return;
>>
>> Good show on tracing this through!
>>
>> > Statement 19 in the program gets converted from  i_19 = p_16->b; in
>line 7
>> > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
>which
>> > breaks the dependency chain. We need to figure out the pass that
>does that
>> > and put some handling code in there for the _dependent_ptr
>qualified
>> > pointers.

Wtf should this be for?  A type qualifier is certainly not going to work. 

Richard. 


 Passing simply -fipa-pure-const,
>-fguess-branch-probability or
>> > any other option alone does not produce the optimized code that
>breaks the
>> > dependency. But applying -O1, i.e., allowing all the optimizations
>does so.
>> > As passes are applied in a certain order, we need to figure out up
>to what
>> > passes, the code remains same and after what pass the dependency
>does not
>> > holds. So, we need to check the translated code after every pass.
>> >
>> > Does this sounds like a workable plan for ? Let me know your
>thoughts. If
>> > this sounds good then, we can do this for all the optimizations
>that may
>> > kill the dependencies at somepoint.
>>
>> I don't know of a better plan.
>>
>> My usual question...  Is there some way to script the checking of the
>> translated code at the end of each pass?
>
>The usual way to check the output of an optimization pass is by
>dumping the intermediate code at that point and matching the dump
>against a regexp, as in the tree-ssa directories in the testsuite.
>-fdump-tree-all will dump after all the gimple optimization passes,
>and you can look through them until you find the breakage.
>
>Jason

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 15:09                                       ` Paul E. McKenney
@ 2019-07-02 19:09                                         ` Akshat Garg
  2019-07-03 15:16                                           ` Paul E. McKenney
  0 siblings, 1 reply; 32+ messages in thread
From: Akshat Garg @ 2019-07-02 19:09 UTC (permalink / raw)
  To: Paul McKenney; +Cc: Ramana Radhakrishnan, gcc mailing list

On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney <paulmck@linux.ibm.com>
wrote:

> On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan wrote:
> > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney <paulmck@linux.ibm.com>
> wrote:
> >
> > >
> > > Once a user-created non-dependent pointer is assigned to, it is OK to
> > > break the dependency.
> >
> > Ok, that's good.
> > >
> > > Or am I missing the point here?
> >
> > I was just trying to make sure we were on the same page. I wonder if
> > marking this volatile would be sufficient for prototyping. I suspect
> > we would need another flag somewhere which someone with gimple
> > knowledge might be able to help us with.
>
> I expect that marking it as volatile would do the trick.  ;-)
>
>                                                         Thanx, Paul
>
So, marking this pointer as volatile will not allow the compiler to
modify/optimize the statements, the pointer is appearing in. And we don't
need to push any other code inside any of the passes. Due to this, we want
to automatically say those dependent pointers are volatile and introduce a
new flag for this. Am I getting you guys correctly? Kindly, let me know?

Akshat

>
> > regards
> > Ramana
> >
> > >
> > >                                                         Thanx, Paul
> > >
> > > > Ramana
> > > > >>
> > > > >>
> > > > >> > Does this sounds like a workable plan for ? Let me know your
> thoughts. If this sounds good then, we can do this for all the
> optimizations that may kill the dependencies at somepoint.
> > > > >> >
> > > > >> > -Akshat
> > > >
> > >
> >
>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 15:36                               ` Jason Merrill
  2019-07-02 17:53                                 ` Richard Biener
@ 2019-07-02 19:37                                 ` Akshat Garg
  2019-07-17 10:54                                 ` Akshat Garg
  2 siblings, 0 replies; 32+ messages in thread
From: Akshat Garg @ 2019-07-02 19:37 UTC (permalink / raw)
  To: Jason Merrill; +Cc: Paul E. McKenney, Ramana Radhakrishnan, gcc Mailing List

On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <jason@redhat.com> wrote:

> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com>
> wrote:
> >
> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:
> > >
> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> > > > ramana.gcc@googlemail.com> wrote:
> > > >
> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
> wrote:
> > > >> >
> > > >> > As we have some working front-end code for _Dependent_ptr, What
> should
> > > >> we do next? What I understand, we can start adding the library for
> > > >> dependent_ptr and its functions for C corresponding to the ones we
> created
> > > >> as C++ template library. Then, after that, we can move on to
> generating the
> > > >> assembly code part.
> > > >> >
> > > >>
> > > >>
> > > >> I think the next step is figuring out how to model the Dependent
> > > >> pointer information in the IR and figuring out what optimizations to
> > > >> allow or not with that information. At this point , I suspect we
> need
> > > >> a plan on record and have the conversation upstream on the lists.
> > > >>
> > > >> I think we need to put down a plan on record.
> > > >>
> > > >> Ramana
> > > >
> > > > [CCing gcc mailing list]
> > > >
> > > > So, shall I start looking over the pointer optimizations only and
> see what
> > > > information we may be needed on the same examples in the IR itself?
> > > >
> > > > - Akshat
> > > >
> > > I have coded an example where equality comparison kills dependency
> from the
> > > document P0190R4 as shown below :
> > >
> > > 1. struct rcutest rt = {1, 2, 3};
> > > 2. void thread0 ()
> > > 3. {
> > > 4.     rt.a = -42;
> > > 5.     rt.b = -43;
> > > 6.     rt.c = -44;
> > > 7.     rcu_assign_pointer(gp, &rt);
> > > 8. }
> > > 9.
> > > 10. void thread1 ()
> > > 11. {
> > > 12.    int i = -1;
> > > 13.    int j = -1;
> > > 14.    _Dependent_ptr struct rcutest *p;
> > > 15.
> > > 16.    p = rcu_dereference(gp);
> > > 17.    j = p->a;
> > > 18.   if (p == &rt)
> > > 19.        i = p->b;  /*Dependency breaking point*/
> > > 20.   else if(p)
> > > 21.       i = p->c;
> > > 22.   assert(i<0);
> > > 23.   assert(j<0);
> > > 24. }
> > > The gimple unoptimized code produced for lines 17-24 is shown below
> > >
> > > 1. if (p_16 == &rt)
> > > 2.     goto <bb 3>; [INV]
> > > 3.   else
> > > 4.    goto <bb 4>; [INV]
> > > 5.
> > > 6.  <bb 3> :
> > > 7.  i_19 = p_16->b;
> > > 8.  goto <bb 6>; [INV]
> > > 9.
> > > 10.  <bb 4> :
> > > 11.  if (p_16 != 0B)
> > > 12.    goto <bb 5>; [INV]
> > > 13.  else
> > > 14.    goto <bb 6>; [INV]
> > > 15.
> > > 16.  <bb 5> :
> > > 17.  i_18 = p_16->c;
> > > 18.
> > > 19.  <bb 6> :
> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> > > 21.  _3 = i_7 < 0;
> > > 22.  _4 = (int) _3;
> > > 23.  assert (_4);
> > > 24.  _5 = j_17 < 0;
> > > 25.  _6 = (int) _5;
> > > 26.  assert (_6);
> > > 27.  return;
> > >
> > > The optimized code after -O1 is applied for the same lines is hown
> below :
> > >
> > > 1. if (_2 == &rt)
> > > 2.    goto <bb 3>; [30.00%]
> > > 3. else
> > > 4.    goto <bb 4>; [70.00%]
> > > 5.
> > > 6.  <bb 3> [local count: 322122547]:
> > > 7.   i_12 = rt.b;
> > > 8.   goto <bb 6>; [100.00%]
> > > 9.
> > > 10.  <bb 4> [local count: 751619277]:
> > > 11.   if (_1 != 0)
> > > 12.   goto <bb 5>; [50.00%]
> > > 13.   else
> > > 14.    goto <bb 6>; [50.00%]
> > > 15.
> > > 16.  <bb 5> [local count: 375809638]:
> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> > > 18.
> > > 19.   <bb 6> [local count: 1073741824]:
> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> > > 21.   _3 = i_7 < 0;
> > > 22.   _4 = (int) _3;
> > > 23.   assert (_4);
> > > 24.  _5 = j_10 < 0;
> > > 25.  _6 = (int) _5;
> > > 26.   assert (_6);
> > > 27.   return;
> >
> > Good show on tracing this through!
> >
> > > Statement 19 in the program gets converted from  i_19 = p_16->b; in
> line 7
> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> > > breaks the dependency chain. We need to figure out the pass that does
> that
> > > and put some handling code in there for the _dependent_ptr qualified
> > > pointers. Passing simply -fipa-pure-const, -fguess-branch-probability
> or
> > > any other option alone does not produce the optimized code that breaks
> the
> > > dependency. But applying -O1, i.e., allowing all the optimizations
> does so.
> > > As passes are applied in a certain order, we need to figure out up to
> what
> > > passes, the code remains same and after what pass the dependency does
> not
> > > holds. So, we need to check the translated code after every pass.
> > >
> > > Does this sounds like a workable plan for ? Let me know your thoughts.
> If
> > > this sounds good then, we can do this for all the optimizations that
> may
> > > kill the dependencies at somepoint.
> >
> > I don't know of a better plan.
> >
> > My usual question...  Is there some way to script the checking of the
> > translated code at the end of each pass?
>
> The usual way to check the output of an optimization pass is by
> dumping the intermediate code at that point and matching the dump
> against a regexp, as in the tree-ssa directories in the testsuite.
> -fdump-tree-all will dump after all the gimple optimization passes,
> and you can look through them until you find the breakage.
>
> Jason
>
I tried out your method. Up to the temp.c.114t.sra, the line 19 from
program is
  i_12 = MEM[(dependent_ptr struct rcutest *)_2].b;
After this in the next file, temp.c.116t.dom2, line 19 becomes
i_12 = rt.b;
Then, I believe, we need to check the pass doing this, but this will take a
lot of effort I think to handle dependent_ptr in a pass and after that
passes. Although, the approach of introducing new flag seems an easy
workaround to me. Anyway thanks for your time.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 17:53                                 ` Richard Biener
@ 2019-07-03 15:09                                   ` Paul E. McKenney
  0 siblings, 0 replies; 32+ messages in thread
From: Paul E. McKenney @ 2019-07-03 15:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc, Jason Merrill, Akshat Garg, Ramana Radhakrishnan

On Tue, Jul 02, 2019 at 07:53:20PM +0200, Richard Biener wrote:
> On July 2, 2019 5:36:08 PM GMT+02:00, Jason Merrill <jason@redhat.com> wrote:
> >On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com>
> >wrote:
> >>
> >> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> >> > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com>
> >wrote:
> >> >
> >> > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> >> > > ramana.gcc@googlemail.com> wrote:
> >> > >
> >> > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
> >wrote:
> >> > >> >
> >> > >> > As we have some working front-end code for _Dependent_ptr,
> >What should
> >> > >> we do next? What I understand, we can start adding the library
> >for
> >> > >> dependent_ptr and its functions for C corresponding to the ones
> >we created
> >> > >> as C++ template library. Then, after that, we can move on to
> >generating the
> >> > >> assembly code part.
> >> > >> >
> >> > >>
> >> > >>
> >> > >> I think the next step is figuring out how to model the Dependent
> >> > >> pointer information in the IR and figuring out what
> >optimizations to
> >> > >> allow or not with that information. At this point , I suspect we
> >need
> >> > >> a plan on record and have the conversation upstream on the
> >lists.
> >> > >>
> >> > >> I think we need to put down a plan on record.
> >> > >>
> >> > >> Ramana
> >> > >
> >> > > [CCing gcc mailing list]
> >> > >
> >> > > So, shall I start looking over the pointer optimizations only and
> >see what
> >> > > information we may be needed on the same examples in the IR
> >itself?
> >> > >
> >> > > - Akshat
> >> > >
> >> > I have coded an example where equality comparison kills dependency
> >from the
> >> > document P0190R4 as shown below :
> >> >
> >> > 1. struct rcutest rt = {1, 2, 3};
> >> > 2. void thread0 ()
> >> > 3. {
> >> > 4.     rt.a = -42;
> >> > 5.     rt.b = -43;
> >> > 6.     rt.c = -44;
> >> > 7.     rcu_assign_pointer(gp, &rt);
> >> > 8. }
> >> > 9.
> >> > 10. void thread1 ()
> >> > 11. {
> >> > 12.    int i = -1;
> >> > 13.    int j = -1;
> >> > 14.    _Dependent_ptr struct rcutest *p;
> >> > 15.
> >> > 16.    p = rcu_dereference(gp);
> >> > 17.    j = p->a;
> >> > 18.   if (p == &rt)
> >> > 19.        i = p->b;  /*Dependency breaking point*/
> >> > 20.   else if(p)
> >> > 21.       i = p->c;
> >> > 22.   assert(i<0);
> >> > 23.   assert(j<0);
> >> > 24. }
> >> > The gimple unoptimized code produced for lines 17-24 is shown below
> >> >
> >> > 1. if (p_16 == &rt)
> >> > 2.     goto <bb 3>; [INV]
> >> > 3.   else
> >> > 4.    goto <bb 4>; [INV]
> >> > 5.
> >> > 6.  <bb 3> :
> >> > 7.  i_19 = p_16->b;
> >> > 8.  goto <bb 6>; [INV]
> >> > 9.
> >> > 10.  <bb 4> :
> >> > 11.  if (p_16 != 0B)
> >> > 12.    goto <bb 5>; [INV]
> >> > 13.  else
> >> > 14.    goto <bb 6>; [INV]
> >> > 15.
> >> > 16.  <bb 5> :
> >> > 17.  i_18 = p_16->c;
> >> > 18.
> >> > 19.  <bb 6> :
> >> > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> >> > 21.  _3 = i_7 < 0;
> >> > 22.  _4 = (int) _3;
> >> > 23.  assert (_4);
> >> > 24.  _5 = j_17 < 0;
> >> > 25.  _6 = (int) _5;
> >> > 26.  assert (_6);
> >> > 27.  return;
> >> >
> >> > The optimized code after -O1 is applied for the same lines is hown
> >below :
> >> >
> >> > 1. if (_2 == &rt)
> >> > 2.    goto <bb 3>; [30.00%]
> >> > 3. else
> >> > 4.    goto <bb 4>; [70.00%]
> >> > 5.
> >> > 6.  <bb 3> [local count: 322122547]:
> >> > 7.   i_12 = rt.b;
> >> > 8.   goto <bb 6>; [100.00%]
> >> > 9.
> >> > 10.  <bb 4> [local count: 751619277]:
> >> > 11.   if (_1 != 0)
> >> > 12.   goto <bb 5>; [50.00%]
> >> > 13.   else
> >> > 14.    goto <bb 6>; [50.00%]
> >> > 15.
> >> > 16.  <bb 5> [local count: 375809638]:
> >> > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> >> > 18.
> >> > 19.   <bb 6> [local count: 1073741824]:
> >> > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> >> > 21.   _3 = i_7 < 0;
> >> > 22.   _4 = (int) _3;
> >> > 23.   assert (_4);
> >> > 24.  _5 = j_10 < 0;
> >> > 25.  _6 = (int) _5;
> >> > 26.   assert (_6);
> >> > 27.   return;
> >>
> >> Good show on tracing this through!
> >>
> >> > Statement 19 in the program gets converted from  i_19 = p_16->b; in
> >line 7
> >> > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
> >which
> >> > breaks the dependency chain. We need to figure out the pass that
> >does that
> >> > and put some handling code in there for the _dependent_ptr
> >qualified
> >> > pointers.
> 
> Wtf should this be for?  A type qualifier is certainly not going to work. 

I might be wrong, but I don't believe that Akshat is claiming that this
is already a complete solution.

But please tell us more.  Given what Akshat is trying to do, what else
is missing or otherwise in need of fixing?

							Thanx, Paul

> Richard. 
> 
> 
>  Passing simply -fipa-pure-const,
> >-fguess-branch-probability or
> >> > any other option alone does not produce the optimized code that
> >breaks the
> >> > dependency. But applying -O1, i.e., allowing all the optimizations
> >does so.
> >> > As passes are applied in a certain order, we need to figure out up
> >to what
> >> > passes, the code remains same and after what pass the dependency
> >does not
> >> > holds. So, we need to check the translated code after every pass.
> >> >
> >> > Does this sounds like a workable plan for ? Let me know your
> >thoughts. If
> >> > this sounds good then, we can do this for all the optimizations
> >that may
> >> > kill the dependencies at somepoint.
> >>
> >> I don't know of a better plan.
> >>
> >> My usual question...  Is there some way to script the checking of the
> >> translated code at the end of each pass?
> >
> >The usual way to check the output of an optimization pass is by
> >dumping the intermediate code at that point and matching the dump
> >against a regexp, as in the tree-ssa directories in the testsuite.
> >-fdump-tree-all will dump after all the gimple optimization passes,
> >and you can look through them until you find the breakage.
> >
> >Jason
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 19:09                                         ` Akshat Garg
@ 2019-07-03 15:16                                           ` Paul E. McKenney
  2019-07-03 15:48                                             ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Paul E. McKenney @ 2019-07-03 15:16 UTC (permalink / raw)
  To: Akshat Garg; +Cc: Ramana Radhakrishnan, gcc mailing list

On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney <paulmck@linux.ibm.com>
> wrote:
> 
> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan wrote:
> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney <paulmck@linux.ibm.com>
> > wrote:
> > >
> > > >
> > > > Once a user-created non-dependent pointer is assigned to, it is OK to
> > > > break the dependency.
> > >
> > > Ok, that's good.
> > > >
> > > > Or am I missing the point here?
> > >
> > > I was just trying to make sure we were on the same page. I wonder if
> > > marking this volatile would be sufficient for prototyping. I suspect
> > > we would need another flag somewhere which someone with gimple
> > > knowledge might be able to help us with.
> >
> > I expect that marking it as volatile would do the trick.  ;-)
> >
> >                                                         Thanx, Paul
> >
> So, marking this pointer as volatile will not allow the compiler to
> modify/optimize the statements, the pointer is appearing in. And we don't
> need to push any other code inside any of the passes. Due to this, we want
> to automatically say those dependent pointers are volatile and introduce a
> new flag for this. Am I getting you guys correctly? Kindly, let me know?

While I suspect that this might work, it would suppress way more
optimizations than would be good.  For but one example, consider:

	_Dependent_ptr int *p;

	p = atomic_load_explicit(gp, memory_order_consume);
	a = p->a;
	b = p->b;

If "p" is volatile, then the compiler will be prevented from keeping
it in a register, which would not make people coding fastpaths at
all happy.  ;-)

Still, use of volatile might be a good technique for prototyping and
analysis of _Dependent_ptr.

							Thanx, Paul

> Akshat
> 
> >
> > > regards
> > > Ramana
> > >
> > > >
> > > >                                                         Thanx, Paul
> > > >
> > > > > Ramana
> > > > > >>
> > > > > >>
> > > > > >> > Does this sounds like a workable plan for ? Let me know your
> > thoughts. If this sounds good then, we can do this for all the
> > optimizations that may kill the dependencies at somepoint.
> > > > > >> >
> > > > > >> > -Akshat
> > > > >
> > > >
> > >
> >
> >

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-03 15:16                                           ` Paul E. McKenney
@ 2019-07-03 15:48                                             ` Richard Biener
  2019-07-03 16:34                                               ` Paul E. McKenney
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2019-07-03 15:48 UTC (permalink / raw)
  To: paulmck, Paul E. McKenney, Akshat Garg
  Cc: Ramana Radhakrishnan, gcc mailing list

On July 3, 2019 5:14:58 PM GMT+02:00, "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
>On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
>> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney
><paulmck@linux.ibm.com>
>> wrote:
>> 
>> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan
>wrote:
>> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney
><paulmck@linux.ibm.com>
>> > wrote:
>> > >
>> > > >
>> > > > Once a user-created non-dependent pointer is assigned to, it is
>OK to
>> > > > break the dependency.
>> > >
>> > > Ok, that's good.
>> > > >
>> > > > Or am I missing the point here?
>> > >
>> > > I was just trying to make sure we were on the same page. I wonder
>if
>> > > marking this volatile would be sufficient for prototyping. I
>suspect
>> > > we would need another flag somewhere which someone with gimple
>> > > knowledge might be able to help us with.
>> >
>> > I expect that marking it as volatile would do the trick.  ;-)
>> >
>> >                                                         Thanx, Paul
>> >
>> So, marking this pointer as volatile will not allow the compiler to
>> modify/optimize the statements, the pointer is appearing in. And we
>don't
>> need to push any other code inside any of the passes. Due to this, we
>want
>> to automatically say those dependent pointers are volatile and
>introduce a
>> new flag for this. Am I getting you guys correctly? Kindly, let me
>know?
>
>While I suspect that this might work, it would suppress way more
>optimizations than would be good.  For but one example, consider:
>
>	_Dependent_ptr int *p;
>
>	p = atomic_load_explicit(gp, memory_order_consume);
>	a = p->a;
>	b = p->b;
>
>If "p" is volatile, then the compiler will be prevented from keeping
>it in a register, which would not make people coding fastpaths at
>all happy.  ;-)
>
>Still, use of volatile might be a good technique for prototyping and
>analysis of _Dependent_ptr.

With this example can you quickly summarize what kind of guarantees _Dependent_ptr gives and how a compiler
Could possibly break those? 

Richard. 

>
>							Thanx, Paul
>
>> Akshat
>> 
>> >
>> > > regards
>> > > Ramana
>> > >
>> > > >
>> > > >                                                         Thanx,
>Paul
>> > > >
>> > > > > Ramana
>> > > > > >>
>> > > > > >>
>> > > > > >> > Does this sounds like a workable plan for ? Let me know
>your
>> > thoughts. If this sounds good then, we can do this for all the
>> > optimizations that may kill the dependencies at somepoint.
>> > > > > >> >
>> > > > > >> > -Akshat
>> > > > >
>> > > >
>> > >
>> >
>> >

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-03 15:48                                             ` Richard Biener
@ 2019-07-03 16:34                                               ` Paul E. McKenney
  2019-07-04 11:00                                                 ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Paul E. McKenney @ 2019-07-03 16:34 UTC (permalink / raw)
  To: Richard Biener; +Cc: Akshat Garg, Ramana Radhakrishnan, gcc mailing list

On Wed, Jul 03, 2019 at 05:47:56PM +0200, Richard Biener wrote:
> On July 3, 2019 5:14:58 PM GMT+02:00, "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> >On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
> >> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney
> ><paulmck@linux.ibm.com>
> >> wrote:
> >> 
> >> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan
> >wrote:
> >> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney
> ><paulmck@linux.ibm.com>
> >> > wrote:
> >> > >
> >> > > >
> >> > > > Once a user-created non-dependent pointer is assigned to, it is
> >OK to
> >> > > > break the dependency.
> >> > >
> >> > > Ok, that's good.
> >> > > >
> >> > > > Or am I missing the point here?
> >> > >
> >> > > I was just trying to make sure we were on the same page. I wonder
> >if
> >> > > marking this volatile would be sufficient for prototyping. I
> >suspect
> >> > > we would need another flag somewhere which someone with gimple
> >> > > knowledge might be able to help us with.
> >> >
> >> > I expect that marking it as volatile would do the trick.  ;-)
> >> >
> >> >                                                         Thanx, Paul
> >> >
> >> So, marking this pointer as volatile will not allow the compiler to
> >> modify/optimize the statements, the pointer is appearing in. And we
> >don't
> >> need to push any other code inside any of the passes. Due to this, we
> >want
> >> to automatically say those dependent pointers are volatile and
> >introduce a
> >> new flag for this. Am I getting you guys correctly? Kindly, let me
> >know?
> >
> >While I suspect that this might work, it would suppress way more
> >optimizations than would be good.  For but one example, consider:
> >
> >	_Dependent_ptr int *p;
> >
> >	p = atomic_load_explicit(gp, memory_order_consume);
> >	a = p->a;
> >	b = p->b;
> >
> >If "p" is volatile, then the compiler will be prevented from keeping
> >it in a register, which would not make people coding fastpaths at
> >all happy.  ;-)
> >
> >Still, use of volatile might be a good technique for prototyping and
> >analysis of _Dependent_ptr.
> 
> With this example can you quickly summarize what kind of guarantees _Dependent_ptr gives and how a compiler
> Could possibly break those? 

First I suppose I should fix the bug in the above code.  Or one of the
bugs, at least.  :-/

	struct foo {
		int a;
		int b;
	};

	_Dependent_ptr struct foo *p;

	p = atomic_load_explicit(gp, memory_order_consume);
	a = p->a;
	b = p->b;

And then let me tweak the example a bit.  For the first tweak:

	struct foo {
		int a;
		int b;
	};

	struct foo default_foo = { .a = 42, .b = 43 };
	int *gp = &default_foo;

	...

	_Dependent_ptr int *p;

	p = atomic_load_explicit(gp, memory_order_consume);
	a = p->a;
	b = p->b;

Suppose that the compiler used feedback-driven optimization, and noticed
that the value of gp was almost always &default_foo.  The compiler might
decide to transform the last three lines as follows:

	p = atomic_load_explicit(gp, memory_order_consume);
	if (p == &default_foo) {
		a = default_foo.a;
		b = default_foo.b;
	} else {
		a = p->a;
		b = p->b;
	}

Now, as long as the value of gp had remained &default_foo for the full
duration of execution, no harm done.  But suppose the following code
was executing concurrently with the above transformed code:

	struct foo *q;

	q = malloc(sizeof(*q));
	assert(q);
	q->a = 1729;
	q->b = 1730;
	atomic_store_explicit(gp, q, memory_order_release);
	do_something();
	default_foo.a = 1;
	default_foo.b = 2;
	atomic_store_explicit(gp, &default_foo, memory_order_release);

In this case, if the memory_order_consume() came just after the pointer
was reset to &default_foo, it is possible that the transformed code
would set "a" to 42 and "b" to 43, which might not be what the guy
writing the code wanted to happen.

One of the purposes of _Dependent_ptr is to prevent this transformation.

This transformation can also happen if the developer's code contained a
comparison to &default_foo -- an ARM or PowerPC compiler backend, upon
seeing two pointers containing the same bits, would likely consider the
two pointers as being interchangeable, and thus might do the dereferences
using the copy that was not tagged with the hardware dependencies.

There are quite a few other examples.  The C++ standards committee
working papers shown below go through a number of them, in case the
above example is not convincing.  Or you could tell me what you would
like to see, and I would attempt to find/create a suitable example.

Does that help, or am I missing your point?

							Thanx, Paul

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0098r0.pdf
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0462r1.pdf

> Richard. 
> 
> >
> >							Thanx, Paul
> >
> >> Akshat
> >> 
> >> >
> >> > > regards
> >> > > Ramana
> >> > >
> >> > > >
> >> > > >                                                         Thanx,
> >Paul
> >> > > >
> >> > > > > Ramana
> >> > > > > >>
> >> > > > > >>
> >> > > > > >> > Does this sounds like a workable plan for ? Let me know
> >your
> >> > thoughts. If this sounds good then, we can do this for all the
> >> > optimizations that may kill the dependencies at somepoint.
> >> > > > > >> >
> >> > > > > >> > -Akshat
> >> > > > >
> >> > > >
> >> > >
> >> >
> >> >
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-03 16:34                                               ` Paul E. McKenney
@ 2019-07-04 11:00                                                 ` Richard Biener
  2019-07-04 17:40                                                   ` Paul E. McKenney
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2019-07-04 11:00 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Akshat Garg, Ramana Radhakrishnan, gcc mailing list

On Wed, Jul 3, 2019 at 6:33 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
>
> On Wed, Jul 03, 2019 at 05:47:56PM +0200, Richard Biener wrote:
> > On July 3, 2019 5:14:58 PM GMT+02:00, "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> > >On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
> > >> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney
> > ><paulmck@linux.ibm.com>
> > >> wrote:
> > >>
> > >> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan
> > >wrote:
> > >> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney
> > ><paulmck@linux.ibm.com>
> > >> > wrote:
> > >> > >
> > >> > > >
> > >> > > > Once a user-created non-dependent pointer is assigned to, it is
> > >OK to
> > >> > > > break the dependency.
> > >> > >
> > >> > > Ok, that's good.
> > >> > > >
> > >> > > > Or am I missing the point here?
> > >> > >
> > >> > > I was just trying to make sure we were on the same page. I wonder
> > >if
> > >> > > marking this volatile would be sufficient for prototyping. I
> > >suspect
> > >> > > we would need another flag somewhere which someone with gimple
> > >> > > knowledge might be able to help us with.
> > >> >
> > >> > I expect that marking it as volatile would do the trick.  ;-)
> > >> >
> > >> >                                                         Thanx, Paul
> > >> >
> > >> So, marking this pointer as volatile will not allow the compiler to
> > >> modify/optimize the statements, the pointer is appearing in. And we
> > >don't
> > >> need to push any other code inside any of the passes. Due to this, we
> > >want
> > >> to automatically say those dependent pointers are volatile and
> > >introduce a
> > >> new flag for this. Am I getting you guys correctly? Kindly, let me
> > >know?
> > >
> > >While I suspect that this might work, it would suppress way more
> > >optimizations than would be good.  For but one example, consider:
> > >
> > >     _Dependent_ptr int *p;
> > >
> > >     p = atomic_load_explicit(gp, memory_order_consume);
> > >     a = p->a;
> > >     b = p->b;
> > >
> > >If "p" is volatile, then the compiler will be prevented from keeping
> > >it in a register, which would not make people coding fastpaths at
> > >all happy.  ;-)
> > >
> > >Still, use of volatile might be a good technique for prototyping and
> > >analysis of _Dependent_ptr.
> >
> > With this example can you quickly summarize what kind of guarantees _Dependent_ptr gives and how a compiler
> > Could possibly break those?
>
> First I suppose I should fix the bug in the above code.  Or one of the
> bugs, at least.  :-/
>
>         struct foo {
>                 int a;
>                 int b;
>         };
>
>         _Dependent_ptr struct foo *p;
>
>         p = atomic_load_explicit(gp, memory_order_consume);
>         a = p->a;
>         b = p->b;
>
> And then let me tweak the example a bit.  For the first tweak:
>
>         struct foo {
>                 int a;
>                 int b;
>         };
>
>         struct foo default_foo = { .a = 42, .b = 43 };
>         int *gp = &default_foo;
>
>         ...
>
>         _Dependent_ptr int *p;
>
>         p = atomic_load_explicit(gp, memory_order_consume);
>         a = p->a;
>         b = p->b;
>
> Suppose that the compiler used feedback-driven optimization, and noticed
> that the value of gp was almost always &default_foo.  The compiler might
> decide to transform the last three lines as follows:
>
>         p = atomic_load_explicit(gp, memory_order_consume);
>         if (p == &default_foo) {
>                 a = default_foo.a;
>                 b = default_foo.b;
>         } else {
>                 a = p->a;
>                 b = p->b;
>         }
>
> Now, as long as the value of gp had remained &default_foo for the full
> duration of execution, no harm done.  But suppose the following code
> was executing concurrently with the above transformed code:
>
>         struct foo *q;
>
>         q = malloc(sizeof(*q));
>         assert(q);
>         q->a = 1729;
>         q->b = 1730;
>         atomic_store_explicit(gp, q, memory_order_release);
>         do_something();
>         default_foo.a = 1;
>         default_foo.b = 2;
>         atomic_store_explicit(gp, &default_foo, memory_order_release);
>
> In this case, if the memory_order_consume() came just after the pointer
> was reset to &default_foo, it is possible that the transformed code
> would set "a" to 42 and "b" to 43, which might not be what the guy
> writing the code wanted to happen.
>
> One of the purposes of _Dependent_ptr is to prevent this transformation.
>
> This transformation can also happen if the developer's code contained a
> comparison to &default_foo -- an ARM or PowerPC compiler backend, upon
> seeing two pointers containing the same bits, would likely consider the
> two pointers as being interchangeable, and thus might do the dereferences
> using the copy that was not tagged with the hardware dependencies.
>
> There are quite a few other examples.  The C++ standards committee
> working papers shown below go through a number of them, in case the
> above example is not convincing.  Or you could tell me what you would
> like to see, and I would attempt to find/create a suitable example.
>
> Does that help, or am I missing your point?

Browsed through that resources, so yes.  So the important thing
is that data dependences in the source are not to be removed or
replaced by control dependences because that enables out-of-order
execution on the CPU (or by the compiler).  This is only needed
for data typed with the _Dependent_ptr qualifier.

I think fully guaranteeing this is hard (besides when you use
volatile), we have the very same issue when dealing with
pointer provenance rules, known for years and not fixed
(and I don't see a good way to fix these issues without
sacrifying performance everywhere).

Good luck ;)

Maybe performance isn't so much of an issue for _Dependent_ptr
(compared to when all pointers are affected).

Richard.

>
>                                                         Thanx, Paul
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0098r0.pdf
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0462r1.pdf
>
> > Richard.
> >
> > >
> > >                                                     Thanx, Paul
> > >
> > >> Akshat
> > >>
> > >> >
> > >> > > regards
> > >> > > Ramana
> > >> > >
> > >> > > >
> > >> > > >                                                         Thanx,
> > >Paul
> > >> > > >
> > >> > > > > Ramana
> > >> > > > > >>
> > >> > > > > >>
> > >> > > > > >> > Does this sounds like a workable plan for ? Let me know
> > >your
> > >> > thoughts. If this sounds good then, we can do this for all the
> > >> > optimizations that may kill the dependencies at somepoint.
> > >> > > > > >> >
> > >> > > > > >> > -Akshat
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >> >
> >
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-04 11:00                                                 ` Richard Biener
@ 2019-07-04 17:40                                                   ` Paul E. McKenney
  2019-07-04 18:09                                                     ` Jakub Jelinek
  0 siblings, 1 reply; 32+ messages in thread
From: Paul E. McKenney @ 2019-07-04 17:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: Akshat Garg, Ramana Radhakrishnan, gcc mailing list

On Thu, Jul 04, 2019 at 01:00:18PM +0200, Richard Biener wrote:
> On Wed, Jul 3, 2019 at 6:33 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
> >
> > On Wed, Jul 03, 2019 at 05:47:56PM +0200, Richard Biener wrote:
> > > On July 3, 2019 5:14:58 PM GMT+02:00, "Paul E. McKenney" <paulmck@linux.ibm.com> wrote:
> > > >On Wed, Jul 03, 2019 at 12:39:41AM +0530, Akshat Garg wrote:
> > > >> On Tue, Jul 2, 2019 at 8:40 PM Paul E. McKenney
> > > ><paulmck@linux.ibm.com>
> > > >> wrote:
> > > >>
> > > >> > On Tue, Jul 02, 2019 at 02:15:55PM +0100, Ramana Radhakrishnan
> > > >wrote:
> > > >> > > On Tue, Jul 2, 2019 at 1:38 PM Paul E. McKenney
> > > ><paulmck@linux.ibm.com>
> > > >> > wrote:
> > > >> > >
> > > >> > > >
> > > >> > > > Once a user-created non-dependent pointer is assigned to, it is
> > > >OK to
> > > >> > > > break the dependency.
> > > >> > >
> > > >> > > Ok, that's good.
> > > >> > > >
> > > >> > > > Or am I missing the point here?
> > > >> > >
> > > >> > > I was just trying to make sure we were on the same page. I wonder
> > > >if
> > > >> > > marking this volatile would be sufficient for prototyping. I
> > > >suspect
> > > >> > > we would need another flag somewhere which someone with gimple
> > > >> > > knowledge might be able to help us with.
> > > >> >
> > > >> > I expect that marking it as volatile would do the trick.  ;-)
> > > >> >
> > > >> >                                                         Thanx, Paul
> > > >> >
> > > >> So, marking this pointer as volatile will not allow the compiler to
> > > >> modify/optimize the statements, the pointer is appearing in. And we
> > > >don't
> > > >> need to push any other code inside any of the passes. Due to this, we
> > > >want
> > > >> to automatically say those dependent pointers are volatile and
> > > >introduce a
> > > >> new flag for this. Am I getting you guys correctly? Kindly, let me
> > > >know?
> > > >
> > > >While I suspect that this might work, it would suppress way more
> > > >optimizations than would be good.  For but one example, consider:
> > > >
> > > >     _Dependent_ptr int *p;
> > > >
> > > >     p = atomic_load_explicit(gp, memory_order_consume);
> > > >     a = p->a;
> > > >     b = p->b;
> > > >
> > > >If "p" is volatile, then the compiler will be prevented from keeping
> > > >it in a register, which would not make people coding fastpaths at
> > > >all happy.  ;-)
> > > >
> > > >Still, use of volatile might be a good technique for prototyping and
> > > >analysis of _Dependent_ptr.
> > >
> > > With this example can you quickly summarize what kind of guarantees _Dependent_ptr gives and how a compiler
> > > Could possibly break those?
> >
> > First I suppose I should fix the bug in the above code.  Or one of the
> > bugs, at least.  :-/
> >
> >         struct foo {
> >                 int a;
> >                 int b;
> >         };
> >
> >         _Dependent_ptr struct foo *p;
> >
> >         p = atomic_load_explicit(gp, memory_order_consume);
> >         a = p->a;
> >         b = p->b;
> >
> > And then let me tweak the example a bit.  For the first tweak:
> >
> >         struct foo {
> >                 int a;
> >                 int b;
> >         };
> >
> >         struct foo default_foo = { .a = 42, .b = 43 };
> >         int *gp = &default_foo;
> >
> >         ...
> >
> >         _Dependent_ptr int *p;
> >
> >         p = atomic_load_explicit(gp, memory_order_consume);
> >         a = p->a;
> >         b = p->b;
> >
> > Suppose that the compiler used feedback-driven optimization, and noticed
> > that the value of gp was almost always &default_foo.  The compiler might
> > decide to transform the last three lines as follows:
> >
> >         p = atomic_load_explicit(gp, memory_order_consume);
> >         if (p == &default_foo) {
> >                 a = default_foo.a;
> >                 b = default_foo.b;
> >         } else {
> >                 a = p->a;
> >                 b = p->b;
> >         }
> >
> > Now, as long as the value of gp had remained &default_foo for the full
> > duration of execution, no harm done.  But suppose the following code
> > was executing concurrently with the above transformed code:
> >
> >         struct foo *q;
> >
> >         q = malloc(sizeof(*q));
> >         assert(q);
> >         q->a = 1729;
> >         q->b = 1730;
> >         atomic_store_explicit(gp, q, memory_order_release);
> >         do_something();
> >         default_foo.a = 1;
> >         default_foo.b = 2;
> >         atomic_store_explicit(gp, &default_foo, memory_order_release);
> >
> > In this case, if the memory_order_consume() came just after the pointer
> > was reset to &default_foo, it is possible that the transformed code
> > would set "a" to 42 and "b" to 43, which might not be what the guy
> > writing the code wanted to happen.
> >
> > One of the purposes of _Dependent_ptr is to prevent this transformation.
> >
> > This transformation can also happen if the developer's code contained a
> > comparison to &default_foo -- an ARM or PowerPC compiler backend, upon
> > seeing two pointers containing the same bits, would likely consider the
> > two pointers as being interchangeable, and thus might do the dereferences
> > using the copy that was not tagged with the hardware dependencies.
> >
> > There are quite a few other examples.  The C++ standards committee
> > working papers shown below go through a number of them, in case the
> > above example is not convincing.  Or you could tell me what you would
> > like to see, and I would attempt to find/create a suitable example.
> >
> > Does that help, or am I missing your point?
> 
> Browsed through that resources, so yes.  So the important thing
> is that data dependences in the source are not to be removed or
> replaced by control dependences because that enables out-of-order
> execution on the CPU (or by the compiler).  This is only needed
> for data typed with the _Dependent_ptr qualifier.
> 
> I think fully guaranteeing this is hard (besides when you use
> volatile), we have the very same issue when dealing with
> pointer provenance rules, known for years and not fixed
> (and I don't see a good way to fix these issues without
> sacrifying performance everywhere).
> 
> Good luck ;)

Thank you, we will need it!  ;-)

> Maybe performance isn't so much of an issue for _Dependent_ptr
> (compared to when all pointers are affected).

Here is hoping!

							Thanx, Paul

> Richard.
> 
> >
> >                                                         Thanx, Paul
> >
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0098r0.pdf
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0462r1.pdf
> >
> > > Richard.
> > >
> > > >
> > > >                                                     Thanx, Paul
> > > >
> > > >> Akshat
> > > >>
> > > >> >
> > > >> > > regards
> > > >> > > Ramana
> > > >> > >
> > > >> > > >
> > > >> > > >                                                         Thanx,
> > > >Paul
> > > >> > > >
> > > >> > > > > Ramana
> > > >> > > > > >>
> > > >> > > > > >>
> > > >> > > > > >> > Does this sounds like a workable plan for ? Let me know
> > > >your
> > > >> > thoughts. If this sounds good then, we can do this for all the
> > > >> > optimizations that may kill the dependencies at somepoint.
> > > >> > > > > >> >
> > > >> > > > > >> > -Akshat
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >> >
> > >
> >

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-04 17:40                                                   ` Paul E. McKenney
@ 2019-07-04 18:09                                                     ` Jakub Jelinek
  2019-07-04 23:08                                                       ` Akshat Garg
  0 siblings, 1 reply; 32+ messages in thread
From: Jakub Jelinek @ 2019-07-04 18:09 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Richard Biener, Akshat Garg, Ramana Radhakrishnan, gcc mailing list

On Thu, Jul 04, 2019 at 10:40:15AM -0700, Paul E. McKenney wrote:
> > I think fully guaranteeing this is hard (besides when you use
> > volatile), we have the very same issue when dealing with
> > pointer provenance rules, known for years and not fixed
> > (and I don't see a good way to fix these issues without
> > sacrifying performance everywhere).
> > 
> > Good luck ;)
> 
> Thank you, we will need it!  ;-)

Well, luck probably isn't all you need, you'll need some way to represent
those dependencies in the IL (both GIMPLE and RTL) that would ensure that
they aren't optimized away, because just hoping that optimization don't mess
that up or just patching a few optimizations you discover first isn't going
to be very reliable.  Say don't rewrite those into SSA form and represent
them close to how volatile ptrs are handled at least for the beginning, and
if there are safe optimizations special case those, rather than allowing all
optimizations on those and hope it will work out.

	Jakub

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-04 18:09                                                     ` Jakub Jelinek
@ 2019-07-04 23:08                                                       ` Akshat Garg
  2019-07-05 11:20                                                         ` Richard Biener
       [not found]                                                         ` <20190705014806.GM26519@linux.ibm.com>
  0 siblings, 2 replies; 32+ messages in thread
From: Akshat Garg @ 2019-07-04 23:08 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Paul E. McKenney, Richard Biener, Ramana Radhakrishnan, gcc mailing list

On Thu, Jul 4, 2019 at 11:39 PM Jakub Jelinek <jakub@redhat.com> wrote:

> On Thu, Jul 04, 2019 at 10:40:15AM -0700, Paul E. McKenney wrote:
> > > I think fully guaranteeing this is hard (besides when you use
> > > volatile), we have the very same issue when dealing with
> > > pointer provenance rules, known for years and not fixed
> > > (and I don't see a good way to fix these issues without
> > > sacrifying performance everywhere).
> > >
> > > Good luck ;)
> >
> > Thank you, we will need it!  ;-)
>
> Well, luck probably isn't all you need, you'll need some way to represent
> those dependencies in the IL (both GIMPLE and RTL) that would ensure that
> they aren't optimized away, because just hoping that optimization don't
> mess
> that up or just patching a few optimizations you discover first isn't going
> to be very reliable.  Say don't rewrite those into SSA form and represent
> them close to how volatile ptrs are handled at least for the beginning, and
> if there are safe optimizations special case those, rather than allowing
> all
> optimizations on those and hope it will work out.
>
>         Jakub
>
I don't understand this statement completely "Say don't rewrite those into
SSA form and represent them close to how volatile ptrs are handled at least
for the beginning, and if there are safe optimizations special case those,
rather than allowing all optimizations on those and hope it will work
out.".
Are you saying that we should try to represent dependent ptrs as volatile
ptrs for some places?  We should figure out all the places where a check
for volatile ptrs happens and we check for dependent ptrs there also and
see if we can allow the optimization or not.
Am I getting you correctly, please tell?

-Akshat

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-04 23:08                                                       ` Akshat Garg
@ 2019-07-05 11:20                                                         ` Richard Biener
  2019-07-05 15:38                                                           ` Akshat Garg
       [not found]                                                         ` <20190705014806.GM26519@linux.ibm.com>
  1 sibling, 1 reply; 32+ messages in thread
From: Richard Biener @ 2019-07-05 11:20 UTC (permalink / raw)
  To: Akshat Garg
  Cc: Jakub Jelinek, Paul E. McKenney, Ramana Radhakrishnan, gcc mailing list

On Fri, Jul 5, 2019 at 1:08 AM Akshat Garg <xkspr7@gmail.com> wrote:
>
> On Thu, Jul 4, 2019 at 11:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>
>> On Thu, Jul 04, 2019 at 10:40:15AM -0700, Paul E. McKenney wrote:
>> > > I think fully guaranteeing this is hard (besides when you use
>> > > volatile), we have the very same issue when dealing with
>> > > pointer provenance rules, known for years and not fixed
>> > > (and I don't see a good way to fix these issues without
>> > > sacrifying performance everywhere).
>> > >
>> > > Good luck ;)
>> >
>> > Thank you, we will need it!  ;-)
>>
>> Well, luck probably isn't all you need, you'll need some way to represent
>> those dependencies in the IL (both GIMPLE and RTL) that would ensure that
>> they aren't optimized away, because just hoping that optimization don't mess
>> that up or just patching a few optimizations you discover first isn't going
>> to be very reliable.  Say don't rewrite those into SSA form and represent
>> them close to how volatile ptrs are handled at least for the beginning, and
>> if there are safe optimizations special case those, rather than allowing all
>> optimizations on those and hope it will work out.
>>
>>         Jakub
>
> I don't understand this statement completely "Say don't rewrite those into SSA form and represent them close to how volatile ptrs are handled at least for the beginning, and if there are safe optimizations special case those, rather than allowing all optimizations on those and hope it will work out.".
> Are you saying that we should try to represent dependent ptrs as volatile ptrs for some places?  We should figure out all the places where a check for volatile ptrs happens and we check for dependent ptrs there also and see if we can allow the optimization or not.
> Am I getting you correctly, please tell?

I think Jakub means not to use volatile as is but mimic some of its
properties (like we do not rewrite volatile
vars into SSA form which fends off most optimizations turing data
dependences into crontrol dependences,
not all speculation though).

I think "fixing" the GIMPLE optimizers to not mess up is possible even
with SSA (just a bit tedious and fragile),
but real issues will start at the RTL level where you completely lose
properties like "this is a [dependent] pointer".

Richard.

>
> -Akshat

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-05 11:20                                                         ` Richard Biener
@ 2019-07-05 15:38                                                           ` Akshat Garg
  0 siblings, 0 replies; 32+ messages in thread
From: Akshat Garg @ 2019-07-05 15:38 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Paul McKenney, Ramana Radhakrishnan, gcc mailing list

On Fri, 5 Jul, 2019, 4:50 PM Richard Biener, <richard.guenther@gmail.com>
wrote:

> On Fri, Jul 5, 2019 at 1:08 AM Akshat Garg <xkspr7@gmail.com> wrote:
> >
> > On Thu, Jul 4, 2019 at 11:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >>
> >> On Thu, Jul 04, 2019 at 10:40:15AM -0700, Paul E. McKenney wrote:
> >> > > I think fully guaranteeing this is hard (besides when you use
> >> > > volatile), we have the very same issue when dealing with
> >> > > pointer provenance rules, known for years and not fixed
> >> > > (and I don't see a good way to fix these issues without
> >> > > sacrifying performance everywhere).
> >> > >
> >> > > Good luck ;)
> >> >
> >> > Thank you, we will need it!  ;-)
> >>
> >> Well, luck probably isn't all you need, you'll need some way to
> represent
> >> those dependencies in the IL (both GIMPLE and RTL) that would ensure
> that
> >> they aren't optimized away, because just hoping that optimization don't
> mess
> >> that up or just patching a few optimizations you discover first isn't
> going
> >> to be very reliable.  Say don't rewrite those into SSA form and
> represent
> >> them close to how volatile ptrs are handled at least for the beginning,
> and
> >> if there are safe optimizations special case those, rather than
> allowing all
> >> optimizations on those and hope it will work out.
> >>
> >>         Jakub
> >
> > I don't understand this statement completely "Say don't rewrite those
> into SSA form and represent them close to how volatile ptrs are handled at
> least for the beginning, and if there are safe optimizations special case
> those, rather than allowing all optimizations on those and hope it will
> work out.".
> > Are you saying that we should try to represent dependent ptrs as
> volatile ptrs for some places?  We should figure out all the places where a
> check for volatile ptrs happens and we check for dependent ptrs there also
> and see if we can allow the optimization or not.
> > Am I getting you correctly, please tell?
>
> I think Jakub means not to use volatile as is but mimic some of its
> properties (like we do not rewrite volatile
> vars into SSA form which fends off most optimizations turing data
> dependences into crontrol dependences,
> not all speculation though).
>
> I think "fixing" the GIMPLE optimizers to not mess up is possible even
> with SSA (just a bit tedious and fragile),
> but real issues will start at the RTL level where you completely lose
> properties like "this is a [dependent] pointer".
>
> Richard.
>
Thank you, Richard and Jakub, both for giving your valuable suggestions and
directions.

Akshat

>
> >
> > -Akshat
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
       [not found]                                                                 ` <CAMEQ7YyuzvN_OzU4oMJGaFMBpnQwkCt4=ZTo-AmSmvsr2dBOMw@mail.gmail.com>
@ 2019-07-07 21:44                                                                   ` Akshat Garg
  0 siblings, 0 replies; 32+ messages in thread
From: Akshat Garg @ 2019-07-07 21:44 UTC (permalink / raw)
  To: gcc mailing list
  Cc: Ramana Radhakrishnan, Paul McKenney, Jakub Jelinek, Richard Biener

On Sun, Jul 7, 2019 at 9:48 PM Akshat Garg <xkspr7@gmail.com> wrote:

> On Sun, Jul 7, 2019 at 7:49 PM Paul E. McKenney <paulmck@linux.ibm.com>
> wrote:
>
>> On Sat, Jul 06, 2019 at 12:39:45PM +0530, Akshat Garg wrote:
>> > On Sat, Jul 6, 2019 at 1:09 AM Akshat Garg <xkspr7@gmail.com> wrote:
>> >
>> > > On Fri, Jul 5, 2019 at 7:18 AM Paul E. McKenney <
>> paulmck@linux.ibm.com>
>> > > wrote:
>> > >
>> > >> [CCing Ramana]
>> > >>
>> > >> My guess is that Jakub is saying that you should start by
>> > >> replacing all instances of TREE_VOLATILE() with (TREE_VOLATILE()
>> > >> || TREE_DEPENDENT_PTR()), give or take my getting the names wrong.
>> > >> Then selectively remove instances of TREE_DEPENDENT_PTR() where it
>> can
>> > >> be shown to be safe to do so.
>> > >>
>> > >
>> > >> The advantage of Jakub's approach over selectively inserting
>> > >> TREE_DEPENDENT_PTR() where it is needed is less chance of bugs due
>> > >> to missing TREE_DEPENDENT_PTR() instances.
>> > >>
>> > > I have checked that replacing TREE_VOLATILE() with (TREE_VOLATILE() ||
>> > > TREE_DEPENDENT_PTR()) is not sufficient to make the dependent ptr act
>> > > similar to volatile ptrs. I am looking over the source and will let
>> you
>> > > know if I am able to do it. If any of like, I can share the patch.
>> > >
>> > > Akshat
>> > >
>> > I have made some changes to the previous commit for _Dependent_ptr
>> > qualifier. We need to use this (
>> >
>> https://github.com/AKG001/gcc/blob/fb4187bc3872a50880159232cf336f0a03505fa8/gcc/tree-core.h#L943
>> )
>> > side-effect flag also with the dependent_ptr flag therefore, I have
>> pulled
>> > out the dependent_ptr from the union. Also, I have introduced a new
>> macro
>> > for dependent ptr TREE_THIS_DEPENDENT_PTR. Similar macro
>> > (TREE_THIS_VOLATILE) is used to check whether anything is volatile or
>> not.
>> > The initial plan was to change all occurrences of checks for volatile
>> with
>> > volatile or dependent_ptr, i.e., TREE_THIS_VOLATILE(NODE) with
>> > (TREE_THIS_DEPENDENT_PTR(NODE) || TREE_THIS_VOLATILE(NODE)). But, I
>> found
>> > 250 places to do this. Would that be okay, if I still change over all
>> the
>> > places?
>>
>> There is only one way to find out!  And in any case, this sounds like
>> one reason why your previous changes didn't fully mimic the volatile
>> keyword.  ;-)
>>
>> So I recommend making the changes.
>>
>> Probably not quite large enough a change to make it worthwhile learning
>> Coccinelle, but that is nevertheless an option.  (This tool allows you
>> to specify update patterns in C code, which it will automatically apply.)
>> http://coccinelle.lip6.fr/ if you want to try it out.
>>
>>                                                         Thanx, Paul
>
>
 I have made the changes shown here
https://github.com/AKG001/gcc/commit/e4ffd77f62ace986c124a70b90a662c083c570ba
The changes are not much tested on the edge cases. The test it was breaking
earlier shows here
https://github.com/AKG001/test/blob/master/dependency_breaking.c I am
checking for other test cases.
The optimized code with -O3 after applying the patch is here
https://github.com/AKG001/test/blob/master/dependency_breaking.c.231t.optimized
Please, I need your opinion and further directions on this.

Thanks,
Akshat

> > >> Jakub also seems to be pointing out that optimizations in the backend
> > >> also need to be handled, which I believe is what he is getting at
> with his
> > >> mention of RTL.  Of course, there is a very large number of backends,
> so
> > >> there needs to be a way of only doing some of them.  One way of doing
> that
> > >> is to handle only the multicore weakly ordered CPUs (ARM, ARMv8,
> PowerPC,
> > >> MIPS, maybe RISC-V), and on other CPUs continue the current behavior
> > >> of transforming the memory_order_consume load to memory_order_acquire.
> > >> For example, memory_order_acquire is almost free on x86, s390, and
> > >> the like.
> > >>
> > >> Does that help?
> > >>
> > >>                                                         Thanx, Paul
> > >>
> > >> On Fri, Jul 05, 2019 at 04:38:03AM +0530, Akshat Garg wrote:
> > >> > On Thu, Jul 4, 2019 at 11:39 PM Jakub Jelinek <jakub@redhat.com>
> wrote:
> > >> >
> > >> > > On Thu, Jul 04, 2019 at 10:40:15AM -0700, Paul E. McKenney wrote:
> > >> > > > > I think fully guaranteeing this is hard (besides when you use
> > >> > > > > volatile), we have the very same issue when dealing with
> > >> > > > > pointer provenance rules, known for years and not fixed
> > >> > > > > (and I don't see a good way to fix these issues without
> > >> > > > > sacrifying performance everywhere).
> > >> > > > >
> > >> > > > > Good luck ;)
> > >> > > >
> > >> > > > Thank you, we will need it!  ;-)
> > >> > >
> > >> > > Well, luck probably isn't all you need, you'll need some way to
> > >> represent
> > >> > > those dependencies in the IL (both GIMPLE and RTL) that would
> ensure
> > >> that
> > >> > > they aren't optimized away, because just hoping that optimization
> > >> don't
> > >> > > mess
> > >> > > that up or just patching a few optimizations you discover first
> isn't
> > >> going
> > >> > > to be very reliable.  Say don't rewrite those into SSA form and
> > >> represent
> > >> > > them close to how volatile ptrs are handled at least for the
> > >> beginning, and
> > >> > > if there are safe optimizations special case those, rather than
> > >> allowing
> > >> > > all
> > >> > > optimizations on those and hope it will work out.
> > >> > >
> > >> > >         Jakub
> > >> > >
> > >> > I don't understand this statement completely "Say don't rewrite
> those
> > >> into
> > >> > SSA form and represent them close to how volatile ptrs are handled
> at
> > >> least
> > >> > for the beginning, and if there are safe optimizations special case
> > >> those,
> > >> > rather than allowing all optimizations on those and hope it will
> work
> > >> > out.".
> > >> > Are you saying that we should try to represent dependent ptrs as
> > >> volatile
> > >> > ptrs for some places?  We should figure out all the places where a
> check
> > >> > for volatile ptrs happens and we check for dependent ptrs there
> also and
> > >> > see if we can allow the optimization or not.
> > >> > Am I getting you correctly, please tell?
> > >> >
> > >> > -Akshat
> > >>
> > >>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-02 15:36                               ` Jason Merrill
  2019-07-02 17:53                                 ` Richard Biener
  2019-07-02 19:37                                 ` Akshat Garg
@ 2019-07-17 10:54                                 ` Akshat Garg
  2019-07-22  0:27                                   ` Akshat Garg
  2 siblings, 1 reply; 32+ messages in thread
From: Akshat Garg @ 2019-07-17 10:54 UTC (permalink / raw)
  To: gcc mailing list
  Cc: Paul E. McKenney, Ramana Radhakrishnan, Jason Merrill,
	Richard Biener, Jakub Jelinek

On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <jason@redhat.com> wrote:

> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com>
> wrote:
> >
> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:
> > >
> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
> > > > ramana.gcc@googlemail.com> wrote:
> > > >
> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
> wrote:
> > > >> >
> > > >> > As we have some working front-end code for _Dependent_ptr, What
> should
> > > >> we do next? What I understand, we can start adding the library for
> > > >> dependent_ptr and its functions for C corresponding to the ones we
> created
> > > >> as C++ template library. Then, after that, we can move on to
> generating the
> > > >> assembly code part.
> > > >> >
> > > >>
> > > >>
> > > >> I think the next step is figuring out how to model the Dependent
> > > >> pointer information in the IR and figuring out what optimizations to
> > > >> allow or not with that information. At this point , I suspect we
> need
> > > >> a plan on record and have the conversation upstream on the lists.
> > > >>
> > > >> I think we need to put down a plan on record.
> > > >>
> > > >> Ramana
> > > >
> > > > [CCing gcc mailing list]
> > > >
> > > > So, shall I start looking over the pointer optimizations only and
> see what
> > > > information we may be needed on the same examples in the IR itself?
> > > >
> > > > - Akshat
> > > >
> > > I have coded an example where equality comparison kills dependency
> from the
> > > document P0190R4 as shown below :
> > >
> > > 1. struct rcutest rt = {1, 2, 3};
> > > 2. void thread0 ()
> > > 3. {
> > > 4.     rt.a = -42;
> > > 5.     rt.b = -43;
> > > 6.     rt.c = -44;
> > > 7.     rcu_assign_pointer(gp, &rt);
> > > 8. }
> > > 9.
> > > 10. void thread1 ()
> > > 11. {
> > > 12.    int i = -1;
> > > 13.    int j = -1;
> > > 14.    _Dependent_ptr struct rcutest *p;
> > > 15.
> > > 16.    p = rcu_dereference(gp);
> > > 17.    j = p->a;
> > > 18.   if (p == &rt)
> > > 19.        i = p->b;  /*Dependency breaking point*/
> > > 20.   else if(p)
> > > 21.       i = p->c;
> > > 22.   assert(i<0);
> > > 23.   assert(j<0);
> > > 24. }
> > > The gimple unoptimized code produced for lines 17-24 is shown below
> > >
> > > 1. if (p_16 == &rt)
> > > 2.     goto <bb 3>; [INV]
> > > 3.   else
> > > 4.    goto <bb 4>; [INV]
> > > 5.
> > > 6.  <bb 3> :
> > > 7.  i_19 = p_16->b;
> > > 8.  goto <bb 6>; [INV]
> > > 9.
> > > 10.  <bb 4> :
> > > 11.  if (p_16 != 0B)
> > > 12.    goto <bb 5>; [INV]
> > > 13.  else
> > > 14.    goto <bb 6>; [INV]
> > > 15.
> > > 16.  <bb 5> :
> > > 17.  i_18 = p_16->c;
> > > 18.
> > > 19.  <bb 6> :
> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
> > > 21.  _3 = i_7 < 0;
> > > 22.  _4 = (int) _3;
> > > 23.  assert (_4);
> > > 24.  _5 = j_17 < 0;
> > > 25.  _6 = (int) _5;
> > > 26.  assert (_6);
> > > 27.  return;
> > >
> > > The optimized code after -O1 is applied for the same lines is hown
> below :
> > >
> > > 1. if (_2 == &rt)
> > > 2.    goto <bb 3>; [30.00%]
> > > 3. else
> > > 4.    goto <bb 4>; [70.00%]
> > > 5.
> > > 6.  <bb 3> [local count: 322122547]:
> > > 7.   i_12 = rt.b;
> > > 8.   goto <bb 6>; [100.00%]
> > > 9.
> > > 10.  <bb 4> [local count: 751619277]:
> > > 11.   if (_1 != 0)
> > > 12.   goto <bb 5>; [50.00%]
> > > 13.   else
> > > 14.    goto <bb 6>; [50.00%]
> > > 15.
> > > 16.  <bb 5> [local count: 375809638]:
> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
> > > 18.
> > > 19.   <bb 6> [local count: 1073741824]:
> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
> > > 21.   _3 = i_7 < 0;
> > > 22.   _4 = (int) _3;
> > > 23.   assert (_4);
> > > 24.  _5 = j_10 < 0;
> > > 25.  _6 = (int) _5;
> > > 26.   assert (_6);
> > > 27.   return;
> >
> > Good show on tracing this through!
> >
> > > Statement 19 in the program gets converted from  i_19 = p_16->b; in
> line 7
> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
> > > breaks the dependency chain. We need to figure out the pass that does
> that
> > > and put some handling code in there for the _dependent_ptr qualified
> > > pointers. Passing simply -fipa-pure-const, -fguess-branch-probability
> or
> > > any other option alone does not produce the optimized code that breaks
> the
> > > dependency. But applying -O1, i.e., allowing all the optimizations
> does so.
> > > As passes are applied in a certain order, we need to figure out up to
> what
> > > passes, the code remains same and after what pass the dependency does
> not
> > > holds. So, we need to check the translated code after every pass.
> > >
> > > Does this sounds like a workable plan for ? Let me know your thoughts.
> If
> > > this sounds good then, we can do this for all the optimizations that
> may
> > > kill the dependencies at somepoint.
> >
> > I don't know of a better plan.
> >
> > My usual question...  Is there some way to script the checking of the
> > translated code at the end of each pass?
>
> The usual way to check the output of an optimization pass is by
> dumping the intermediate code at that point and matching the dump
> against a regexp, as in the tree-ssa directories in the testsuite.
> -fdump-tree-all will dump after all the gimple optimization passes,
> and you can look through them until you find the breakage.
>
> Jason
>
Thanks Jason for the advice, I followed it.

I coded an example shown in figure 26 from here (
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf).
The example:
https://github.com/AKG001/test/blob/master/dependency_breaking.c
The .objsz1 code:
https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1
The .ccp1 code:
https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1
The .sra code:
https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra
The .dom2 code:
https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2

Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed after
constant copy propagation (ccp) pass(
https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77). In
.objsz1 file at line 53, the dependencies start with
p_16 = _14;
.......
i_19 = p_16->b;
......
as user specified and after that all the references are made through struct
p or its variants. But in .ccp1 file at line 41, the "struct p" is replaced
with variable _2, now, variable _2 starts the dependencies as we can see.
_2 = (atomic struct rcutest *) _1;
...........
i_19 = MEM[(struct rcutest *)_2].b;
..........
I don't know whether it is okay to say at this point, the dependencies are
still maintained or not. Or we have to pass the dependent pointer
qualification to new variables that replaces them. Hoping someone could
clarify.

Also, In .sra file at line 43 in thread1(), the statement is:
    i_12 = MEM[(struct rcutest *)_2].b;
In .dom2 file at line 61, the above statement gets converted to:
    i_12 = rt.b;
So, I believe that sure does breaks the dependency specified by the user.
For this, we need to do some modifications in tree-ssa-dom.c file. Hope
this makes sense.

Thanks,
Akshat

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-17 10:54                                 ` Akshat Garg
@ 2019-07-22  0:27                                   ` Akshat Garg
  2019-07-22  8:41                                     ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Akshat Garg @ 2019-07-22  0:27 UTC (permalink / raw)
  To: gcc mailing list
  Cc: Paul E. McKenney, Ramana Radhakrishnan, Jason Merrill,
	Richard Biener, Jakub Jelinek

Hi all,
Consider part of an example(figure 20) from doc P0190R4(
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf) shown
below:

1.  void thread1 (void)
2.  {
3.    int * volatile p;
4.    p = rcu_dereference(gip);
5.    if (p)
6.    assert(*(p+p[0]) == 42);
7.  }
The .gimple code produced is :

1.  thread1 ()
2.  {
3.   atomic int * D.1992;
4.    int * volatile p;
5.  {
6.    atomic int * * __atomic_load_ptr;
7.   atomic int * __atomic_load_tmp;
8.    try
9.     {
10.        __atomic_load_ptr = &gip;
11.        _1 = __atomic_load_8 (__atomic_load_ptr, 1);
12.        _2 = (atomic int *) _1;
13.        __atomic_load_tmp = _2;
14.        D.1992 = __atomic_load_tmp;
15.     }
16.    finally
17.      {
18.        __atomic_load_tmp = {CLOBBER};
19.      }
20.  }
21.   p = D.1992;
22.   p.2_3 = p;
23.  if (p.2_3 != 0B) goto <D.1994>; else goto <D.1995>;
24.  <D.1994>:
25.   p.3_4 = p;
26.  p.4_5 = p;
27.   _6 = *p.4_5;
28.  _7 = (long unsigned int) _6;
29.  _8 = _7 * 4;
30.  _9 = p.3_4 + _8;
31.  _10 = *_9;
32.  _11 = _10 == 42;
33.  _12 = (int) _11;
34.  assert (_12);
35.  <D.1995>:
36. }

assert at line 34 in .gimple code still breaks the dependency given by the
user. I believe, there should be some ssa defined variable of p or p itself
in assert. This is happening when I am considering pointer as volatile
qualified. If I consider it as _Dependent_ptr qualified then it surely
produces the broken dependency code. Let me know, if I wrong somewhere.

-Akshat




On Wed, Jul 17, 2019 at 4:23 PM Akshat Garg <xkspr7@gmail.com> wrote:

> On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <jason@redhat.com> wrote:
>
>> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com>
>> wrote:
>> >
>> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com> wrote:
>> > >
>> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>> > > > ramana.gcc@googlemail.com> wrote:
>> > > >
>> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
>> wrote:
>> > > >> >
>> > > >> > As we have some working front-end code for _Dependent_ptr, What
>> should
>> > > >> we do next? What I understand, we can start adding the library for
>> > > >> dependent_ptr and its functions for C corresponding to the ones we
>> created
>> > > >> as C++ template library. Then, after that, we can move on to
>> generating the
>> > > >> assembly code part.
>> > > >> >
>> > > >>
>> > > >>
>> > > >> I think the next step is figuring out how to model the Dependent
>> > > >> pointer information in the IR and figuring out what optimizations
>> to
>> > > >> allow or not with that information. At this point , I suspect we
>> need
>> > > >> a plan on record and have the conversation upstream on the lists.
>> > > >>
>> > > >> I think we need to put down a plan on record.
>> > > >>
>> > > >> Ramana
>> > > >
>> > > > [CCing gcc mailing list]
>> > > >
>> > > > So, shall I start looking over the pointer optimizations only and
>> see what
>> > > > information we may be needed on the same examples in the IR itself?
>> > > >
>> > > > - Akshat
>> > > >
>> > > I have coded an example where equality comparison kills dependency
>> from the
>> > > document P0190R4 as shown below :
>> > >
>> > > 1. struct rcutest rt = {1, 2, 3};
>> > > 2. void thread0 ()
>> > > 3. {
>> > > 4.     rt.a = -42;
>> > > 5.     rt.b = -43;
>> > > 6.     rt.c = -44;
>> > > 7.     rcu_assign_pointer(gp, &rt);
>> > > 8. }
>> > > 9.
>> > > 10. void thread1 ()
>> > > 11. {
>> > > 12.    int i = -1;
>> > > 13.    int j = -1;
>> > > 14.    _Dependent_ptr struct rcutest *p;
>> > > 15.
>> > > 16.    p = rcu_dereference(gp);
>> > > 17.    j = p->a;
>> > > 18.   if (p == &rt)
>> > > 19.        i = p->b;  /*Dependency breaking point*/
>> > > 20.   else if(p)
>> > > 21.       i = p->c;
>> > > 22.   assert(i<0);
>> > > 23.   assert(j<0);
>> > > 24. }
>> > > The gimple unoptimized code produced for lines 17-24 is shown below
>> > >
>> > > 1. if (p_16 == &rt)
>> > > 2.     goto <bb 3>; [INV]
>> > > 3.   else
>> > > 4.    goto <bb 4>; [INV]
>> > > 5.
>> > > 6.  <bb 3> :
>> > > 7.  i_19 = p_16->b;
>> > > 8.  goto <bb 6>; [INV]
>> > > 9.
>> > > 10.  <bb 4> :
>> > > 11.  if (p_16 != 0B)
>> > > 12.    goto <bb 5>; [INV]
>> > > 13.  else
>> > > 14.    goto <bb 6>; [INV]
>> > > 15.
>> > > 16.  <bb 5> :
>> > > 17.  i_18 = p_16->c;
>> > > 18.
>> > > 19.  <bb 6> :
>> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>> > > 21.  _3 = i_7 < 0;
>> > > 22.  _4 = (int) _3;
>> > > 23.  assert (_4);
>> > > 24.  _5 = j_17 < 0;
>> > > 25.  _6 = (int) _5;
>> > > 26.  assert (_6);
>> > > 27.  return;
>> > >
>> > > The optimized code after -O1 is applied for the same lines is hown
>> below :
>> > >
>> > > 1. if (_2 == &rt)
>> > > 2.    goto <bb 3>; [30.00%]
>> > > 3. else
>> > > 4.    goto <bb 4>; [70.00%]
>> > > 5.
>> > > 6.  <bb 3> [local count: 322122547]:
>> > > 7.   i_12 = rt.b;
>> > > 8.   goto <bb 6>; [100.00%]
>> > > 9.
>> > > 10.  <bb 4> [local count: 751619277]:
>> > > 11.   if (_1 != 0)
>> > > 12.   goto <bb 5>; [50.00%]
>> > > 13.   else
>> > > 14.    goto <bb 6>; [50.00%]
>> > > 15.
>> > > 16.  <bb 5> [local count: 375809638]:
>> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>> > > 18.
>> > > 19.   <bb 6> [local count: 1073741824]:
>> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>> > > 21.   _3 = i_7 < 0;
>> > > 22.   _4 = (int) _3;
>> > > 23.   assert (_4);
>> > > 24.  _5 = j_10 < 0;
>> > > 25.  _6 = (int) _5;
>> > > 26.   assert (_6);
>> > > 27.   return;
>> >
>> > Good show on tracing this through!
>> >
>> > > Statement 19 in the program gets converted from  i_19 = p_16->b; in
>> line 7
>> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
>> > > breaks the dependency chain. We need to figure out the pass that does
>> that
>> > > and put some handling code in there for the _dependent_ptr qualified
>> > > pointers. Passing simply -fipa-pure-const, -fguess-branch-probability
>> or
>> > > any other option alone does not produce the optimized code that
>> breaks the
>> > > dependency. But applying -O1, i.e., allowing all the optimizations
>> does so.
>> > > As passes are applied in a certain order, we need to figure out up to
>> what
>> > > passes, the code remains same and after what pass the dependency does
>> not
>> > > holds. So, we need to check the translated code after every pass.
>> > >
>> > > Does this sounds like a workable plan for ? Let me know your
>> thoughts. If
>> > > this sounds good then, we can do this for all the optimizations that
>> may
>> > > kill the dependencies at somepoint.
>> >
>> > I don't know of a better plan.
>> >
>> > My usual question...  Is there some way to script the checking of the
>> > translated code at the end of each pass?
>>
>> The usual way to check the output of an optimization pass is by
>> dumping the intermediate code at that point and matching the dump
>> against a regexp, as in the tree-ssa directories in the testsuite.
>> -fdump-tree-all will dump after all the gimple optimization passes,
>> and you can look through them until you find the breakage.
>>
>> Jason
>>
> Thanks Jason for the advice, I followed it.
>
> I coded an example shown in figure 26 from here (
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf).
> The example:
> https://github.com/AKG001/test/blob/master/dependency_breaking.c
> The .objsz1 code:
> https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1
> The .ccp1 code:
> https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1
> The .sra code:
> https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra
> The .dom2 code:
> https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2
>
> Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed after
> constant copy propagation (ccp) pass(
> https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77). In
> .objsz1 file at line 53, the dependencies start with
> p_16 = _14;
> .......
> i_19 = p_16->b;
> ......
> as user specified and after that all the references are made through
> struct p or its variants. But in .ccp1 file at line 41, the "struct p" is
> replaced with variable _2, now, variable _2 starts the dependencies as we
> can see.
> _2 = (atomic struct rcutest *) _1;
> ...........
> i_19 = MEM[(struct rcutest *)_2].b;
> ..........
> I don't know whether it is okay to say at this point, the dependencies are
> still maintained or not. Or we have to pass the dependent pointer
> qualification to new variables that replaces them. Hoping someone could
> clarify.
>
> Also, In .sra file at line 43 in thread1(), the statement is:
>     i_12 = MEM[(struct rcutest *)_2].b;
> In .dom2 file at line 61, the above statement gets converted to:
>     i_12 = rt.b;
> So, I believe that sure does breaks the dependency specified by the user.
> For this, we need to do some modifications in tree-ssa-dom.c file. Hope
> this makes sense.
>
> Thanks,
> Akshat
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-22  0:27                                   ` Akshat Garg
@ 2019-07-22  8:41                                     ` Richard Biener
  2019-07-22  8:54                                       ` Akshat Garg
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2019-07-22  8:41 UTC (permalink / raw)
  To: Akshat Garg
  Cc: gcc mailing list, Paul E. McKenney, Ramana Radhakrishnan,
	Jason Merrill, Jakub Jelinek

On Mon, Jul 22, 2019 at 2:27 AM Akshat Garg <xkspr7@gmail.com> wrote:

> Hi all,
> Consider part of an example(figure 20) from doc P0190R4(
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf)
> shown below:
>
> 1.  void thread1 (void)
> 2.  {
> 3.    int * volatile p;
> 4.    p = rcu_dereference(gip);
> 5.    if (p)
> 6.    assert(*(p+p[0]) == 42);
> 7.  }
> The .gimple code produced is :
>
> 1.  thread1 ()
> 2.  {
> 3.   atomic int * D.1992;
> 4.    int * volatile p;
> 5.  {
> 6.    atomic int * * __atomic_load_ptr;
> 7.   atomic int * __atomic_load_tmp;
> 8.    try
> 9.     {
> 10.        __atomic_load_ptr = &gip;
> 11.        _1 = __atomic_load_8 (__atomic_load_ptr, 1);
> 12.        _2 = (atomic int *) _1;
> 13.        __atomic_load_tmp = _2;
> 14.        D.1992 = __atomic_load_tmp;
> 15.     }
> 16.    finally
> 17.      {
> 18.        __atomic_load_tmp = {CLOBBER};
> 19.      }
> 20.  }
> 21.   p = D.1992;
> 22.   p.2_3 = p;
> 23.  if (p.2_3 != 0B) goto <D.1994>; else goto <D.1995>;
> 24.  <D.1994>:
> 25.   p.3_4 = p;
> 26.  p.4_5 = p;
> 27.   _6 = *p.4_5;
> 28.  _7 = (long unsigned int) _6;
> 29.  _8 = _7 * 4;
> 30.  _9 = p.3_4 + _8;
> 31.  _10 = *_9;
> 32.  _11 = _10 == 42;
> 33.  _12 = (int) _11;
> 34.  assert (_12);
> 35.  <D.1995>:
> 36. }
>
> assert at line 34 in .gimple code still breaks the dependency given by the
> user. I believe, there should be some ssa defined variable of p or p itself
> in assert. This is happening when I am considering pointer as volatile
> qualified. If I consider it as _Dependent_ptr qualified then it surely
> produces the broken dependency code. Let me know, if I wrong somewhere.
>
>
p appears as memory here which we load its value to p.3_4 which we then
offset by _8 and load from the
computed address into _10 which then appears in the assert condition.  I
think that's as good as it can
get ...

Richard.


> -Akshat
>
>
>
>
> On Wed, Jul 17, 2019 at 4:23 PM Akshat Garg <xkspr7@gmail.com> wrote:
>
>> On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <jason@redhat.com> wrote:
>>
>>> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com>
>>> wrote:
>>> >
>>> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>>> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com>
>>> wrote:
>>> > >
>>> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>>> > > > ramana.gcc@googlemail.com> wrote:
>>> > > >
>>> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
>>> wrote:
>>> > > >> >
>>> > > >> > As we have some working front-end code for _Dependent_ptr, What
>>> should
>>> > > >> we do next? What I understand, we can start adding the library for
>>> > > >> dependent_ptr and its functions for C corresponding to the ones
>>> we created
>>> > > >> as C++ template library. Then, after that, we can move on to
>>> generating the
>>> > > >> assembly code part.
>>> > > >> >
>>> > > >>
>>> > > >>
>>> > > >> I think the next step is figuring out how to model the Dependent
>>> > > >> pointer information in the IR and figuring out what optimizations
>>> to
>>> > > >> allow or not with that information. At this point , I suspect we
>>> need
>>> > > >> a plan on record and have the conversation upstream on the lists.
>>> > > >>
>>> > > >> I think we need to put down a plan on record.
>>> > > >>
>>> > > >> Ramana
>>> > > >
>>> > > > [CCing gcc mailing list]
>>> > > >
>>> > > > So, shall I start looking over the pointer optimizations only and
>>> see what
>>> > > > information we may be needed on the same examples in the IR itself?
>>> > > >
>>> > > > - Akshat
>>> > > >
>>> > > I have coded an example where equality comparison kills dependency
>>> from the
>>> > > document P0190R4 as shown below :
>>> > >
>>> > > 1. struct rcutest rt = {1, 2, 3};
>>> > > 2. void thread0 ()
>>> > > 3. {
>>> > > 4.     rt.a = -42;
>>> > > 5.     rt.b = -43;
>>> > > 6.     rt.c = -44;
>>> > > 7.     rcu_assign_pointer(gp, &rt);
>>> > > 8. }
>>> > > 9.
>>> > > 10. void thread1 ()
>>> > > 11. {
>>> > > 12.    int i = -1;
>>> > > 13.    int j = -1;
>>> > > 14.    _Dependent_ptr struct rcutest *p;
>>> > > 15.
>>> > > 16.    p = rcu_dereference(gp);
>>> > > 17.    j = p->a;
>>> > > 18.   if (p == &rt)
>>> > > 19.        i = p->b;  /*Dependency breaking point*/
>>> > > 20.   else if(p)
>>> > > 21.       i = p->c;
>>> > > 22.   assert(i<0);
>>> > > 23.   assert(j<0);
>>> > > 24. }
>>> > > The gimple unoptimized code produced for lines 17-24 is shown below
>>> > >
>>> > > 1. if (p_16 == &rt)
>>> > > 2.     goto <bb 3>; [INV]
>>> > > 3.   else
>>> > > 4.    goto <bb 4>; [INV]
>>> > > 5.
>>> > > 6.  <bb 3> :
>>> > > 7.  i_19 = p_16->b;
>>> > > 8.  goto <bb 6>; [INV]
>>> > > 9.
>>> > > 10.  <bb 4> :
>>> > > 11.  if (p_16 != 0B)
>>> > > 12.    goto <bb 5>; [INV]
>>> > > 13.  else
>>> > > 14.    goto <bb 6>; [INV]
>>> > > 15.
>>> > > 16.  <bb 5> :
>>> > > 17.  i_18 = p_16->c;
>>> > > 18.
>>> > > 19.  <bb 6> :
>>> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>>> > > 21.  _3 = i_7 < 0;
>>> > > 22.  _4 = (int) _3;
>>> > > 23.  assert (_4);
>>> > > 24.  _5 = j_17 < 0;
>>> > > 25.  _6 = (int) _5;
>>> > > 26.  assert (_6);
>>> > > 27.  return;
>>> > >
>>> > > The optimized code after -O1 is applied for the same lines is hown
>>> below :
>>> > >
>>> > > 1. if (_2 == &rt)
>>> > > 2.    goto <bb 3>; [30.00%]
>>> > > 3. else
>>> > > 4.    goto <bb 4>; [70.00%]
>>> > > 5.
>>> > > 6.  <bb 3> [local count: 322122547]:
>>> > > 7.   i_12 = rt.b;
>>> > > 8.   goto <bb 6>; [100.00%]
>>> > > 9.
>>> > > 10.  <bb 4> [local count: 751619277]:
>>> > > 11.   if (_1 != 0)
>>> > > 12.   goto <bb 5>; [50.00%]
>>> > > 13.   else
>>> > > 14.    goto <bb 6>; [50.00%]
>>> > > 15.
>>> > > 16.  <bb 5> [local count: 375809638]:
>>> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>>> > > 18.
>>> > > 19.   <bb 6> [local count: 1073741824]:
>>> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>>> > > 21.   _3 = i_7 < 0;
>>> > > 22.   _4 = (int) _3;
>>> > > 23.   assert (_4);
>>> > > 24.  _5 = j_10 < 0;
>>> > > 25.  _6 = (int) _5;
>>> > > 26.   assert (_6);
>>> > > 27.   return;
>>> >
>>> > Good show on tracing this through!
>>> >
>>> > > Statement 19 in the program gets converted from  i_19 = p_16->b; in
>>> line 7
>>> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code which
>>> > > breaks the dependency chain. We need to figure out the pass that
>>> does that
>>> > > and put some handling code in there for the _dependent_ptr qualified
>>> > > pointers. Passing simply -fipa-pure-const,
>>> -fguess-branch-probability or
>>> > > any other option alone does not produce the optimized code that
>>> breaks the
>>> > > dependency. But applying -O1, i.e., allowing all the optimizations
>>> does so.
>>> > > As passes are applied in a certain order, we need to figure out up
>>> to what
>>> > > passes, the code remains same and after what pass the dependency
>>> does not
>>> > > holds. So, we need to check the translated code after every pass.
>>> > >
>>> > > Does this sounds like a workable plan for ? Let me know your
>>> thoughts. If
>>> > > this sounds good then, we can do this for all the optimizations that
>>> may
>>> > > kill the dependencies at somepoint.
>>> >
>>> > I don't know of a better plan.
>>> >
>>> > My usual question...  Is there some way to script the checking of the
>>> > translated code at the end of each pass?
>>>
>>> The usual way to check the output of an optimization pass is by
>>> dumping the intermediate code at that point and matching the dump
>>> against a regexp, as in the tree-ssa directories in the testsuite.
>>> -fdump-tree-all will dump after all the gimple optimization passes,
>>> and you can look through them until you find the breakage.
>>>
>>> Jason
>>>
>> Thanks Jason for the advice, I followed it.
>>
>> I coded an example shown in figure 26 from here (
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf).
>> The example:
>> https://github.com/AKG001/test/blob/master/dependency_breaking.c
>> The .objsz1 code:
>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1
>> The .ccp1 code:
>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1
>> The .sra code:
>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra
>> The .dom2 code:
>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2
>>
>> Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed after
>> constant copy propagation (ccp) pass(
>> https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77). In
>> .objsz1 file at line 53, the dependencies start with
>> p_16 = _14;
>> .......
>> i_19 = p_16->b;
>> ......
>> as user specified and after that all the references are made through
>> struct p or its variants. But in .ccp1 file at line 41, the "struct p" is
>> replaced with variable _2, now, variable _2 starts the dependencies as we
>> can see.
>> _2 = (atomic struct rcutest *) _1;
>> ...........
>> i_19 = MEM[(struct rcutest *)_2].b;
>> ..........
>> I don't know whether it is okay to say at this point, the dependencies
>> are still maintained or not. Or we have to pass the dependent pointer
>> qualification to new variables that replaces them. Hoping someone could
>> clarify.
>>
>> Also, In .sra file at line 43 in thread1(), the statement is:
>>     i_12 = MEM[(struct rcutest *)_2].b;
>> In .dom2 file at line 61, the above statement gets converted to:
>>     i_12 = rt.b;
>> So, I believe that sure does breaks the dependency specified by the user.
>> For this, we need to do some modifications in tree-ssa-dom.c file. Hope
>> this makes sense.
>>
>> Thanks,
>> Akshat
>>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-22  8:41                                     ` Richard Biener
@ 2019-07-22  8:54                                       ` Akshat Garg
  2019-07-22  9:46                                         ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Akshat Garg @ 2019-07-22  8:54 UTC (permalink / raw)
  To: Richard Biener
  Cc: gcc mailing list, Paul E. McKenney, Ramana Radhakrishnan,
	Jason Merrill, Jakub Jelinek

On Mon, Jul 22, 2019 at 2:11 PM Richard Biener <richard.guenther@gmail.com>
wrote:

> On Mon, Jul 22, 2019 at 2:27 AM Akshat Garg <xkspr7@gmail.com> wrote:
>
>> Hi all,
>> Consider part of an example(figure 20) from doc P0190R4(
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf)
>> shown below:
>>
>> 1.  void thread1 (void)
>> 2.  {
>> 3.    int * volatile p;
>> 4.    p = rcu_dereference(gip);
>> 5.    if (p)
>> 6.    assert(*(p+p[0]) == 42);
>> 7.  }
>> The .gimple code produced is :
>>
>> 1.  thread1 ()
>> 2.  {
>> 3.   atomic int * D.1992;
>> 4.    int * volatile p;
>> 5.  {
>> 6.    atomic int * * __atomic_load_ptr;
>> 7.   atomic int * __atomic_load_tmp;
>> 8.    try
>> 9.     {
>> 10.        __atomic_load_ptr = &gip;
>> 11.        _1 = __atomic_load_8 (__atomic_load_ptr, 1);
>> 12.        _2 = (atomic int *) _1;
>> 13.        __atomic_load_tmp = _2;
>> 14.        D.1992 = __atomic_load_tmp;
>> 15.     }
>> 16.    finally
>> 17.      {
>> 18.        __atomic_load_tmp = {CLOBBER};
>> 19.      }
>> 20.  }
>> 21.   p = D.1992;
>> 22.   p.2_3 = p;
>> 23.  if (p.2_3 != 0B) goto <D.1994>; else goto <D.1995>;
>> 24.  <D.1994>:
>> 25.   p.3_4 = p;
>> 26.  p.4_5 = p;
>> 27.   _6 = *p.4_5;
>> 28.  _7 = (long unsigned int) _6;
>> 29.  _8 = _7 * 4;
>> 30.  _9 = p.3_4 + _8;
>> 31.  _10 = *_9;
>> 32.  _11 = _10 == 42;
>> 33.  _12 = (int) _11;
>> 34.  assert (_12);
>> 35.  <D.1995>:
>> 36. }
>>
>> assert at line 34 in .gimple code still breaks the dependency given by
>> the user. I believe, there should be some ssa defined variable of p or p
>> itself in assert. This is happening when I am considering pointer as
>> volatile qualified. If I consider it as _Dependent_ptr qualified then it
>> surely produces the broken dependency code. Let me know, if I wrong
>> somewhere.
>>
>>
> p appears as memory here which we load its value to p.3_4 which we then
> offset by _8 and load from the
> computed address into _10 which then appears in the assert condition.  I
> think that's as good as it can
> get ...
>
> Richard.
>

Thank you for your reply. For, the same example above, consider this (
https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L402)
instruction at rtl level changed form this (
https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L231)
during the cse pass. The variable p.2_3 gets replaced by a temporary _1 but
_1 is not any dependent pointer where, p.2_3 was. Is this also not breaking
any dependencies?

-Akshat
>>
>>
>>
>>
>> On Wed, Jul 17, 2019 at 4:23 PM Akshat Garg <xkspr7@gmail.com> wrote:
>>
>>> On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <jason@redhat.com> wrote:
>>>
>>>> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com>
>>>> wrote:
>>>> >
>>>> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>>>> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com>
>>>> wrote:
>>>> > >
>>>> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>>>> > > > ramana.gcc@googlemail.com> wrote:
>>>> > > >
>>>> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
>>>> wrote:
>>>> > > >> >
>>>> > > >> > As we have some working front-end code for _Dependent_ptr,
>>>> What should
>>>> > > >> we do next? What I understand, we can start adding the library
>>>> for
>>>> > > >> dependent_ptr and its functions for C corresponding to the ones
>>>> we created
>>>> > > >> as C++ template library. Then, after that, we can move on to
>>>> generating the
>>>> > > >> assembly code part.
>>>> > > >> >
>>>> > > >>
>>>> > > >>
>>>> > > >> I think the next step is figuring out how to model the Dependent
>>>> > > >> pointer information in the IR and figuring out what
>>>> optimizations to
>>>> > > >> allow or not with that information. At this point , I suspect we
>>>> need
>>>> > > >> a plan on record and have the conversation upstream on the lists.
>>>> > > >>
>>>> > > >> I think we need to put down a plan on record.
>>>> > > >>
>>>> > > >> Ramana
>>>> > > >
>>>> > > > [CCing gcc mailing list]
>>>> > > >
>>>> > > > So, shall I start looking over the pointer optimizations only and
>>>> see what
>>>> > > > information we may be needed on the same examples in the IR
>>>> itself?
>>>> > > >
>>>> > > > - Akshat
>>>> > > >
>>>> > > I have coded an example where equality comparison kills dependency
>>>> from the
>>>> > > document P0190R4 as shown below :
>>>> > >
>>>> > > 1. struct rcutest rt = {1, 2, 3};
>>>> > > 2. void thread0 ()
>>>> > > 3. {
>>>> > > 4.     rt.a = -42;
>>>> > > 5.     rt.b = -43;
>>>> > > 6.     rt.c = -44;
>>>> > > 7.     rcu_assign_pointer(gp, &rt);
>>>> > > 8. }
>>>> > > 9.
>>>> > > 10. void thread1 ()
>>>> > > 11. {
>>>> > > 12.    int i = -1;
>>>> > > 13.    int j = -1;
>>>> > > 14.    _Dependent_ptr struct rcutest *p;
>>>> > > 15.
>>>> > > 16.    p = rcu_dereference(gp);
>>>> > > 17.    j = p->a;
>>>> > > 18.   if (p == &rt)
>>>> > > 19.        i = p->b;  /*Dependency breaking point*/
>>>> > > 20.   else if(p)
>>>> > > 21.       i = p->c;
>>>> > > 22.   assert(i<0);
>>>> > > 23.   assert(j<0);
>>>> > > 24. }
>>>> > > The gimple unoptimized code produced for lines 17-24 is shown below
>>>> > >
>>>> > > 1. if (p_16 == &rt)
>>>> > > 2.     goto <bb 3>; [INV]
>>>> > > 3.   else
>>>> > > 4.    goto <bb 4>; [INV]
>>>> > > 5.
>>>> > > 6.  <bb 3> :
>>>> > > 7.  i_19 = p_16->b;
>>>> > > 8.  goto <bb 6>; [INV]
>>>> > > 9.
>>>> > > 10.  <bb 4> :
>>>> > > 11.  if (p_16 != 0B)
>>>> > > 12.    goto <bb 5>; [INV]
>>>> > > 13.  else
>>>> > > 14.    goto <bb 6>; [INV]
>>>> > > 15.
>>>> > > 16.  <bb 5> :
>>>> > > 17.  i_18 = p_16->c;
>>>> > > 18.
>>>> > > 19.  <bb 6> :
>>>> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>>>> > > 21.  _3 = i_7 < 0;
>>>> > > 22.  _4 = (int) _3;
>>>> > > 23.  assert (_4);
>>>> > > 24.  _5 = j_17 < 0;
>>>> > > 25.  _6 = (int) _5;
>>>> > > 26.  assert (_6);
>>>> > > 27.  return;
>>>> > >
>>>> > > The optimized code after -O1 is applied for the same lines is hown
>>>> below :
>>>> > >
>>>> > > 1. if (_2 == &rt)
>>>> > > 2.    goto <bb 3>; [30.00%]
>>>> > > 3. else
>>>> > > 4.    goto <bb 4>; [70.00%]
>>>> > > 5.
>>>> > > 6.  <bb 3> [local count: 322122547]:
>>>> > > 7.   i_12 = rt.b;
>>>> > > 8.   goto <bb 6>; [100.00%]
>>>> > > 9.
>>>> > > 10.  <bb 4> [local count: 751619277]:
>>>> > > 11.   if (_1 != 0)
>>>> > > 12.   goto <bb 5>; [50.00%]
>>>> > > 13.   else
>>>> > > 14.    goto <bb 6>; [50.00%]
>>>> > > 15.
>>>> > > 16.  <bb 5> [local count: 375809638]:
>>>> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>>>> > > 18.
>>>> > > 19.   <bb 6> [local count: 1073741824]:
>>>> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>>>> > > 21.   _3 = i_7 < 0;
>>>> > > 22.   _4 = (int) _3;
>>>> > > 23.   assert (_4);
>>>> > > 24.  _5 = j_10 < 0;
>>>> > > 25.  _6 = (int) _5;
>>>> > > 26.   assert (_6);
>>>> > > 27.   return;
>>>> >
>>>> > Good show on tracing this through!
>>>> >
>>>> > > Statement 19 in the program gets converted from  i_19 = p_16->b; in
>>>> line 7
>>>> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
>>>> which
>>>> > > breaks the dependency chain. We need to figure out the pass that
>>>> does that
>>>> > > and put some handling code in there for the _dependent_ptr qualified
>>>> > > pointers. Passing simply -fipa-pure-const,
>>>> -fguess-branch-probability or
>>>> > > any other option alone does not produce the optimized code that
>>>> breaks the
>>>> > > dependency. But applying -O1, i.e., allowing all the optimizations
>>>> does so.
>>>> > > As passes are applied in a certain order, we need to figure out up
>>>> to what
>>>> > > passes, the code remains same and after what pass the dependency
>>>> does not
>>>> > > holds. So, we need to check the translated code after every pass.
>>>> > >
>>>> > > Does this sounds like a workable plan for ? Let me know your
>>>> thoughts. If
>>>> > > this sounds good then, we can do this for all the optimizations
>>>> that may
>>>> > > kill the dependencies at somepoint.
>>>> >
>>>> > I don't know of a better plan.
>>>> >
>>>> > My usual question...  Is there some way to script the checking of the
>>>> > translated code at the end of each pass?
>>>>
>>>> The usual way to check the output of an optimization pass is by
>>>> dumping the intermediate code at that point and matching the dump
>>>> against a regexp, as in the tree-ssa directories in the testsuite.
>>>> -fdump-tree-all will dump after all the gimple optimization passes,
>>>> and you can look through them until you find the breakage.
>>>>
>>>> Jason
>>>>
>>> Thanks Jason for the advice, I followed it.
>>>
>>> I coded an example shown in figure 26 from here (
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf).
>>> The example:
>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c
>>> The .objsz1 code:
>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1
>>> The .ccp1 code:
>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1
>>> The .sra code:
>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra
>>> The .dom2 code:
>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2
>>>
>>> Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed after
>>> constant copy propagation (ccp) pass(
>>> https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77). In
>>> .objsz1 file at line 53, the dependencies start with
>>> p_16 = _14;
>>> .......
>>> i_19 = p_16->b;
>>> ......
>>> as user specified and after that all the references are made through
>>> struct p or its variants. But in .ccp1 file at line 41, the "struct p" is
>>> replaced with variable _2, now, variable _2 starts the dependencies as we
>>> can see.
>>> _2 = (atomic struct rcutest *) _1;
>>> ...........
>>> i_19 = MEM[(struct rcutest *)_2].b;
>>> ..........
>>> I don't know whether it is okay to say at this point, the dependencies
>>> are still maintained or not. Or we have to pass the dependent pointer
>>> qualification to new variables that replaces them. Hoping someone could
>>> clarify.
>>>
>>> Also, In .sra file at line 43 in thread1(), the statement is:
>>>     i_12 = MEM[(struct rcutest *)_2].b;
>>> In .dom2 file at line 61, the above statement gets converted to:
>>>     i_12 = rt.b;
>>> So, I believe that sure does breaks the dependency specified by the
>>> user. For this, we need to do some modifications in tree-ssa-dom.c file.
>>> Hope this makes sense.
>>>
>>> Thanks,
>>> Akshat
>>>
>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-22  8:54                                       ` Akshat Garg
@ 2019-07-22  9:46                                         ` Richard Biener
  2019-07-23 20:11                                           ` Akshat Garg
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2019-07-22  9:46 UTC (permalink / raw)
  To: Akshat Garg
  Cc: gcc mailing list, Paul E. McKenney, Ramana Radhakrishnan,
	Jason Merrill, Jakub Jelinek

On Mon, Jul 22, 2019 at 10:54 AM Akshat Garg <xkspr7@gmail.com> wrote:

> On Mon, Jul 22, 2019 at 2:11 PM Richard Biener <richard.guenther@gmail.com>
> wrote:
>
>> On Mon, Jul 22, 2019 at 2:27 AM Akshat Garg <xkspr7@gmail.com> wrote:
>>
>>> Hi all,
>>> Consider part of an example(figure 20) from doc P0190R4(
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf)
>>> shown below:
>>>
>>> 1.  void thread1 (void)
>>> 2.  {
>>> 3.    int * volatile p;
>>> 4.    p = rcu_dereference(gip);
>>> 5.    if (p)
>>> 6.    assert(*(p+p[0]) == 42);
>>> 7.  }
>>> The .gimple code produced is :
>>>
>>> 1.  thread1 ()
>>> 2.  {
>>> 3.   atomic int * D.1992;
>>> 4.    int * volatile p;
>>> 5.  {
>>> 6.    atomic int * * __atomic_load_ptr;
>>> 7.   atomic int * __atomic_load_tmp;
>>> 8.    try
>>> 9.     {
>>> 10.        __atomic_load_ptr = &gip;
>>> 11.        _1 = __atomic_load_8 (__atomic_load_ptr, 1);
>>> 12.        _2 = (atomic int *) _1;
>>> 13.        __atomic_load_tmp = _2;
>>> 14.        D.1992 = __atomic_load_tmp;
>>> 15.     }
>>> 16.    finally
>>> 17.      {
>>> 18.        __atomic_load_tmp = {CLOBBER};
>>> 19.      }
>>> 20.  }
>>> 21.   p = D.1992;
>>> 22.   p.2_3 = p;
>>> 23.  if (p.2_3 != 0B) goto <D.1994>; else goto <D.1995>;
>>> 24.  <D.1994>:
>>> 25.   p.3_4 = p;
>>> 26.  p.4_5 = p;
>>> 27.   _6 = *p.4_5;
>>> 28.  _7 = (long unsigned int) _6;
>>> 29.  _8 = _7 * 4;
>>> 30.  _9 = p.3_4 + _8;
>>> 31.  _10 = *_9;
>>> 32.  _11 = _10 == 42;
>>> 33.  _12 = (int) _11;
>>> 34.  assert (_12);
>>> 35.  <D.1995>:
>>> 36. }
>>>
>>> assert at line 34 in .gimple code still breaks the dependency given by
>>> the user. I believe, there should be some ssa defined variable of p or p
>>> itself in assert. This is happening when I am considering pointer as
>>> volatile qualified. If I consider it as _Dependent_ptr qualified then it
>>> surely produces the broken dependency code. Let me know, if I wrong
>>> somewhere.
>>>
>>>
>> p appears as memory here which we load its value to p.3_4 which we then
>> offset by _8 and load from the
>> computed address into _10 which then appears in the assert condition.  I
>> think that's as good as it can
>> get ...
>>
>> Richard.
>>
>
> Thank you for your reply. For, the same example above, consider this (
> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L402)
> instruction at rtl level changed form this (
> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L231)
> during the cse pass. The variable p.2_3 gets replaced by a temporary _1 but
> _1 is not any dependent pointer where, p.2_3 was. Is this also not breaking
> any dependencies
>

I'm not sure.  In general CSE can break dependences.  If the dependent
pointer chain needs to conver multiple levels of
indirections from the original atomic operation you need to make sure to
not expose atomics as CSEable.  Thus on
RTL have them all UNSPECs.

Richard.


> -Akshat
>>>
>>>
>>>
>>>
>>> On Wed, Jul 17, 2019 at 4:23 PM Akshat Garg <xkspr7@gmail.com> wrote:
>>>
>>>> On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <jason@redhat.com> wrote:
>>>>
>>>>> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paulmck@linux.ibm.com>
>>>>> wrote:
>>>>> >
>>>>> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>>>>> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com>
>>>>> wrote:
>>>>> > >
>>>>> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>>>>> > > > ramana.gcc@googlemail.com> wrote:
>>>>> > > >
>>>>> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
>>>>> wrote:
>>>>> > > >> >
>>>>> > > >> > As we have some working front-end code for _Dependent_ptr,
>>>>> What should
>>>>> > > >> we do next? What I understand, we can start adding the library
>>>>> for
>>>>> > > >> dependent_ptr and its functions for C corresponding to the ones
>>>>> we created
>>>>> > > >> as C++ template library. Then, after that, we can move on to
>>>>> generating the
>>>>> > > >> assembly code part.
>>>>> > > >> >
>>>>> > > >>
>>>>> > > >>
>>>>> > > >> I think the next step is figuring out how to model the Dependent
>>>>> > > >> pointer information in the IR and figuring out what
>>>>> optimizations to
>>>>> > > >> allow or not with that information. At this point , I suspect
>>>>> we need
>>>>> > > >> a plan on record and have the conversation upstream on the
>>>>> lists.
>>>>> > > >>
>>>>> > > >> I think we need to put down a plan on record.
>>>>> > > >>
>>>>> > > >> Ramana
>>>>> > > >
>>>>> > > > [CCing gcc mailing list]
>>>>> > > >
>>>>> > > > So, shall I start looking over the pointer optimizations only
>>>>> and see what
>>>>> > > > information we may be needed on the same examples in the IR
>>>>> itself?
>>>>> > > >
>>>>> > > > - Akshat
>>>>> > > >
>>>>> > > I have coded an example where equality comparison kills dependency
>>>>> from the
>>>>> > > document P0190R4 as shown below :
>>>>> > >
>>>>> > > 1. struct rcutest rt = {1, 2, 3};
>>>>> > > 2. void thread0 ()
>>>>> > > 3. {
>>>>> > > 4.     rt.a = -42;
>>>>> > > 5.     rt.b = -43;
>>>>> > > 6.     rt.c = -44;
>>>>> > > 7.     rcu_assign_pointer(gp, &rt);
>>>>> > > 8. }
>>>>> > > 9.
>>>>> > > 10. void thread1 ()
>>>>> > > 11. {
>>>>> > > 12.    int i = -1;
>>>>> > > 13.    int j = -1;
>>>>> > > 14.    _Dependent_ptr struct rcutest *p;
>>>>> > > 15.
>>>>> > > 16.    p = rcu_dereference(gp);
>>>>> > > 17.    j = p->a;
>>>>> > > 18.   if (p == &rt)
>>>>> > > 19.        i = p->b;  /*Dependency breaking point*/
>>>>> > > 20.   else if(p)
>>>>> > > 21.       i = p->c;
>>>>> > > 22.   assert(i<0);
>>>>> > > 23.   assert(j<0);
>>>>> > > 24. }
>>>>> > > The gimple unoptimized code produced for lines 17-24 is shown below
>>>>> > >
>>>>> > > 1. if (p_16 == &rt)
>>>>> > > 2.     goto <bb 3>; [INV]
>>>>> > > 3.   else
>>>>> > > 4.    goto <bb 4>; [INV]
>>>>> > > 5.
>>>>> > > 6.  <bb 3> :
>>>>> > > 7.  i_19 = p_16->b;
>>>>> > > 8.  goto <bb 6>; [INV]
>>>>> > > 9.
>>>>> > > 10.  <bb 4> :
>>>>> > > 11.  if (p_16 != 0B)
>>>>> > > 12.    goto <bb 5>; [INV]
>>>>> > > 13.  else
>>>>> > > 14.    goto <bb 6>; [INV]
>>>>> > > 15.
>>>>> > > 16.  <bb 5> :
>>>>> > > 17.  i_18 = p_16->c;
>>>>> > > 18.
>>>>> > > 19.  <bb 6> :
>>>>> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>>>>> > > 21.  _3 = i_7 < 0;
>>>>> > > 22.  _4 = (int) _3;
>>>>> > > 23.  assert (_4);
>>>>> > > 24.  _5 = j_17 < 0;
>>>>> > > 25.  _6 = (int) _5;
>>>>> > > 26.  assert (_6);
>>>>> > > 27.  return;
>>>>> > >
>>>>> > > The optimized code after -O1 is applied for the same lines is hown
>>>>> below :
>>>>> > >
>>>>> > > 1. if (_2 == &rt)
>>>>> > > 2.    goto <bb 3>; [30.00%]
>>>>> > > 3. else
>>>>> > > 4.    goto <bb 4>; [70.00%]
>>>>> > > 5.
>>>>> > > 6.  <bb 3> [local count: 322122547]:
>>>>> > > 7.   i_12 = rt.b;
>>>>> > > 8.   goto <bb 6>; [100.00%]
>>>>> > > 9.
>>>>> > > 10.  <bb 4> [local count: 751619277]:
>>>>> > > 11.   if (_1 != 0)
>>>>> > > 12.   goto <bb 5>; [50.00%]
>>>>> > > 13.   else
>>>>> > > 14.    goto <bb 6>; [50.00%]
>>>>> > > 15.
>>>>> > > 16.  <bb 5> [local count: 375809638]:
>>>>> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>>>>> > > 18.
>>>>> > > 19.   <bb 6> [local count: 1073741824]:
>>>>> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>>>>> > > 21.   _3 = i_7 < 0;
>>>>> > > 22.   _4 = (int) _3;
>>>>> > > 23.   assert (_4);
>>>>> > > 24.  _5 = j_10 < 0;
>>>>> > > 25.  _6 = (int) _5;
>>>>> > > 26.   assert (_6);
>>>>> > > 27.   return;
>>>>> >
>>>>> > Good show on tracing this through!
>>>>> >
>>>>> > > Statement 19 in the program gets converted from  i_19 = p_16->b;
>>>>> in line 7
>>>>> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
>>>>> which
>>>>> > > breaks the dependency chain. We need to figure out the pass that
>>>>> does that
>>>>> > > and put some handling code in there for the _dependent_ptr
>>>>> qualified
>>>>> > > pointers. Passing simply -fipa-pure-const,
>>>>> -fguess-branch-probability or
>>>>> > > any other option alone does not produce the optimized code that
>>>>> breaks the
>>>>> > > dependency. But applying -O1, i.e., allowing all the optimizations
>>>>> does so.
>>>>> > > As passes are applied in a certain order, we need to figure out up
>>>>> to what
>>>>> > > passes, the code remains same and after what pass the dependency
>>>>> does not
>>>>> > > holds. So, we need to check the translated code after every pass.
>>>>> > >
>>>>> > > Does this sounds like a workable plan for ? Let me know your
>>>>> thoughts. If
>>>>> > > this sounds good then, we can do this for all the optimizations
>>>>> that may
>>>>> > > kill the dependencies at somepoint.
>>>>> >
>>>>> > I don't know of a better plan.
>>>>> >
>>>>> > My usual question...  Is there some way to script the checking of the
>>>>> > translated code at the end of each pass?
>>>>>
>>>>> The usual way to check the output of an optimization pass is by
>>>>> dumping the intermediate code at that point and matching the dump
>>>>> against a regexp, as in the tree-ssa directories in the testsuite.
>>>>> -fdump-tree-all will dump after all the gimple optimization passes,
>>>>> and you can look through them until you find the breakage.
>>>>>
>>>>> Jason
>>>>>
>>>> Thanks Jason for the advice, I followed it.
>>>>
>>>> I coded an example shown in figure 26 from here (
>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf).
>>>> The example:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c
>>>> The .objsz1 code:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1
>>>> The .ccp1 code:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1
>>>> The .sra code:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra
>>>> The .dom2 code:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2
>>>>
>>>> Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed after
>>>> constant copy propagation (ccp) pass(
>>>> https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77). In
>>>> .objsz1 file at line 53, the dependencies start with
>>>> p_16 = _14;
>>>> .......
>>>> i_19 = p_16->b;
>>>> ......
>>>> as user specified and after that all the references are made through
>>>> struct p or its variants. But in .ccp1 file at line 41, the "struct p" is
>>>> replaced with variable _2, now, variable _2 starts the dependencies as we
>>>> can see.
>>>> _2 = (atomic struct rcutest *) _1;
>>>> ...........
>>>> i_19 = MEM[(struct rcutest *)_2].b;
>>>> ..........
>>>> I don't know whether it is okay to say at this point, the dependencies
>>>> are still maintained or not. Or we have to pass the dependent pointer
>>>> qualification to new variables that replaces them. Hoping someone could
>>>> clarify.
>>>>
>>>> Also, In .sra file at line 43 in thread1(), the statement is:
>>>>     i_12 = MEM[(struct rcutest *)_2].b;
>>>> In .dom2 file at line 61, the above statement gets converted to:
>>>>     i_12 = rt.b;
>>>> So, I believe that sure does breaks the dependency specified by the
>>>> user. For this, we need to do some modifications in tree-ssa-dom.c file.
>>>> Hope this makes sense.
>>>>
>>>> Thanks,
>>>> Akshat
>>>>
>>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-22  9:46                                         ` Richard Biener
@ 2019-07-23 20:11                                           ` Akshat Garg
  2019-07-23 20:50                                             ` Akshat Garg
  0 siblings, 1 reply; 32+ messages in thread
From: Akshat Garg @ 2019-07-23 20:11 UTC (permalink / raw)
  To: gcc mailing list
  Cc: Paul E. McKenney, Ramana Radhakrishnan, Richard Biener,
	Jason Merrill, Jakub Jelinek

Hi all,

I have tried to make the dependent_ptr qualification act as volatile during
the RTL passes to bypass the RTL optimizations for now. Here is the patch
https://github.com/AKG001/gcc/commit/14c05ae546554f822f667fdb72080b7fe52fea32
For this (https://github.com/AKG001/test/blob/master/dependency_breaking.c)
example, I have been able to get this (
https://github.com/AKG001/test/blob/master/dependency_breaking.c.317r.final)
.final code. I believe that there are no dependencies that are broken.
Hoping someone else could also confirm if I have mistaken somewhere. I have
given the dependent_ptr qualification to only memory type objects for now.
The flag /d represents that this memory represents the dependent pointer in
.final code.
I am checking up the other examples form document P0190R4 to see what other
things need to be done. Let me know what you guys think.

-Akshat

On Mon, Jul 22, 2019 at 3:16 PM Richard Biener <richard.guenther@gmail.com>
wrote:

> On Mon, Jul 22, 2019 at 10:54 AM Akshat Garg <xkspr7@gmail.com> wrote:
>
>> On Mon, Jul 22, 2019 at 2:11 PM Richard Biener <
>> richard.guenther@gmail.com> wrote:
>>
>>> On Mon, Jul 22, 2019 at 2:27 AM Akshat Garg <xkspr7@gmail.com> wrote:
>>>
>>>> Hi all,
>>>> Consider part of an example(figure 20) from doc P0190R4(
>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf)
>>>> shown below:
>>>>
>>>> 1.  void thread1 (void)
>>>> 2.  {
>>>> 3.    int * volatile p;
>>>> 4.    p = rcu_dereference(gip);
>>>> 5.    if (p)
>>>> 6.    assert(*(p+p[0]) == 42);
>>>> 7.  }
>>>> The .gimple code produced is :
>>>>
>>>> 1.  thread1 ()
>>>> 2.  {
>>>> 3.   atomic int * D.1992;
>>>> 4.    int * volatile p;
>>>> 5.  {
>>>> 6.    atomic int * * __atomic_load_ptr;
>>>> 7.   atomic int * __atomic_load_tmp;
>>>> 8.    try
>>>> 9.     {
>>>> 10.        __atomic_load_ptr = &gip;
>>>> 11.        _1 = __atomic_load_8 (__atomic_load_ptr, 1);
>>>> 12.        _2 = (atomic int *) _1;
>>>> 13.        __atomic_load_tmp = _2;
>>>> 14.        D.1992 = __atomic_load_tmp;
>>>> 15.     }
>>>> 16.    finally
>>>> 17.      {
>>>> 18.        __atomic_load_tmp = {CLOBBER};
>>>> 19.      }
>>>> 20.  }
>>>> 21.   p = D.1992;
>>>> 22.   p.2_3 = p;
>>>> 23.  if (p.2_3 != 0B) goto <D.1994>; else goto <D.1995>;
>>>> 24.  <D.1994>:
>>>> 25.   p.3_4 = p;
>>>> 26.  p.4_5 = p;
>>>> 27.   _6 = *p.4_5;
>>>> 28.  _7 = (long unsigned int) _6;
>>>> 29.  _8 = _7 * 4;
>>>> 30.  _9 = p.3_4 + _8;
>>>> 31.  _10 = *_9;
>>>> 32.  _11 = _10 == 42;
>>>> 33.  _12 = (int) _11;
>>>> 34.  assert (_12);
>>>> 35.  <D.1995>:
>>>> 36. }
>>>>
>>>> assert at line 34 in .gimple code still breaks the dependency given by
>>>> the user. I believe, there should be some ssa defined variable of p or p
>>>> itself in assert. This is happening when I am considering pointer as
>>>> volatile qualified. If I consider it as _Dependent_ptr qualified then it
>>>> surely produces the broken dependency code. Let me know, if I wrong
>>>> somewhere.
>>>>
>>>>
>>> p appears as memory here which we load its value to p.3_4 which we then
>>> offset by _8 and load from the
>>> computed address into _10 which then appears in the assert condition.  I
>>> think that's as good as it can
>>> get ...
>>>
>>> Richard.
>>>
>>
>> Thank you for your reply. For, the same example above, consider this (
>> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L402)
>> instruction at rtl level changed form this (
>> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L231)
>> during the cse pass. The variable p.2_3 gets replaced by a temporary _1 but
>> _1 is not any dependent pointer where, p.2_3 was. Is this also not breaking
>> any dependencies
>>
>
> I'm not sure.  In general CSE can break dependences.  If the dependent
> pointer chain needs to conver multiple levels of
> indirections from the original atomic operation you need to make sure to
> not expose atomics as CSEable.  Thus on
> RTL have them all UNSPECs.
>
> Richard.
>
>
>> -Akshat
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jul 17, 2019 at 4:23 PM Akshat Garg <xkspr7@gmail.com> wrote:
>>>>
>>>>> On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <jason@redhat.com> wrote:
>>>>>
>>>>>> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <
>>>>>> paulmck@linux.ibm.com> wrote:
>>>>>> >
>>>>>> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>>>>>> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com>
>>>>>> wrote:
>>>>>> > >
>>>>>> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>>>>>> > > > ramana.gcc@googlemail.com> wrote:
>>>>>> > > >
>>>>>> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xkspr7@gmail.com>
>>>>>> wrote:
>>>>>> > > >> >
>>>>>> > > >> > As we have some working front-end code for _Dependent_ptr,
>>>>>> What should
>>>>>> > > >> we do next? What I understand, we can start adding the library
>>>>>> for
>>>>>> > > >> dependent_ptr and its functions for C corresponding to the
>>>>>> ones we created
>>>>>> > > >> as C++ template library. Then, after that, we can move on to
>>>>>> generating the
>>>>>> > > >> assembly code part.
>>>>>> > > >> >
>>>>>> > > >>
>>>>>> > > >>
>>>>>> > > >> I think the next step is figuring out how to model the
>>>>>> Dependent
>>>>>> > > >> pointer information in the IR and figuring out what
>>>>>> optimizations to
>>>>>> > > >> allow or not with that information. At this point , I suspect
>>>>>> we need
>>>>>> > > >> a plan on record and have the conversation upstream on the
>>>>>> lists.
>>>>>> > > >>
>>>>>> > > >> I think we need to put down a plan on record.
>>>>>> > > >>
>>>>>> > > >> Ramana
>>>>>> > > >
>>>>>> > > > [CCing gcc mailing list]
>>>>>> > > >
>>>>>> > > > So, shall I start looking over the pointer optimizations only
>>>>>> and see what
>>>>>> > > > information we may be needed on the same examples in the IR
>>>>>> itself?
>>>>>> > > >
>>>>>> > > > - Akshat
>>>>>> > > >
>>>>>> > > I have coded an example where equality comparison kills
>>>>>> dependency from the
>>>>>> > > document P0190R4 as shown below :
>>>>>> > >
>>>>>> > > 1. struct rcutest rt = {1, 2, 3};
>>>>>> > > 2. void thread0 ()
>>>>>> > > 3. {
>>>>>> > > 4.     rt.a = -42;
>>>>>> > > 5.     rt.b = -43;
>>>>>> > > 6.     rt.c = -44;
>>>>>> > > 7.     rcu_assign_pointer(gp, &rt);
>>>>>> > > 8. }
>>>>>> > > 9.
>>>>>> > > 10. void thread1 ()
>>>>>> > > 11. {
>>>>>> > > 12.    int i = -1;
>>>>>> > > 13.    int j = -1;
>>>>>> > > 14.    _Dependent_ptr struct rcutest *p;
>>>>>> > > 15.
>>>>>> > > 16.    p = rcu_dereference(gp);
>>>>>> > > 17.    j = p->a;
>>>>>> > > 18.   if (p == &rt)
>>>>>> > > 19.        i = p->b;  /*Dependency breaking point*/
>>>>>> > > 20.   else if(p)
>>>>>> > > 21.       i = p->c;
>>>>>> > > 22.   assert(i<0);
>>>>>> > > 23.   assert(j<0);
>>>>>> > > 24. }
>>>>>> > > The gimple unoptimized code produced for lines 17-24 is shown
>>>>>> below
>>>>>> > >
>>>>>> > > 1. if (p_16 == &rt)
>>>>>> > > 2.     goto <bb 3>; [INV]
>>>>>> > > 3.   else
>>>>>> > > 4.    goto <bb 4>; [INV]
>>>>>> > > 5.
>>>>>> > > 6.  <bb 3> :
>>>>>> > > 7.  i_19 = p_16->b;
>>>>>> > > 8.  goto <bb 6>; [INV]
>>>>>> > > 9.
>>>>>> > > 10.  <bb 4> :
>>>>>> > > 11.  if (p_16 != 0B)
>>>>>> > > 12.    goto <bb 5>; [INV]
>>>>>> > > 13.  else
>>>>>> > > 14.    goto <bb 6>; [INV]
>>>>>> > > 15.
>>>>>> > > 16.  <bb 5> :
>>>>>> > > 17.  i_18 = p_16->c;
>>>>>> > > 18.
>>>>>> > > 19.  <bb 6> :
>>>>>> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>>>>>> > > 21.  _3 = i_7 < 0;
>>>>>> > > 22.  _4 = (int) _3;
>>>>>> > > 23.  assert (_4);
>>>>>> > > 24.  _5 = j_17 < 0;
>>>>>> > > 25.  _6 = (int) _5;
>>>>>> > > 26.  assert (_6);
>>>>>> > > 27.  return;
>>>>>> > >
>>>>>> > > The optimized code after -O1 is applied for the same lines is
>>>>>> hown below :
>>>>>> > >
>>>>>> > > 1. if (_2 == &rt)
>>>>>> > > 2.    goto <bb 3>; [30.00%]
>>>>>> > > 3. else
>>>>>> > > 4.    goto <bb 4>; [70.00%]
>>>>>> > > 5.
>>>>>> > > 6.  <bb 3> [local count: 322122547]:
>>>>>> > > 7.   i_12 = rt.b;
>>>>>> > > 8.   goto <bb 6>; [100.00%]
>>>>>> > > 9.
>>>>>> > > 10.  <bb 4> [local count: 751619277]:
>>>>>> > > 11.   if (_1 != 0)
>>>>>> > > 12.   goto <bb 5>; [50.00%]
>>>>>> > > 13.   else
>>>>>> > > 14.    goto <bb 6>; [50.00%]
>>>>>> > > 15.
>>>>>> > > 16.  <bb 5> [local count: 375809638]:
>>>>>> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>>>>>> > > 18.
>>>>>> > > 19.   <bb 6> [local count: 1073741824]:
>>>>>> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>>>>>> > > 21.   _3 = i_7 < 0;
>>>>>> > > 22.   _4 = (int) _3;
>>>>>> > > 23.   assert (_4);
>>>>>> > > 24.  _5 = j_10 < 0;
>>>>>> > > 25.  _6 = (int) _5;
>>>>>> > > 26.   assert (_6);
>>>>>> > > 27.   return;
>>>>>> >
>>>>>> > Good show on tracing this through!
>>>>>> >
>>>>>> > > Statement 19 in the program gets converted from  i_19 = p_16->b;
>>>>>> in line 7
>>>>>> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
>>>>>> which
>>>>>> > > breaks the dependency chain. We need to figure out the pass that
>>>>>> does that
>>>>>> > > and put some handling code in there for the _dependent_ptr
>>>>>> qualified
>>>>>> > > pointers. Passing simply -fipa-pure-const,
>>>>>> -fguess-branch-probability or
>>>>>> > > any other option alone does not produce the optimized code that
>>>>>> breaks the
>>>>>> > > dependency. But applying -O1, i.e., allowing all the
>>>>>> optimizations does so.
>>>>>> > > As passes are applied in a certain order, we need to figure out
>>>>>> up to what
>>>>>> > > passes, the code remains same and after what pass the dependency
>>>>>> does not
>>>>>> > > holds. So, we need to check the translated code after every pass.
>>>>>> > >
>>>>>> > > Does this sounds like a workable plan for ? Let me know your
>>>>>> thoughts. If
>>>>>> > > this sounds good then, we can do this for all the optimizations
>>>>>> that may
>>>>>> > > kill the dependencies at somepoint.
>>>>>> >
>>>>>> > I don't know of a better plan.
>>>>>> >
>>>>>> > My usual question...  Is there some way to script the checking of
>>>>>> the
>>>>>> > translated code at the end of each pass?
>>>>>>
>>>>>> The usual way to check the output of an optimization pass is by
>>>>>> dumping the intermediate code at that point and matching the dump
>>>>>> against a regexp, as in the tree-ssa directories in the testsuite.
>>>>>> -fdump-tree-all will dump after all the gimple optimization passes,
>>>>>> and you can look through them until you find the breakage.
>>>>>>
>>>>>> Jason
>>>>>>
>>>>> Thanks Jason for the advice, I followed it.
>>>>>
>>>>> I coded an example shown in figure 26 from here (
>>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf).
>>>>> The example:
>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c
>>>>> The .objsz1 code:
>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1
>>>>> The .ccp1 code:
>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1
>>>>> The .sra code:
>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra
>>>>> The .dom2 code:
>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2
>>>>>
>>>>> Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed
>>>>> after constant copy propagation (ccp) pass(
>>>>> https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77). In
>>>>> .objsz1 file at line 53, the dependencies start with
>>>>> p_16 = _14;
>>>>> .......
>>>>> i_19 = p_16->b;
>>>>> ......
>>>>> as user specified and after that all the references are made through
>>>>> struct p or its variants. But in .ccp1 file at line 41, the "struct p" is
>>>>> replaced with variable _2, now, variable _2 starts the dependencies as we
>>>>> can see.
>>>>> _2 = (atomic struct rcutest *) _1;
>>>>> ...........
>>>>> i_19 = MEM[(struct rcutest *)_2].b;
>>>>> ..........
>>>>> I don't know whether it is okay to say at this point, the dependencies
>>>>> are still maintained or not. Or we have to pass the dependent pointer
>>>>> qualification to new variables that replaces them. Hoping someone could
>>>>> clarify.
>>>>>
>>>>> Also, In .sra file at line 43 in thread1(), the statement is:
>>>>>     i_12 = MEM[(struct rcutest *)_2].b;
>>>>> In .dom2 file at line 61, the above statement gets converted to:
>>>>>     i_12 = rt.b;
>>>>> So, I believe that sure does breaks the dependency specified by the
>>>>> user. For this, we need to do some modifications in tree-ssa-dom.c file.
>>>>> Hope this makes sense.
>>>>>
>>>>> Thanks,
>>>>> Akshat
>>>>>
>>>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Doubts regarding the _Dependent_ptr keyword
  2019-07-23 20:11                                           ` Akshat Garg
@ 2019-07-23 20:50                                             ` Akshat Garg
  0 siblings, 0 replies; 32+ messages in thread
From: Akshat Garg @ 2019-07-23 20:50 UTC (permalink / raw)
  To: gcc mailing list
  Cc: Paul E. McKenney, Ramana Radhakrishnan, Richard Biener,
	Jason Merrill, Jakub Jelinek

On Wed, Jul 24, 2019 at 1:41 AM Akshat Garg <xkspr7@gmail.com> wrote:

> Hi all,
>
> I have tried to make the dependent_ptr qualification act as volatile
> during the RTL passes to bypass the RTL optimizations for now. Here is the
> patch
> https://github.com/AKG001/gcc/commit/14c05ae546554f822f667fdb72080b7fe52fea32
> For this (https://github.com/AKG001/test/blob/master/dependency_breaking.c)
> example, I have been able to get this (
> https://github.com/AKG001/test/blob/master/dependency_breaking.c.317r.final)
> .final code. I believe that there are no dependencies that are broken.
> Hoping someone else could also confirm if I have mistaken somewhere. I have
> given the dependent_ptr qualification to only memory type objects for now.
> The flag /d represents that this memory represents the dependent pointer in
> .final code.
> I am checking up the other examples form document P0190R4 to see what
> other things need to be done. Let me know what you guys think.
>
Sorry, I forgot to give the .optimized code for the above example. Here, it
is
https://github.com/AKG001/test/blob/master/dependency_breaking.c.231t.optimize
<https://github.com/AKG001/test/blob/master/dependency_breaking.c.231t.optimized>
May be needed.

>
> -Akshat
>
> On Mon, Jul 22, 2019 at 3:16 PM Richard Biener <richard.guenther@gmail.com>
> wrote:
>
>> On Mon, Jul 22, 2019 at 10:54 AM Akshat Garg <xkspr7@gmail.com> wrote:
>>
>>> On Mon, Jul 22, 2019 at 2:11 PM Richard Biener <
>>> richard.guenther@gmail.com> wrote:
>>>
>>>> On Mon, Jul 22, 2019 at 2:27 AM Akshat Garg <xkspr7@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>> Consider part of an example(figure 20) from doc P0190R4(
>>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf)
>>>>> shown below:
>>>>>
>>>>> 1.  void thread1 (void)
>>>>> 2.  {
>>>>> 3.    int * volatile p;
>>>>> 4.    p = rcu_dereference(gip);
>>>>> 5.    if (p)
>>>>> 6.    assert(*(p+p[0]) == 42);
>>>>> 7.  }
>>>>> The .gimple code produced is :
>>>>>
>>>>> 1.  thread1 ()
>>>>> 2.  {
>>>>> 3.   atomic int * D.1992;
>>>>> 4.    int * volatile p;
>>>>> 5.  {
>>>>> 6.    atomic int * * __atomic_load_ptr;
>>>>> 7.   atomic int * __atomic_load_tmp;
>>>>> 8.    try
>>>>> 9.     {
>>>>> 10.        __atomic_load_ptr = &gip;
>>>>> 11.        _1 = __atomic_load_8 (__atomic_load_ptr, 1);
>>>>> 12.        _2 = (atomic int *) _1;
>>>>> 13.        __atomic_load_tmp = _2;
>>>>> 14.        D.1992 = __atomic_load_tmp;
>>>>> 15.     }
>>>>> 16.    finally
>>>>> 17.      {
>>>>> 18.        __atomic_load_tmp = {CLOBBER};
>>>>> 19.      }
>>>>> 20.  }
>>>>> 21.   p = D.1992;
>>>>> 22.   p.2_3 = p;
>>>>> 23.  if (p.2_3 != 0B) goto <D.1994>; else goto <D.1995>;
>>>>> 24.  <D.1994>:
>>>>> 25.   p.3_4 = p;
>>>>> 26.  p.4_5 = p;
>>>>> 27.   _6 = *p.4_5;
>>>>> 28.  _7 = (long unsigned int) _6;
>>>>> 29.  _8 = _7 * 4;
>>>>> 30.  _9 = p.3_4 + _8;
>>>>> 31.  _10 = *_9;
>>>>> 32.  _11 = _10 == 42;
>>>>> 33.  _12 = (int) _11;
>>>>> 34.  assert (_12);
>>>>> 35.  <D.1995>:
>>>>> 36. }
>>>>>
>>>>> assert at line 34 in .gimple code still breaks the dependency given by
>>>>> the user. I believe, there should be some ssa defined variable of p or p
>>>>> itself in assert. This is happening when I am considering pointer as
>>>>> volatile qualified. If I consider it as _Dependent_ptr qualified then it
>>>>> surely produces the broken dependency code. Let me know, if I wrong
>>>>> somewhere.
>>>>>
>>>>>
>>>> p appears as memory here which we load its value to p.3_4 which we then
>>>> offset by _8 and load from the
>>>> computed address into _10 which then appears in the assert condition.
>>>> I think that's as good as it can
>>>> get ...
>>>>
>>>> Richard.
>>>>
>>>
>>> Thank you for your reply. For, the same example above, consider this (
>>> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L402)
>>> instruction at rtl level changed form this (
>>> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L231)
>>> during the cse pass. The variable p.2_3 gets replaced by a temporary _1 but
>>> _1 is not any dependent pointer where, p.2_3 was. Is this also not breaking
>>> any dependencies
>>>
>>
>> I'm not sure.  In general CSE can break dependences.  If the dependent
>> pointer chain needs to conver multiple levels of
>> indirections from the original atomic operation you need to make sure to
>> not expose atomics as CSEable.  Thus on
>> RTL have them all UNSPECs.
>>
>> Richard.
>>
>>
>>> -Akshat
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 17, 2019 at 4:23 PM Akshat Garg <xkspr7@gmail.com> wrote:
>>>>>
>>>>>> On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <jason@redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <
>>>>>>> paulmck@linux.ibm.com> wrote:
>>>>>>> >
>>>>>>> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>>>>>>> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xkspr7@gmail.com>
>>>>>>> wrote:
>>>>>>> > >
>>>>>>> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>>>>>>> > > > ramana.gcc@googlemail.com> wrote:
>>>>>>> > > >
>>>>>>> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <
>>>>>>> xkspr7@gmail.com> wrote:
>>>>>>> > > >> >
>>>>>>> > > >> > As we have some working front-end code for _Dependent_ptr,
>>>>>>> What should
>>>>>>> > > >> we do next? What I understand, we can start adding the
>>>>>>> library for
>>>>>>> > > >> dependent_ptr and its functions for C corresponding to the
>>>>>>> ones we created
>>>>>>> > > >> as C++ template library. Then, after that, we can move on to
>>>>>>> generating the
>>>>>>> > > >> assembly code part.
>>>>>>> > > >> >
>>>>>>> > > >>
>>>>>>> > > >>
>>>>>>> > > >> I think the next step is figuring out how to model the
>>>>>>> Dependent
>>>>>>> > > >> pointer information in the IR and figuring out what
>>>>>>> optimizations to
>>>>>>> > > >> allow or not with that information. At this point , I suspect
>>>>>>> we need
>>>>>>> > > >> a plan on record and have the conversation upstream on the
>>>>>>> lists.
>>>>>>> > > >>
>>>>>>> > > >> I think we need to put down a plan on record.
>>>>>>> > > >>
>>>>>>> > > >> Ramana
>>>>>>> > > >
>>>>>>> > > > [CCing gcc mailing list]
>>>>>>> > > >
>>>>>>> > > > So, shall I start looking over the pointer optimizations only
>>>>>>> and see what
>>>>>>> > > > information we may be needed on the same examples in the IR
>>>>>>> itself?
>>>>>>> > > >
>>>>>>> > > > - Akshat
>>>>>>> > > >
>>>>>>> > > I have coded an example where equality comparison kills
>>>>>>> dependency from the
>>>>>>> > > document P0190R4 as shown below :
>>>>>>> > >
>>>>>>> > > 1. struct rcutest rt = {1, 2, 3};
>>>>>>> > > 2. void thread0 ()
>>>>>>> > > 3. {
>>>>>>> > > 4.     rt.a = -42;
>>>>>>> > > 5.     rt.b = -43;
>>>>>>> > > 6.     rt.c = -44;
>>>>>>> > > 7.     rcu_assign_pointer(gp, &rt);
>>>>>>> > > 8. }
>>>>>>> > > 9.
>>>>>>> > > 10. void thread1 ()
>>>>>>> > > 11. {
>>>>>>> > > 12.    int i = -1;
>>>>>>> > > 13.    int j = -1;
>>>>>>> > > 14.    _Dependent_ptr struct rcutest *p;
>>>>>>> > > 15.
>>>>>>> > > 16.    p = rcu_dereference(gp);
>>>>>>> > > 17.    j = p->a;
>>>>>>> > > 18.   if (p == &rt)
>>>>>>> > > 19.        i = p->b;  /*Dependency breaking point*/
>>>>>>> > > 20.   else if(p)
>>>>>>> > > 21.       i = p->c;
>>>>>>> > > 22.   assert(i<0);
>>>>>>> > > 23.   assert(j<0);
>>>>>>> > > 24. }
>>>>>>> > > The gimple unoptimized code produced for lines 17-24 is shown
>>>>>>> below
>>>>>>> > >
>>>>>>> > > 1. if (p_16 == &rt)
>>>>>>> > > 2.     goto <bb 3>; [INV]
>>>>>>> > > 3.   else
>>>>>>> > > 4.    goto <bb 4>; [INV]
>>>>>>> > > 5.
>>>>>>> > > 6.  <bb 3> :
>>>>>>> > > 7.  i_19 = p_16->b;
>>>>>>> > > 8.  goto <bb 6>; [INV]
>>>>>>> > > 9.
>>>>>>> > > 10.  <bb 4> :
>>>>>>> > > 11.  if (p_16 != 0B)
>>>>>>> > > 12.    goto <bb 5>; [INV]
>>>>>>> > > 13.  else
>>>>>>> > > 14.    goto <bb 6>; [INV]
>>>>>>> > > 15.
>>>>>>> > > 16.  <bb 5> :
>>>>>>> > > 17.  i_18 = p_16->c;
>>>>>>> > > 18.
>>>>>>> > > 19.  <bb 6> :
>>>>>>> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>>>>>>> > > 21.  _3 = i_7 < 0;
>>>>>>> > > 22.  _4 = (int) _3;
>>>>>>> > > 23.  assert (_4);
>>>>>>> > > 24.  _5 = j_17 < 0;
>>>>>>> > > 25.  _6 = (int) _5;
>>>>>>> > > 26.  assert (_6);
>>>>>>> > > 27.  return;
>>>>>>> > >
>>>>>>> > > The optimized code after -O1 is applied for the same lines is
>>>>>>> hown below :
>>>>>>> > >
>>>>>>> > > 1. if (_2 == &rt)
>>>>>>> > > 2.    goto <bb 3>; [30.00%]
>>>>>>> > > 3. else
>>>>>>> > > 4.    goto <bb 4>; [70.00%]
>>>>>>> > > 5.
>>>>>>> > > 6.  <bb 3> [local count: 322122547]:
>>>>>>> > > 7.   i_12 = rt.b;
>>>>>>> > > 8.   goto <bb 6>; [100.00%]
>>>>>>> > > 9.
>>>>>>> > > 10.  <bb 4> [local count: 751619277]:
>>>>>>> > > 11.   if (_1 != 0)
>>>>>>> > > 12.   goto <bb 5>; [50.00%]
>>>>>>> > > 13.   else
>>>>>>> > > 14.    goto <bb 6>; [50.00%]
>>>>>>> > > 15.
>>>>>>> > > 16.  <bb 5> [local count: 375809638]:
>>>>>>> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>>>>>>> > > 18.
>>>>>>> > > 19.   <bb 6> [local count: 1073741824]:
>>>>>>> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>>>>>>> > > 21.   _3 = i_7 < 0;
>>>>>>> > > 22.   _4 = (int) _3;
>>>>>>> > > 23.   assert (_4);
>>>>>>> > > 24.  _5 = j_10 < 0;
>>>>>>> > > 25.  _6 = (int) _5;
>>>>>>> > > 26.   assert (_6);
>>>>>>> > > 27.   return;
>>>>>>> >
>>>>>>> > Good show on tracing this through!
>>>>>>> >
>>>>>>> > > Statement 19 in the program gets converted from  i_19 = p_16->b;
>>>>>>> in line 7
>>>>>>> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
>>>>>>> which
>>>>>>> > > breaks the dependency chain. We need to figure out the pass that
>>>>>>> does that
>>>>>>> > > and put some handling code in there for the _dependent_ptr
>>>>>>> qualified
>>>>>>> > > pointers. Passing simply -fipa-pure-const,
>>>>>>> -fguess-branch-probability or
>>>>>>> > > any other option alone does not produce the optimized code that
>>>>>>> breaks the
>>>>>>> > > dependency. But applying -O1, i.e., allowing all the
>>>>>>> optimizations does so.
>>>>>>> > > As passes are applied in a certain order, we need to figure out
>>>>>>> up to what
>>>>>>> > > passes, the code remains same and after what pass the dependency
>>>>>>> does not
>>>>>>> > > holds. So, we need to check the translated code after every pass.
>>>>>>> > >
>>>>>>> > > Does this sounds like a workable plan for ? Let me know your
>>>>>>> thoughts. If
>>>>>>> > > this sounds good then, we can do this for all the optimizations
>>>>>>> that may
>>>>>>> > > kill the dependencies at somepoint.
>>>>>>> >
>>>>>>> > I don't know of a better plan.
>>>>>>> >
>>>>>>> > My usual question...  Is there some way to script the checking of
>>>>>>> the
>>>>>>> > translated code at the end of each pass?
>>>>>>>
>>>>>>> The usual way to check the output of an optimization pass is by
>>>>>>> dumping the intermediate code at that point and matching the dump
>>>>>>> against a regexp, as in the tree-ssa directories in the testsuite.
>>>>>>> -fdump-tree-all will dump after all the gimple optimization passes,
>>>>>>> and you can look through them until you find the breakage.
>>>>>>>
>>>>>>> Jason
>>>>>>>
>>>>>> Thanks Jason for the advice, I followed it.
>>>>>>
>>>>>> I coded an example shown in figure 26 from here (
>>>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf).
>>>>>> The example:
>>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c
>>>>>> The .objsz1 code:
>>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1
>>>>>> The .ccp1 code:
>>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1
>>>>>> The .sra code:
>>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra
>>>>>> The .dom2 code:
>>>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2
>>>>>>
>>>>>> Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed
>>>>>> after constant copy propagation (ccp) pass(
>>>>>> https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77).
>>>>>> In .objsz1 file at line 53, the dependencies start with
>>>>>> p_16 = _14;
>>>>>> .......
>>>>>> i_19 = p_16->b;
>>>>>> ......
>>>>>> as user specified and after that all the references are made through
>>>>>> struct p or its variants. But in .ccp1 file at line 41, the "struct p" is
>>>>>> replaced with variable _2, now, variable _2 starts the dependencies as we
>>>>>> can see.
>>>>>> _2 = (atomic struct rcutest *) _1;
>>>>>> ...........
>>>>>> i_19 = MEM[(struct rcutest *)_2].b;
>>>>>> ..........
>>>>>> I don't know whether it is okay to say at this point, the
>>>>>> dependencies are still maintained or not. Or we have to pass the dependent
>>>>>> pointer qualification to new variables that replaces them. Hoping someone
>>>>>> could clarify.
>>>>>>
>>>>>> Also, In .sra file at line 43 in thread1(), the statement is:
>>>>>>     i_12 = MEM[(struct rcutest *)_2].b;
>>>>>> In .dom2 file at line 61, the above statement gets converted to:
>>>>>>     i_12 = rt.b;
>>>>>> So, I believe that sure does breaks the dependency specified by the
>>>>>> user. For this, we need to do some modifications in tree-ssa-dom.c file.
>>>>>> Hope this makes sense.
>>>>>>
>>>>>> Thanks,
>>>>>> Akshat
>>>>>>
>>>>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2019-07-23 20:50 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAMEQ7Yw5mV+zoDV-kmOnWoy4+_CiDeUyJmipkkrTGhxninp3Hw@mail.gmail.com>
     [not found] ` <20190617023256.GJ26519@linux.ibm.com>
     [not found]   ` <CAMEQ7Yxp-o+4dPd_o=eaPYqpV1n7KfqMS_Un1U1n9-ut_HK_uw@mail.gmail.com>
     [not found]     ` <20190618211927.GY26519@linux.ibm.com>
     [not found]       ` <CAJA7tRYJPaFuvR+bC40kGdKOLVTHEFnD1SkBqF6oUcpv-bLMbw@mail.gmail.com>
     [not found]         ` <CAMEQ7YyN72aMAEsuKW4GQua_Vyi8Wd6F6BKcTo8dS_JcLm2TvQ@mail.gmail.com>
     [not found]           ` <CAMEQ7YxT4mwNvgr_iqxKAJE=yeuOmhcNkHL1ET-84hwv9uMwdQ@mail.gmail.com>
     [not found]             ` <CAJA7tRZVhVtabJLgWJ3BK3HhR2pf+X0fiG+sF=y2Y=PkWg8eTQ@mail.gmail.com>
     [not found]               ` <CAMEQ7YyLp0YymRiYD4O_1A=YswbbDBh0gQz9kGtw4t8mfvU4Jg@mail.gmail.com>
     [not found]                 ` <20190619164145.GJ26519@linux.ibm.com>
     [not found]                   ` <20190620140600.GA15142@linux.ibm.com>
     [not found]                     ` <CAMEQ7Yx=dg0TWcXKA8eZk=RCN+az116Y9whKGQSB3fRVZwbvQw@mail.gmail.com>
     [not found]                       ` <CAJA7tRbj6eoOj_fr+Nnm1rS=kdc0jPxkb77AoPCFrJEaaPsCSA@mail.gmail.com>
2019-06-25 16:20                         ` Doubts regarding the _Dependent_ptr keyword Akshat Garg
2019-07-02  0:29                           ` Akshat Garg
2019-07-02  0:59                             ` Paul E. McKenney
2019-07-02  1:42                               ` nick
2019-07-02 15:36                               ` Jason Merrill
2019-07-02 17:53                                 ` Richard Biener
2019-07-03 15:09                                   ` Paul E. McKenney
2019-07-02 19:37                                 ` Akshat Garg
2019-07-17 10:54                                 ` Akshat Garg
2019-07-22  0:27                                   ` Akshat Garg
2019-07-22  8:41                                     ` Richard Biener
2019-07-22  8:54                                       ` Akshat Garg
2019-07-22  9:46                                         ` Richard Biener
2019-07-23 20:11                                           ` Akshat Garg
2019-07-23 20:50                                             ` Akshat Garg
2019-07-02 10:22                             ` Ramana Radhakrishnan
2019-07-02 10:39                               ` Akshat Garg
2019-07-02 11:01                                 ` Ramana Radhakrishnan
2019-07-02 12:38                                   ` Paul E. McKenney
2019-07-02 13:16                                     ` Ramana Radhakrishnan
2019-07-02 15:09                                       ` Paul E. McKenney
2019-07-02 19:09                                         ` Akshat Garg
2019-07-03 15:16                                           ` Paul E. McKenney
2019-07-03 15:48                                             ` Richard Biener
2019-07-03 16:34                                               ` Paul E. McKenney
2019-07-04 11:00                                                 ` Richard Biener
2019-07-04 17:40                                                   ` Paul E. McKenney
2019-07-04 18:09                                                     ` Jakub Jelinek
2019-07-04 23:08                                                       ` Akshat Garg
2019-07-05 11:20                                                         ` Richard Biener
2019-07-05 15:38                                                           ` Akshat Garg
     [not found]                                                         ` <20190705014806.GM26519@linux.ibm.com>
     [not found]                                                           ` <CAMEQ7YyrjkGVhnLUCXMrbyicPkRZkpZvH_3KQ56cNh_6+Tr4iQ@mail.gmail.com>
     [not found]                                                             ` <CAMEQ7YzOqU68kQN99g_zhwJUkNaJa9q_Dv9pue+4NFK4rYubFg@mail.gmail.com>
     [not found]                                                               ` <20190707141928.GE26519@linux.ibm.com>
     [not found]                                                                 ` <CAMEQ7YyuzvN_OzU4oMJGaFMBpnQwkCt4=ZTo-AmSmvsr2dBOMw@mail.gmail.com>
2019-07-07 21:44                                                                   ` Akshat Garg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).