From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 126571 invoked by alias); 23 Jul 2019 20:11:34 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 126418 invoked by uid 89); 23 Jul 2019 20:11:34 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=no version=3.3.1 spammy=guys, HX-Received:6808, H*f:CADzB X-HELO: mail-oi1-f195.google.com Received: from mail-oi1-f195.google.com (HELO mail-oi1-f195.google.com) (209.85.167.195) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 23 Jul 2019 20:11:30 +0000 Received: by mail-oi1-f195.google.com with SMTP id e189so33187416oib.11 for ; Tue, 23 Jul 2019 13:11:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=8UJ8efXWcgrZx3BB891FGmqUKQXkSErTkgOXXPB7e9Y=; b=S7xT4Z52XEEvqIfnNVsvpfSWjzZ3iwfEvyH9qiW79vMprIFyMjh4QAAXH+txf2Y/7/ lzZRl4ZfqIR/io5WqyVEyExxoaUfmJsv8rmSbXYjr7HWMDcIeTDqNhPFdyCtOiw4EgzI SQIYQhgnnZu8Sem2gLqLg8/FweRTrf6Yz33tKmnTmHpiKL/usM9B2gJEsXbkAfD+L+5U iGPYHD+SgDKK3W9HZND+FSjN6sfSZP3vreqvqheoSSri87VzfPhqOf5GjlSP0GSl1Fi4 jM0uFM3iHbYUqdAgaQZjTH20BZzz45GXYjQdm8LoXBshjunsZYA9uVCmWJ5AL31ySIy3 c9HA== MIME-Version: 1.0 References: <20190619164145.GJ26519@linux.ibm.com> <20190620140600.GA15142@linux.ibm.com> <20190702005935.GB26519@linux.ibm.com> In-Reply-To: From: Akshat Garg Date: Tue, 23 Jul 2019 20:11:00 -0000 Message-ID: Subject: Re: Doubts regarding the _Dependent_ptr keyword To: gcc mailing list Cc: "Paul E. McKenney" , Ramana Radhakrishnan , Richard Biener , Jason Merrill , Jakub Jelinek Content-Type: text/plain; charset="UTF-8" X-SW-Source: 2019-07/txt/msg00164.txt.bz2 Hi all, I have tried to make the dependent_ptr qualification act as volatile during the RTL passes to bypass the RTL optimizations for now. Here is the patch https://github.com/AKG001/gcc/commit/14c05ae546554f822f667fdb72080b7fe52fea32 For this (https://github.com/AKG001/test/blob/master/dependency_breaking.c) example, I have been able to get this ( https://github.com/AKG001/test/blob/master/dependency_breaking.c.317r.final) .final code. I believe that there are no dependencies that are broken. Hoping someone else could also confirm if I have mistaken somewhere. I have given the dependent_ptr qualification to only memory type objects for now. The flag /d represents that this memory represents the dependent pointer in .final code. I am checking up the other examples form document P0190R4 to see what other things need to be done. Let me know what you guys think. -Akshat On Mon, Jul 22, 2019 at 3:16 PM Richard Biener wrote: > On Mon, Jul 22, 2019 at 10:54 AM Akshat Garg wrote: > >> On Mon, Jul 22, 2019 at 2:11 PM Richard Biener < >> richard.guenther@gmail.com> wrote: >> >>> On Mon, Jul 22, 2019 at 2:27 AM Akshat Garg wrote: >>> >>>> Hi all, >>>> Consider part of an example(figure 20) from doc P0190R4( >>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf) >>>> shown below: >>>> >>>> 1. void thread1 (void) >>>> 2. { >>>> 3. int * volatile p; >>>> 4. p = rcu_dereference(gip); >>>> 5. if (p) >>>> 6. assert(*(p+p[0]) == 42); >>>> 7. } >>>> The .gimple code produced is : >>>> >>>> 1. thread1 () >>>> 2. { >>>> 3. atomic int * D.1992; >>>> 4. int * volatile p; >>>> 5. { >>>> 6. atomic int * * __atomic_load_ptr; >>>> 7. atomic int * __atomic_load_tmp; >>>> 8. try >>>> 9. { >>>> 10. __atomic_load_ptr = &gip; >>>> 11. _1 = __atomic_load_8 (__atomic_load_ptr, 1); >>>> 12. _2 = (atomic int *) _1; >>>> 13. __atomic_load_tmp = _2; >>>> 14. D.1992 = __atomic_load_tmp; >>>> 15. } >>>> 16. finally >>>> 17. { >>>> 18. __atomic_load_tmp = {CLOBBER}; >>>> 19. } >>>> 20. } >>>> 21. p = D.1992; >>>> 22. p.2_3 = p; >>>> 23. if (p.2_3 != 0B) goto ; else goto ; >>>> 24. : >>>> 25. p.3_4 = p; >>>> 26. p.4_5 = p; >>>> 27. _6 = *p.4_5; >>>> 28. _7 = (long unsigned int) _6; >>>> 29. _8 = _7 * 4; >>>> 30. _9 = p.3_4 + _8; >>>> 31. _10 = *_9; >>>> 32. _11 = _10 == 42; >>>> 33. _12 = (int) _11; >>>> 34. assert (_12); >>>> 35. : >>>> 36. } >>>> >>>> assert at line 34 in .gimple code still breaks the dependency given by >>>> the user. I believe, there should be some ssa defined variable of p or p >>>> itself in assert. This is happening when I am considering pointer as >>>> volatile qualified. If I consider it as _Dependent_ptr qualified then it >>>> surely produces the broken dependency code. Let me know, if I wrong >>>> somewhere. >>>> >>>> >>> p appears as memory here which we load its value to p.3_4 which we then >>> offset by _8 and load from the >>> computed address into _10 which then appears in the assert condition. I >>> think that's as good as it can >>> get ... >>> >>> Richard. >>> >> >> Thank you for your reply. For, the same example above, consider this ( >> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L402) >> instruction at rtl level changed form this ( >> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L231) >> during the cse pass. The variable p.2_3 gets replaced by a temporary _1 but >> _1 is not any dependent pointer where, p.2_3 was. Is this also not breaking >> any dependencies >> > > I'm not sure. In general CSE can break dependences. If the dependent > pointer chain needs to conver multiple levels of > indirections from the original atomic operation you need to make sure to > not expose atomics as CSEable. Thus on > RTL have them all UNSPECs. > > Richard. > > >> -Akshat >>>> >>>> >>>> >>>> >>>> On Wed, Jul 17, 2019 at 4:23 PM Akshat Garg wrote: >>>> >>>>> On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill wrote: >>>>> >>>>>> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney < >>>>>> paulmck@linux.ibm.com> wrote: >>>>>> > >>>>>> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote: >>>>>> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg >>>>>> wrote: >>>>>> > > >>>>>> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan < >>>>>> > > > ramana.gcc@googlemail.com> wrote: >>>>>> > > > >>>>>> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg >>>>>> wrote: >>>>>> > > >> > >>>>>> > > >> > As we have some working front-end code for _Dependent_ptr, >>>>>> What should >>>>>> > > >> we do next? What I understand, we can start adding the library >>>>>> for >>>>>> > > >> dependent_ptr and its functions for C corresponding to the >>>>>> ones we created >>>>>> > > >> as C++ template library. Then, after that, we can move on to >>>>>> generating the >>>>>> > > >> assembly code part. >>>>>> > > >> > >>>>>> > > >> >>>>>> > > >> >>>>>> > > >> I think the next step is figuring out how to model the >>>>>> Dependent >>>>>> > > >> pointer information in the IR and figuring out what >>>>>> optimizations to >>>>>> > > >> allow or not with that information. At this point , I suspect >>>>>> we need >>>>>> > > >> a plan on record and have the conversation upstream on the >>>>>> lists. >>>>>> > > >> >>>>>> > > >> I think we need to put down a plan on record. >>>>>> > > >> >>>>>> > > >> Ramana >>>>>> > > > >>>>>> > > > [CCing gcc mailing list] >>>>>> > > > >>>>>> > > > So, shall I start looking over the pointer optimizations only >>>>>> and see what >>>>>> > > > information we may be needed on the same examples in the IR >>>>>> itself? >>>>>> > > > >>>>>> > > > - Akshat >>>>>> > > > >>>>>> > > I have coded an example where equality comparison kills >>>>>> dependency from the >>>>>> > > document P0190R4 as shown below : >>>>>> > > >>>>>> > > 1. struct rcutest rt = {1, 2, 3}; >>>>>> > > 2. void thread0 () >>>>>> > > 3. { >>>>>> > > 4. rt.a = -42; >>>>>> > > 5. rt.b = -43; >>>>>> > > 6. rt.c = -44; >>>>>> > > 7. rcu_assign_pointer(gp, &rt); >>>>>> > > 8. } >>>>>> > > 9. >>>>>> > > 10. void thread1 () >>>>>> > > 11. { >>>>>> > > 12. int i = -1; >>>>>> > > 13. int j = -1; >>>>>> > > 14. _Dependent_ptr struct rcutest *p; >>>>>> > > 15. >>>>>> > > 16. p = rcu_dereference(gp); >>>>>> > > 17. j = p->a; >>>>>> > > 18. if (p == &rt) >>>>>> > > 19. i = p->b; /*Dependency breaking point*/ >>>>>> > > 20. else if(p) >>>>>> > > 21. i = p->c; >>>>>> > > 22. assert(i<0); >>>>>> > > 23. assert(j<0); >>>>>> > > 24. } >>>>>> > > The gimple unoptimized code produced for lines 17-24 is shown >>>>>> below >>>>>> > > >>>>>> > > 1. if (p_16 == &rt) >>>>>> > > 2. goto ; [INV] >>>>>> > > 3. else >>>>>> > > 4. goto ; [INV] >>>>>> > > 5. >>>>>> > > 6. : >>>>>> > > 7. i_19 = p_16->b; >>>>>> > > 8. goto ; [INV] >>>>>> > > 9. >>>>>> > > 10. : >>>>>> > > 11. if (p_16 != 0B) >>>>>> > > 12. goto ; [INV] >>>>>> > > 13. else >>>>>> > > 14. goto ; [INV] >>>>>> > > 15. >>>>>> > > 16. : >>>>>> > > 17. i_18 = p_16->c; >>>>>> > > 18. >>>>>> > > 19. : >>>>>> > > 20. # i_7 = PHI >>>>>> > > 21. _3 = i_7 < 0; >>>>>> > > 22. _4 = (int) _3; >>>>>> > > 23. assert (_4); >>>>>> > > 24. _5 = j_17 < 0; >>>>>> > > 25. _6 = (int) _5; >>>>>> > > 26. assert (_6); >>>>>> > > 27. return; >>>>>> > > >>>>>> > > The optimized code after -O1 is applied for the same lines is >>>>>> hown below : >>>>>> > > >>>>>> > > 1. if (_2 == &rt) >>>>>> > > 2. goto ; [30.00%] >>>>>> > > 3. else >>>>>> > > 4. goto ; [70.00%] >>>>>> > > 5. >>>>>> > > 6. [local count: 322122547]: >>>>>> > > 7. i_12 = rt.b; >>>>>> > > 8. goto ; [100.00%] >>>>>> > > 9. >>>>>> > > 10. [local count: 751619277]: >>>>>> > > 11. if (_1 != 0) >>>>>> > > 12. goto ; [50.00%] >>>>>> > > 13. else >>>>>> > > 14. goto ; [50.00%] >>>>>> > > 15. >>>>>> > > 16. [local count: 375809638]: >>>>>> > > 17. i_11 = MEM[(dependent_ptr struct rcutest *)_2].c; >>>>>> > > 18. >>>>>> > > 19. [local count: 1073741824]: >>>>>> > > 20. # i_7 = PHI >>>>>> > > 21. _3 = i_7 < 0; >>>>>> > > 22. _4 = (int) _3; >>>>>> > > 23. assert (_4); >>>>>> > > 24. _5 = j_10 < 0; >>>>>> > > 25. _6 = (int) _5; >>>>>> > > 26. assert (_6); >>>>>> > > 27. return; >>>>>> > >>>>>> > Good show on tracing this through! >>>>>> > >>>>>> > > Statement 19 in the program gets converted from i_19 = p_16->b; >>>>>> in line 7 >>>>>> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code >>>>>> which >>>>>> > > breaks the dependency chain. We need to figure out the pass that >>>>>> does that >>>>>> > > and put some handling code in there for the _dependent_ptr >>>>>> qualified >>>>>> > > pointers. Passing simply -fipa-pure-const, >>>>>> -fguess-branch-probability or >>>>>> > > any other option alone does not produce the optimized code that >>>>>> breaks the >>>>>> > > dependency. But applying -O1, i.e., allowing all the >>>>>> optimizations does so. >>>>>> > > As passes are applied in a certain order, we need to figure out >>>>>> up to what >>>>>> > > passes, the code remains same and after what pass the dependency >>>>>> does not >>>>>> > > holds. So, we need to check the translated code after every pass. >>>>>> > > >>>>>> > > Does this sounds like a workable plan for ? Let me know your >>>>>> thoughts. If >>>>>> > > this sounds good then, we can do this for all the optimizations >>>>>> that may >>>>>> > > kill the dependencies at somepoint. >>>>>> > >>>>>> > I don't know of a better plan. >>>>>> > >>>>>> > My usual question... Is there some way to script the checking of >>>>>> the >>>>>> > translated code at the end of each pass? >>>>>> >>>>>> The usual way to check the output of an optimization pass is by >>>>>> dumping the intermediate code at that point and matching the dump >>>>>> against a regexp, as in the tree-ssa directories in the testsuite. >>>>>> -fdump-tree-all will dump after all the gimple optimization passes, >>>>>> and you can look through them until you find the breakage. >>>>>> >>>>>> Jason >>>>>> >>>>> Thanks Jason for the advice, I followed it. >>>>> >>>>> I coded an example shown in figure 26 from here ( >>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf). >>>>> The example: >>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c >>>>> The .objsz1 code: >>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1 >>>>> The .ccp1 code: >>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1 >>>>> The .sra code: >>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra >>>>> The .dom2 code: >>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2 >>>>> >>>>> Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed >>>>> after constant copy propagation (ccp) pass( >>>>> https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77). In >>>>> .objsz1 file at line 53, the dependencies start with >>>>> p_16 = _14; >>>>> ....... >>>>> i_19 = p_16->b; >>>>> ...... >>>>> as user specified and after that all the references are made through >>>>> struct p or its variants. But in .ccp1 file at line 41, the "struct p" is >>>>> replaced with variable _2, now, variable _2 starts the dependencies as we >>>>> can see. >>>>> _2 = (atomic struct rcutest *) _1; >>>>> ........... >>>>> i_19 = MEM[(struct rcutest *)_2].b; >>>>> .......... >>>>> I don't know whether it is okay to say at this point, the dependencies >>>>> are still maintained or not. Or we have to pass the dependent pointer >>>>> qualification to new variables that replaces them. Hoping someone could >>>>> clarify. >>>>> >>>>> Also, In .sra file at line 43 in thread1(), the statement is: >>>>> i_12 = MEM[(struct rcutest *)_2].b; >>>>> In .dom2 file at line 61, the above statement gets converted to: >>>>> i_12 = rt.b; >>>>> So, I believe that sure does breaks the dependency specified by the >>>>> user. For this, we need to do some modifications in tree-ssa-dom.c file. >>>>> Hope this makes sense. >>>>> >>>>> Thanks, >>>>> Akshat >>>>> >>>>