Re: move increase_alignment from simple to regular ipa pass

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: Richard Biener <rguenther@suse.de>,
	David Edelsohn <dje.gcc@gmail.com>,
		GCC Patches <gcc-patches@gcc.gnu.org>,
		"William J. Schmidt" <wschmidt@linux.vnet.ibm.com>,
		Segher Boessenkool <segher@kernel.crashing.org>
Subject: Re: move increase_alignment from simple to regular ipa pass
Date: Tue, 05 Jul 2016 09:53:00 -0000	[thread overview]
Message-ID: <CAAgBjMnJbw7q4Mt=G5DNPCQcO61VUNFhUMaMFBiw24cv1_hUYw@mail.gmail.com> (raw)
In-Reply-To: <CAAgBjMkpmiFEdiJ+Pk_ADPeFEzsbtDwxRtcorzePpUVe3=ABjA@mail.gmail.com>

ping * 2 ping https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01703.html

Thanks,
Prathamesh

On 28 June 2016 at 14:49, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
> ping https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01703.html
>
> Thanks,
> Prathamesh
>
> On 23 June 2016 at 22:51, Prathamesh Kulkarni
> <prathamesh.kulkarni@linaro.org> wrote:
>> On 17 June 2016 at 19:52, Prathamesh Kulkarni
>> <prathamesh.kulkarni@linaro.org> wrote:
>>> On 14 June 2016 at 18:31, Prathamesh Kulkarni
>>> <prathamesh.kulkarni@linaro.org> wrote:
>>>> On 13 June 2016 at 16:13, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>>>> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
>>>>>> index ecafe63..41ac408 100644
>>>>>> --- a/gcc/cgraph.h
>>>>>> +++ b/gcc/cgraph.h
>>>>>> @@ -1874,6 +1874,9 @@ public:
>>>>>>       if we did not do any inter-procedural code movement.  */
>>>>>>    unsigned used_by_single_function : 1;
>>>>>>
>>>>>> +  /* Set if -fsection-anchors is set.  */
>>>>>> +  unsigned section_anchor : 1;
>>>>>> +
>>>>>>  private:
>>>>>>    /* Assemble thunks and aliases associated to varpool node.  */
>>>>>>    void assemble_aliases (void);
>>>>>> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
>>>>>> index 4bfcad7..e75d5c0 100644
>>>>>> --- a/gcc/cgraphunit.c
>>>>>> +++ b/gcc/cgraphunit.c
>>>>>> @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl)
>>>>>>       it is available to notice_global_symbol.  */
>>>>>>    node->definition = true;
>>>>>>    notice_global_symbol (decl);
>>>>>> +
>>>>>> +  node->section_anchor = flag_section_anchors;
>>>>>> +
>>>>>>    if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl)
>>>>>>        /* Traditionally we do not eliminate static variables when not
>>>>>>        optimizing and when not doing toplevel reoder.  */
>>>>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>>>>> index f0d7196..e497795 100644
>>>>>> --- a/gcc/common.opt
>>>>>> +++ b/gcc/common.opt
>>>>>> @@ -1590,6 +1590,10 @@ fira-algorithm=
>>>>>>  Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization
>>>>>>  -fira-algorithm=[CB|priority] Set the used IRA algorithm.
>>>>>>
>>>>>> +fipa-increase_alignment
>>>>>> +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization
>>>>>> +Option to gate increase_alignment ipa pass.
>>>>>> +
>>>>>>  Enum
>>>>>>  Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA algorithm %qs)
>>>>>>
>>>>>> @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) Init(1) Optimization
>>>>>>  Enable the dependent count heuristic in the scheduler.
>>>>>>
>>>>>>  fsection-anchors
>>>>>> -Common Report Var(flag_section_anchors) Optimization
>>>>>> +Common Report Var(flag_section_anchors)
>>>>>>  Access data in the same section from shared anchor points.
>>>>>>
>>>>>>  fsee
>>>>>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>>>>>> index a0db3a4..1482566 100644
>>>>>> --- a/gcc/config/aarch64/aarch64.c
>>>>>> +++ b/gcc/config/aarch64/aarch64.c
>>>>>> @@ -8252,6 +8252,8 @@ aarch64_override_options (void)
>>>>>>
>>>>>>    aarch64_register_fma_steering ();
>>>>>>
>>>>>> +  /* Enable increase_alignment pass.  */
>>>>>> +  flag_ipa_increase_alignment = 1;
>>>>>
>>>>> I would rather enable it always on targets that do support anchors.
>>>> AFAIK aarch64 supports section anchors.
>>>>>> diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
>>>>>> index ce9e146..7f09f3a 100644
>>>>>> --- a/gcc/lto/lto-symtab.c
>>>>>> +++ b/gcc/lto/lto-symtab.c
>>>>>> @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, symtab_node *entry)
>>>>>>       The type compatibility checks or the completing of types has properly
>>>>>>       dealt with most issues.  */
>>>>>>
>>>>>> +  /* ??? is this assert necessary ?  */
>>>>>> +  varpool_node *v_prevailing = dyn_cast<varpool_node *> (prevailing);
>>>>>> +  varpool_node *v_entry = dyn_cast<varpool_node *> (entry);
>>>>>> +  gcc_assert (v_prevailing && v_entry);
>>>>>> +  /* section_anchor of prevailing_decl wins.  */
>>>>>> +  v_entry->section_anchor = v_prevailing->section_anchor;
>>>>>> +
>>>>> Other flags are merged in lto_varpool_replace_node so please move this there.
>>>> Ah indeed, thanks for the pointers.
>>>> I wonder though if we need to set
>>>> prevailing_node->section_anchor = vnode->section_anchor ?
>>>> IIUC, the function merges flags from vnode into prevailing_node
>>>> and removes vnode. However we want prevailing_node->section_anchor
>>>> to always take precedence.
>>>>>> +/* Return true if alignment should be increased for this vnode.
>>>>>> +   This is done if every function that references/referring to vnode
>>>>>> +   has flag_tree_loop_vectorize set.  */
>>>>>> +
>>>>>> +static bool
>>>>>> +increase_alignment_p (varpool_node *vnode)
>>>>>> +{
>>>>>> +  ipa_ref *ref;
>>>>>> +
>>>>>> +  for (int i = 0; vnode->iterate_reference (i, ref); i++)
>>>>>> +    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referred))
>>>>>> +      {
>>>>>> +     struct cl_optimization *opts = opts_for_fn (cnode->decl);
>>>>>> +     if (!opts->x_flag_tree_loop_vectorize)
>>>>>> +       return false;
>>>>>> +      }
>>>>>
>>>>> If you take address of function that has vectorizer enabled probably doesn't
>>>>> imply need to increase alignment of that var. So please drop the loop.
>>>>>
>>>>> You only want function that read/writes or takes address of the symbol. But
>>>>> onthe other hand, you need to walk all aliases of the symbol by
>>>>> call_for_symbol_and_aliases
>>>>>> +
>>>>>> +  for (int i = 0; vnode->iterate_referring (i, ref); i++)
>>>>>> +    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referring))
>>>>>> +      {
>>>>>> +     struct cl_optimization *opts = opts_for_fn (cnode->decl);
>>>>>> +     if (!opts->x_flag_tree_loop_vectorize)
>>>>>> +       return false;
>>>>>> +      }
>>>>>> +
>>>>>> +  return true;
>>>>>> +}
>>>>>> +
>>>>>>  /* Entry point to increase_alignment pass.  */
>>>>>>  static unsigned int
>>>>>>  increase_alignment (void)
>>>>>> @@ -914,9 +942,12 @@ increase_alignment (void)
>>>>>>        tree decl = vnode->decl;
>>>>>>        unsigned int alignment;
>>>>>>
>>>>>> -      if ((decl_in_symtab_p (decl)
>>>>>> -       && !symtab_node::get (decl)->can_increase_alignment_p ())
>>>>>> -       || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl))
>>>>>> +      if (!vnode->section_anchor
>>>>>> +       || (decl_in_symtab_p (decl)
>>>>>> +           && !symtab_node::get (decl)->can_increase_alignment_p ())
>>>>>> +       || DECL_USER_ALIGN (decl)
>>>>>> +       || DECL_ARTIFICIAL (decl)
>>>>>> +       || !increase_alignment_p (vnode))
>>>>>
>>>>> Incrementally we probably should do more testing whether the variable looks like
>>>>> someting that can be vectorized, i.e. it contains array, has address taken or the
>>>>> accesses are array accesses within loop.
>>>>> This can be done by the analysis phase of the IPA pass inspecting the function
>>>>> bodies.
>>>> Thanks, I will try to check for array accesses are within a loop in
>>>> followup patch.
>>>> I was wondering if we could we treat a homogeneous global struct
>>>> (having members of one type),
>>>> as a global array of that type and increase it's alignment if required ?
>>>>>
>>>>> I think it is important waste to bump up everything including error messages etc.
>>>>> At least on i386 the effect on firefox datasegment of various alignment setting is
>>>>> very visible.
>>>> Um for a start, would it be OK to check if all functions referencing variable
>>>> have attribute noreturn, and in that case we skip increasing the alignment ?
>>>> I suppose that error functions would be having attribute noreturn set ?
>>>>>
>>>>> Looks OK to me otherwise. please send updated patch.
>>>> I have done the changes in the attached patch (stage-1 built).
>>>> I am not sure what to return from the callback function and
>>>> arbitrarily chose to return true.
>>> Hi,
>>> In this version (stage-1 built), I added read/write summaries which
>>> stream those variables
>>> which we want to increase alignment for.
>>>
>>> I have a few questions:
>>>
>>> a) To check if global var is used inside a loop, I obtained
>>> corresponding ipa_ref
>>> and checked loop_containing_stmt (ref->stmt), however it returned non-NULL
>>> even when ref->stmt was not inside a loop.
>>> for instance:
>>> int a[32];
>>> int f() { int x = a[0]; return x; }
>>> Um how to check if stmt is used inside a loop ?
>>>
>>> b) Is it necessary during WPA to check if function has
>>> flag_tree_vectorize_set since
>>> during analysis phase in increase_alignment_generate_summary() I check
>>> if cnode has flag_tree_loop_vectorize_set ?
>>>
>>> c) In LTO_increase_alignment_section, the following is streamed:
>>> i) a "count" of type uwhi, to represent number of variables
>>> ii) decls corresponding to the variables
>>> The variables are then obtained during read_summay using
>>> symtab_node::get_create().
>>> I suppose since decls for varpool_nodes would already be streamed in
>>> LTO_section_decls, I was wondering if I
>>> could somehow refer to the decls in that section to avoid duplicate
>>> streaming of decls ?
>> Hi,
>> In this version, the variable is streamed if it occurs within a loop
>> or it's address is taken.
>> To check if stmt is inside a loop I am using:
>>
>> struct loop *l = loop_containing_stmt (ref->stmt);
>> if (l != DECL_STRUCT_FUNCTION (cnode->decl)->x_current_loops->tree_root)
>>   vars->add (vnode);
>> Is this correct ?
>>
>> I came across an unexpected issue in my previous patch with -ffat-lto-objects.
>> I am allocating vars = new hash_set<varpool_node *> () in
>> generate_summary() and freeing it in write_summary().
>> However with -ffat-lto-objects, it appears execute() gets called after
>> write_summary()
>> during LGEN and we hit segfault in execute() at:
>> for (hash_set<varpool_node *>::iterator it = vars.begin (); it !=
>> vars.end (); it++)
>>   { ... }
>> because write_summary() has freed vars.
>> To workaround the issue, I gated call to free vars in write_summary on
>> flag_fat_lto_objects,
>> I am not sure if that's a good solution.
>>
>> Cross tested on arm*-*-*, aarch64*-*-*.
>>
>> Thanks,
>> Prathamesh
>>>
>>> Thanks,
>>> Prathamesh
>>>>
>>>> Thanks,
>>>> Prathamesh
>>>>>
>>>>> Honza

next prev parent reply	other threads:[~2016-07-05  9:53 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-02 12:52 David Edelsohn
2016-06-02 12:57 ` Richard Biener
2016-06-02 15:01   ` Jan Hubicka
2016-06-03  7:57     ` Richard Biener
2016-06-03  8:05       ` Jan Hubicka
2016-06-07  8:34         ` Prathamesh Kulkarni
2016-06-08 14:15           ` Richard Biener
2016-06-08 15:09             ` Jan Hubicka
2016-06-09 20:18               ` Prathamesh Kulkarni
2016-06-09 20:23                 ` Jan Hubicka
2016-06-10  9:33                   ` Prathamesh Kulkarni
2016-06-10 11:17                     ` Richard Biener
2016-06-13  8:57                       ` Prathamesh Kulkarni
2016-06-13 10:43                         ` Jan Hubicka
2016-06-14 13:02                           ` Prathamesh Kulkarni
2016-06-17 14:22                             ` Prathamesh Kulkarni
2016-06-23 17:21                               ` Prathamesh Kulkarni
2016-06-28  9:27                                 ` Prathamesh Kulkarni
2016-07-05  9:53                                   ` Prathamesh Kulkarni [this message]
2016-07-20 11:49                                     ` Prathamesh Kulkarni
  -- strict thread matches above, loose matches on Subject: below --
2016-06-01 10:17 Prathamesh Kulkarni
2016-06-01 13:07 ` Richard Biener
2016-06-02  7:29   ` Prathamesh Kulkarni
2016-06-02  7:53     ` Richard Biener
2016-06-02  9:15       ` Prathamesh Kulkarni
2016-06-02  9:24         ` Prathamesh Kulkarni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAAgBjMnJbw7q4Mt=G5DNPCQcO61VUNFhUMaMFBiw24cv1_hUYw@mail.gmail.com' \
    --to=prathamesh.kulkarni@linaro.org \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hubicka@ucw.cz \
    --cc=rguenther@suse.de \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).