public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Avoid some C++ local statics with constructors
@ 2016-09-22 20:07 Jakub Jelinek
  2016-09-23  7:03 ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Jelinek @ 2016-09-22 20:07 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi!

The discovered 5 unnecessary C++ static locals with ctors prompted me to
look at other cases, which from looking at the optimized or non-optimized
code are just terrible.
We don't need to initialize static vectors with vNULL, because that implies
runtime initialization, static vars are zero initialized, which is what we
want for the vectors.
In ipa-cp, I understand the vr.min and vr.max is set just so that
uninitialized vars aren't copied around, but initializing static local
wide_int from tree and then copying over is significantly more expensive
than just using wi::zero.
The only questionable change is the sreal.h one, what it did is certainly
very inefficient and ugly, but what the patch does is a kind of hack to keep
it as efficient as possible even for -O0, at -O2 I'd say the compiler should
just inline sreal::normalize and fold it into nothing.
So, if preferred, we could go with just
   inline static sreal min ()
   {
     return sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP);
   }
which will be at -O2 as efficient as what the patch does - storing 2
integers.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-09-22  Jakub Jelinek  <jakub@redhat.com>

	* ipa-cp.c (ipcp_store_vr_results): Avoid static local
	var zero.
	* sreal.h (sreal::min, sreal::max): Avoid static local vars,
	construct values without normalization.
	* tree-ssa-sccvn.c (vn_reference_lookup_3): Don't initialize
	static local lhs_ops to vNULL.
cp/
	* name-lookup.c (store_bindings, store_class_bindings): Don't
	initialize static local bindings_need_stored to vNULL.

--- gcc/ipa-cp.c.jj	2016-09-21 08:54:15.000000000 +0200
+++ gcc/ipa-cp.c	2016-09-22 13:49:57.638198975 +0200
@@ -5204,11 +5204,9 @@ ipcp_store_vr_results (void)
 	 }
        else
 	 {
-	   static wide_int zero = integer_zero_node;
 	   vr.known = false;
 	   vr.type = VR_VARYING;
-	   vr.min = zero;
-	   vr.max = zero;
+	   vr.min = vr.max = wi::zero (INT_TYPE_SIZE);
 	 }
        ts->m_vr->quick_push (vr);
      }
--- gcc/sreal.h.jj	2016-01-04 14:55:52.000000000 +0100
+++ gcc/sreal.h	2016-09-22 14:09:38.882930801 +0200
@@ -104,14 +104,20 @@ public:
   /* Global minimum sreal can hold.  */
   inline static sreal min ()
   {
-    static sreal min = sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP);
+    sreal min;
+    /* This never needs normalization.  */
+    min.m_sig = -SREAL_MAX_SIG;
+    min.m_exp = SREAL_MAX_EXP;
     return min;
   }
 
   /* Global minimum sreal can hold.  */
   inline static sreal max ()
   {
-    static sreal max = sreal (SREAL_MAX_SIG, SREAL_MAX_EXP);
+    sreal max;
+    /* This never needs normalization.  */
+    max.m_sig = SREAL_MAX_SIG;
+    max.m_exp = SREAL_MAX_EXP;
     return max;
   }
 
--- gcc/tree-ssa-sccvn.c.jj	2016-09-20 15:43:09.000000000 +0200
+++ gcc/tree-ssa-sccvn.c	2016-09-22 13:40:03.667886908 +0200
@@ -1786,8 +1786,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
   gimple *def_stmt = SSA_NAME_DEF_STMT (vuse);
   tree base = ao_ref_base (ref);
   HOST_WIDE_INT offset, maxsize;
-  static vec<vn_reference_op_s>
-    lhs_ops = vNULL;
+  static vec<vn_reference_op_s> lhs_ops;
   ao_ref lhs_ref;
   bool lhs_ref_ok = false;
 
--- gcc/cp/name-lookup.c.jj	2016-09-14 23:54:25.000000000 +0200
+++ gcc/cp/name-lookup.c	2016-09-22 13:51:22.459102089 +0200
@@ -6197,7 +6197,7 @@ store_binding (tree id, vec<cxx_saved_bi
 static void
 store_bindings (tree names, vec<cxx_saved_binding, va_gc> **old_bindings)
 {
-  static vec<tree> bindings_need_stored = vNULL;
+  static vec<tree> bindings_need_stored;
   tree t, id;
   size_t i;
 
@@ -6233,7 +6233,7 @@ static void
 store_class_bindings (vec<cp_class_binding, va_gc> *names,
 		      vec<cxx_saved_binding, va_gc> **old_bindings)
 {
-  static vec<tree> bindings_need_stored = vNULL;
+  static vec<tree> bindings_need_stored;
   size_t i;
   cp_class_binding *cb;
 

	Jakub

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Avoid some C++ local statics with constructors
  2016-09-22 20:07 [PATCH] Avoid some C++ local statics with constructors Jakub Jelinek
@ 2016-09-23  7:03 ` Richard Biener
  2016-09-23  8:58   ` Jakub Jelinek
  2016-09-23  9:33   ` Richard Biener
  0 siblings, 2 replies; 6+ messages in thread
From: Richard Biener @ 2016-09-23  7:03 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Thu, 22 Sep 2016, Jakub Jelinek wrote:

> Hi!
> 
> The discovered 5 unnecessary C++ static locals with ctors prompted me to
> look at other cases, which from looking at the optimized or non-optimized
> code are just terrible.
> We don't need to initialize static vectors with vNULL, because that implies
> runtime initialization, static vars are zero initialized, which is what we
> want for the vectors.
> In ipa-cp, I understand the vr.min and vr.max is set just so that
> uninitialized vars aren't copied around, but initializing static local
> wide_int from tree and then copying over is significantly more expensive
> than just using wi::zero.
> The only questionable change is the sreal.h one, what it did is certainly
> very inefficient and ugly, but what the patch does is a kind of hack to keep
> it as efficient as possible even for -O0, at -O2 I'd say the compiler should
> just inline sreal::normalize and fold it into nothing.
> So, if preferred, we could go with just
>    inline static sreal min ()
>    {
>      return sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP);
>    }
> which will be at -O2 as efficient as what the patch does - storing 2
> integers.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

While I agree with the goal to reduce the number of static constructors
esp. the vNULL cases I disagree with.  This is just introducing
undefined behavior (uninitialized object use), and in case we end up
making vNULL not all-zeros it's bound to break.  So IMHO your patch
is very much premature optimization here.

Maybe we can change vNULL to sth that avoids the constructor, if only
if C++14 is available (thus in stage2+)?

For cases like this I hope we could make GCC optimize away the static
construction, maybe you can reduce it to a simple testcase and file
a bugreport?  It will require making static constructors "first class"
in the cgraph so we know at some point the function body initializing
a specific global and some later cgraph phase builds up the global
constructor calling all remaining relevant individual constructor
functions (which can then still be inlined into that fn).

Thanks,
Richard.

> 2016-09-22  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* ipa-cp.c (ipcp_store_vr_results): Avoid static local
> 	var zero.
> 	* sreal.h (sreal::min, sreal::max): Avoid static local vars,
> 	construct values without normalization.
> 	* tree-ssa-sccvn.c (vn_reference_lookup_3): Don't initialize
> 	static local lhs_ops to vNULL.
> cp/
> 	* name-lookup.c (store_bindings, store_class_bindings): Don't
> 	initialize static local bindings_need_stored to vNULL.
> 
> --- gcc/ipa-cp.c.jj	2016-09-21 08:54:15.000000000 +0200
> +++ gcc/ipa-cp.c	2016-09-22 13:49:57.638198975 +0200
> @@ -5204,11 +5204,9 @@ ipcp_store_vr_results (void)
>  	 }
>         else
>  	 {
> -	   static wide_int zero = integer_zero_node;
>  	   vr.known = false;
>  	   vr.type = VR_VARYING;
> -	   vr.min = zero;
> -	   vr.max = zero;
> +	   vr.min = vr.max = wi::zero (INT_TYPE_SIZE);
>  	 }
>         ts->m_vr->quick_push (vr);
>       }
> --- gcc/sreal.h.jj	2016-01-04 14:55:52.000000000 +0100
> +++ gcc/sreal.h	2016-09-22 14:09:38.882930801 +0200
> @@ -104,14 +104,20 @@ public:
>    /* Global minimum sreal can hold.  */
>    inline static sreal min ()
>    {
> -    static sreal min = sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP);
> +    sreal min;
> +    /* This never needs normalization.  */
> +    min.m_sig = -SREAL_MAX_SIG;
> +    min.m_exp = SREAL_MAX_EXP;
>      return min;
>    }
>  
>    /* Global minimum sreal can hold.  */
>    inline static sreal max ()
>    {
> -    static sreal max = sreal (SREAL_MAX_SIG, SREAL_MAX_EXP);
> +    sreal max;
> +    /* This never needs normalization.  */
> +    max.m_sig = SREAL_MAX_SIG;
> +    max.m_exp = SREAL_MAX_EXP;
>      return max;
>    }
>  
> --- gcc/tree-ssa-sccvn.c.jj	2016-09-20 15:43:09.000000000 +0200
> +++ gcc/tree-ssa-sccvn.c	2016-09-22 13:40:03.667886908 +0200
> @@ -1786,8 +1786,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
>    gimple *def_stmt = SSA_NAME_DEF_STMT (vuse);
>    tree base = ao_ref_base (ref);
>    HOST_WIDE_INT offset, maxsize;
> -  static vec<vn_reference_op_s>
> -    lhs_ops = vNULL;
> +  static vec<vn_reference_op_s> lhs_ops;
>    ao_ref lhs_ref;
>    bool lhs_ref_ok = false;
>  
> --- gcc/cp/name-lookup.c.jj	2016-09-14 23:54:25.000000000 +0200
> +++ gcc/cp/name-lookup.c	2016-09-22 13:51:22.459102089 +0200
> @@ -6197,7 +6197,7 @@ store_binding (tree id, vec<cxx_saved_bi
>  static void
>  store_bindings (tree names, vec<cxx_saved_binding, va_gc> **old_bindings)
>  {
> -  static vec<tree> bindings_need_stored = vNULL;
> +  static vec<tree> bindings_need_stored;
>    tree t, id;
>    size_t i;
>  
> @@ -6233,7 +6233,7 @@ static void
>  store_class_bindings (vec<cp_class_binding, va_gc> *names,
>  		      vec<cxx_saved_binding, va_gc> **old_bindings)
>  {
> -  static vec<tree> bindings_need_stored = vNULL;
> +  static vec<tree> bindings_need_stored;
>    size_t i;
>    cp_class_binding *cb;
>  
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Avoid some C++ local statics with constructors
  2016-09-23  7:03 ` Richard Biener
@ 2016-09-23  8:58   ` Jakub Jelinek
  2016-09-23  9:19     ` Richard Biener
  2016-09-23  9:33   ` Richard Biener
  1 sibling, 1 reply; 6+ messages in thread
From: Jakub Jelinek @ 2016-09-23  8:58 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4123 bytes --]

On Fri, Sep 23, 2016 at 08:55:02AM +0200, Richard Biener wrote:
> While I agree with the goal to reduce the number of static constructors
> esp. the vNULL cases I disagree with.  This is just introducing
> undefined behavior (uninitialized object use), and in case we end up
> making vNULL not all-zeros it's bound to break.  So IMHO your patch
> is very much premature optimization here.

No, there is no undefined behavior, and we rely on the zero initialization
in > 100 cases, see
grep '\(^\|static \)vec \?<.*;' *.c *.cc */*.c ada/*/*.c | grep -v '\*'
or anywhere where we have vec part of a struct that we allocate cleared or
clear after allocation.
The thing is, vec is (intentionally) a POD, thus doesn't have a ctor,
and the = vNULL "construction" is
  template <typename T, typename A, typename L>
  operator vec<T, A, L> () { return vec<T, A, L>(); }
actually clearing of the vector.  If we ever wanted to initialize vec to
something not-all-zeros, we'd actually need to turn it into non-POD and add
ctor.

I'm attaching 3 further patches (not bootstrapped/regtested yet), which:
1) adds constexpr keyword for C++11 + to the vNULL operator, this seems to
be enough to get rid of the runtime initialization, both in the
tree-ssa-sccvn.o case with _ZGV var and e.g. in passes.c get rid of
_Z41__static_initialization_and_destruction_0ii ctor
2) another patch to remove the useless = vNULL initializers from file scope
vecs, we don't usually = 0 static vars either (though, apparently we have
hundreds of those)
3) a patch that attempts to clarify when = vNULL is needed and when it is
not

> Maybe we can change vNULL to sth that avoids the constructor, if only
> if C++14 is available (thus in stage2+)?
> 
> For cases like this I hope we could make GCC optimize away the static
> construction, maybe you can reduce it to a simple testcase and file
> a bugreport?  It will require making static constructors "first class"
> in the cgraph so we know at some point the function body initializing
> a specific global and some later cgraph phase builds up the global
> constructor calling all remaining relevant individual constructor
> functions (which can then still be inlined into that fn).

A short testcase is e.g.
struct S { int a; };
void baz (S *);
#if __cpp_constexpr >= 200704
constexpr
#endif
inline S foo () { return (S) { 2 }; }

S s = foo ();

void
bar ()
{
  static S t = foo ();
  baz (&t);
}
Here for -std=c++98 (or even -std=c++1z without the constexpr keyword)
at -O0 there is both _Z41__static_initialization_and_destruction_0ii and
the __cxa_guard*+_ZGV stuff, at -O2 we manage to inline
_Z41__static_initialization_and_destruction_0ii but still end up with
_GLOBAL__sub_I_s:
        movl    $2, s(%rip)
        ret
registered in .ctors.  And the guard stuff is left too.
Optimizing that away would be nice, but non-fun, we'd have to recognize
something like:

  static struct S t;
  unsigned char _1;
  int _2;

  <bb 2>:
  _1 = __atomic_load_1 (&_ZGVZ3barvE1t, 2);
  if (_1 == 0)
    goto <bb 4>;
  else
    goto <bb 3>;

  <bb 3>:
  goto <bb 6>;

  <bb 4>:
  _2 = __cxa_guard_acquire (&_ZGVZ3barvE1t);
  if (_2 != 0)
    goto <bb 5>;
  else
    goto <bb 3>;

  <bb 5>:
  MEM[(struct S *)&t] = 2;
  __cxa_guard_release (&_ZGVZ3barvE1t);

  <bb 6>:

and turn those stores into the static var into a static initializer
and throw away the _ZGV* var, plus if this is in an inline function
(i.e. the static var and its _ZGV* var are comdat) it might be an
ABI thing too (unfortunately the _ZGV* var and the actual var it is guarding
aren't in the same comdat, so even a trick like keeping the runtime
initialization in, but initialize the var to the expected values and
preinitialize the _ZGV* guard into the "already initialized" state wouldn't
work reliably, if the linker picks the _ZGV* var from one TU and the guarded
var from another one, it could fail).  So, I'm afraid if we want to do this
optimization, we'd need to limit it to the cases where the _ZGV* var isn't
visible outside of the TU.

Ok for the 3 patches (or some subset of them) for trunk if they pass
bootstrap/regtest?

	Jakub

[-- Attachment #2: U117 --]
[-- Type: text/plain, Size: 468 bytes --]

2016-09-23  Jakub Jelinek  <jakub@redhat.com>

	* vec.h (vnull::operator vec): Add constexpr keyword for
	C++11 and later.

--- gcc/vec.h.jj	2016-04-26 08:08:18.000000000 +0200
+++ gcc/vec.h	2016-09-23 09:48:48.813586327 +0200
@@ -414,6 +414,9 @@ struct GTY((user)) vec
 struct vnull
 {
   template <typename T, typename A, typename L>
+#if __cpp_constexpr >= 201103
+  constexpr
+#endif
   operator vec<T, A, L> () { return vec<T, A, L>(); }
 };
 extern vnull vNULL;

[-- Attachment #3: U118 --]
[-- Type: text/plain, Size: 5748 bytes --]

2016-09-23  Jakub Jelinek  <jakub@redhat.com>

	* sel-sched-ir.c (sel_global_bb_info, sel_region_bb_info,
	loop_nests, s_i_d, last_added_blocks): Remove unnecessary
	= vNULL initialization of file scope vec.
	* passes.c (pass_tab, enabled_pass_uid_range_tab,
	disabled_pass_uid_range_tab): Likewise.
	* haifa-sched.c (sched_luids, h_i_d): Likewise.
	* tree-chkp-opt.c (check_infos): Likewise.
	* sel-sched.c (vec_av_set, vec_temp_moveop_nops): Likewise.
c/
	* c-parser.c (incomplete_record_decls): Remove unnecessary
	= vNULL initialization of file scope vec.
cp/
	* constexpr.c (call_stack): Remove unnecessary
	= vNULL initialization of file scope vec.

--- gcc/sel-sched-ir.c.jj	2016-03-15 17:10:19.000000000 +0100
+++ gcc/sel-sched-ir.c	2016-09-23 09:56:54.145357346 +0200
@@ -45,12 +45,10 @@ along with GCC; see the file COPYING3.
 #include "sel-sched-dump.h"
 
 /* A vector holding bb info for whole scheduling pass.  */
-vec<sel_global_bb_info_def>
-    sel_global_bb_info = vNULL;
+vec<sel_global_bb_info_def> sel_global_bb_info;
 
 /* A vector holding bb info.  */
-vec<sel_region_bb_info_def>
-    sel_region_bb_info = vNULL;
+vec<sel_region_bb_info_def> sel_region_bb_info;
 
 /* A pool for allocating all lists.  */
 object_allocator<_list_node> sched_lists_pool ("sel-sched-lists");
@@ -66,7 +64,7 @@ struct loop *current_loop_nest;
 
 /* LOOP_NESTS is a vector containing the corresponding loop nest for
    each region.  */
-static vec<loop_p> loop_nests = vNULL;
+static vec<loop_p> loop_nests;
 
 /* Saves blocks already in loop regions, indexed by bb->index.  */
 static sbitmap bbs_in_loop_rgns = NULL;
@@ -4163,7 +4161,7 @@ finish_region_bb_info (void)
 \f
 
 /* Data for each insn in current region.  */
-vec<sel_insn_data_def> s_i_d = vNULL;
+vec<sel_insn_data_def> s_i_d;
 
 /* Extend data structures for insns from current region.  */
 static void
@@ -4499,8 +4497,7 @@ get_av_level (insn_t insn)
 
 /* The basic block that already has been processed by the sched_data_update (),
    but hasn't been in sel_add_bb () yet.  */
-static vec<basic_block>
-    last_added_blocks = vNULL;
+static vec<basic_block> last_added_blocks;
 
 /* A pool for allocating successor infos.  */
 static struct
--- gcc/passes.c.jj	2016-09-14 16:00:50.000000000 +0200
+++ gcc/passes.c	2016-09-23 09:55:41.265281188 +0200
@@ -862,7 +862,7 @@ pass_manager::register_pass_name (opt_pa
 /* Map from pass id to canonicalized pass name.  */
 
 typedef const char *char_ptr;
-static vec<char_ptr> pass_tab = vNULL;
+static vec<char_ptr> pass_tab;
 
 /* Callback function for traversing NAME_TO_PASS_MAP.  */
 
@@ -982,10 +982,8 @@ struct uid_range
 typedef struct uid_range *uid_range_p;
 
 
-static vec<uid_range_p>
-      enabled_pass_uid_range_tab = vNULL;
-static vec<uid_range_p>
-      disabled_pass_uid_range_tab = vNULL;
+static vec<uid_range_p> enabled_pass_uid_range_tab;
+static vec<uid_range_p> disabled_pass_uid_range_tab;
 
 
 /* Parse option string for -fdisable- and -fenable-
--- gcc/haifa-sched.c.jj	2016-08-29 12:17:29.000000000 +0200
+++ gcc/haifa-sched.c	2016-09-23 09:55:15.440612566 +0200
@@ -401,13 +401,13 @@ const struct common_sched_info_def haifa
   };
 
 /* Mapping from instruction UID to its Logical UID.  */
-vec<int> sched_luids = vNULL;
+vec<int> sched_luids;
 
 /* Next LUID to assign to an instruction.  */
 int sched_max_luid = 1;
 
 /* Haifa Instruction Data.  */
-vec<haifa_insn_data_def> h_i_d = vNULL;
+vec<haifa_insn_data_def> h_i_d;
 
 void (* sched_init_only_bb) (basic_block, basic_block);
 
--- gcc/tree-chkp-opt.c.jj	2016-08-17 10:22:14.000000000 +0200
+++ gcc/tree-chkp-opt.c	2016-09-23 09:57:09.715159981 +0200
@@ -84,7 +84,7 @@ static void chkp_collect_value (tree ssa
 #define chkp_checku_fndecl \
   (targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDCU))
 
-static vec<struct bb_checks, va_heap, vl_ptr> check_infos = vNULL;
+static vec<struct bb_checks, va_heap, vl_ptr> check_infos;
 
 /* Comparator for pol_item structures I1 and I2 to be used
    to find items with equal var.  Also used for polynomial
--- gcc/sel-sched.c.jj	2016-08-06 12:11:51.000000000 +0200
+++ gcc/sel-sched.c	2016-09-23 09:56:20.679781563 +0200
@@ -494,7 +494,7 @@ static int max_ws;
 static int num_insns_scheduled;
 
 /* A vector of expressions is used to be able to sort them.  */
-static vec<expr_t> vec_av_set = vNULL;
+static vec<expr_t> vec_av_set;
 
 /* A vector of vinsns is used to hold temporary lists of vinsns.  */
 typedef vec<vinsn_t> vinsn_vec_t;
@@ -512,7 +512,7 @@ static vinsn_vec_t vec_target_unavailabl
 
 /* Vector to store temporary nops inserted in move_op to prevent removal
    of empty bbs.  */
-static vec<insn_t> vec_temp_moveop_nops = vNULL;
+static vec<insn_t> vec_temp_moveop_nops;
 
 /* These bitmaps record original instructions scheduled on the current
    iteration and bookkeeping copies created by them.  */
--- gcc/c/c-parser.c.jj	2016-09-14 23:48:59.000000000 +0200
+++ gcc/c/c-parser.c	2016-09-23 09:57:28.249925030 +0200
@@ -67,7 +67,7 @@ along with GCC; see the file COPYING3.
    In c_parser_translation_unit(), we iterate over incomplete_record_decls
    and report error if any of the decls are still incomplete.  */ 
 
-vec<tree> incomplete_record_decls = vNULL;
+vec<tree> incomplete_record_decls;
 
 void
 set_c_expr_source_range (c_expr *expr,
--- gcc/cp/constexpr.c.jj	2016-09-20 17:17:51.000000000 +0200
+++ gcc/cp/constexpr.c	2016-09-23 09:57:46.055699321 +0200
@@ -1253,7 +1253,7 @@ cxx_bind_parameters_in_call (const const
    These do not need to be marked for PCH or GC.  */
 
 /* FIXME remember and print actual constant arguments.  */
-static vec<tree> call_stack = vNULL;
+static vec<tree> call_stack;
 static int call_stack_tick;
 static int last_cx_error_tick;
 

[-- Attachment #4: U119 --]
[-- Type: text/plain, Size: 788 bytes --]

2016-09-23  Jakub Jelinek  <jakub@redhat.com>

	* vec.h (vNULL): Extend comment to say = vNULL initialization
	isn't needed for static vars.

--- gcc/vec.h.jj	2016-09-23 09:53:13.000000000 +0200
+++ gcc/vec.h	2016-09-23 10:11:06.811548785 +0200
@@ -410,7 +410,9 @@ struct GTY((user)) vec
 /* Type to provide NULL values for vec<T, A, L>.  This is used to
    provide nil initializers for vec instances.  Since vec must be
    a POD, we cannot have proper ctor/dtor for it.  To initialize
-   a vec instance, you can assign it the value vNULL.  */
+   a vec instance, you can assign it the value vNULL.  This isn't
+   needed for file-scope and function-local static vectors, which
+   are zero-initialized by default.  */
 struct vnull
 {
   template <typename T, typename A, typename L>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Avoid some C++ local statics with constructors
  2016-09-23  8:58   ` Jakub Jelinek
@ 2016-09-23  9:19     ` Richard Biener
  2016-09-23  9:20       ` Jakub Jelinek
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2016-09-23  9:19 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Jan Hubicka

On Fri, 23 Sep 2016, Jakub Jelinek wrote:

> On Fri, Sep 23, 2016 at 08:55:02AM +0200, Richard Biener wrote:
> > While I agree with the goal to reduce the number of static constructors
> > esp. the vNULL cases I disagree with.  This is just introducing
> > undefined behavior (uninitialized object use), and in case we end up
> > making vNULL not all-zeros it's bound to break.  So IMHO your patch
> > is very much premature optimization here.
> 
> No, there is no undefined behavior, and we rely on the zero initialization
> in > 100 cases, see
> grep '\(^\|static \)vec \?<.*;' *.c *.cc */*.c ada/*/*.c | grep -v '\*'
> or anywhere where we have vec part of a struct that we allocate cleared or
> clear after allocation.
> The thing is, vec is (intentionally) a POD, thus doesn't have a ctor,
> and the = vNULL "construction" is
>   template <typename T, typename A, typename L>
>   operator vec<T, A, L> () { return vec<T, A, L>(); }
> actually clearing of the vector.  If we ever wanted to initialize vec to
> something not-all-zeros, we'd actually need to turn it into non-POD and add
> ctor.
> 
> I'm attaching 3 further patches (not bootstrapped/regtested yet), which:
> 1) adds constexpr keyword for C++11 + to the vNULL operator, this seems to
> be enough to get rid of the runtime initialization, both in the
> tree-ssa-sccvn.o case with _ZGV var and e.g. in passes.c get rid of
> _Z41__static_initialization_and_destruction_0ii ctor

Really?  We have

extern vnull vNULL;

thus all = vNULL initializers are really copies of an instance in vec.c.

This patch is nevertheless fine.

> 2) another patch to remove the useless = vNULL initializers from file scope
> vecs, we don't usually = 0 static vars either (though, apparently we have
> hundreds of those)

Ok.

> 3) a patch that attempts to clarify when = vNULL is needed and when it is
> not

Ok as well.

> > Maybe we can change vNULL to sth that avoids the constructor, if only
> > if C++14 is available (thus in stage2+)?
> > 
> > For cases like this I hope we could make GCC optimize away the static
> > construction, maybe you can reduce it to a simple testcase and file
> > a bugreport?  It will require making static constructors "first class"
> > in the cgraph so we know at some point the function body initializing
> > a specific global and some later cgraph phase builds up the global
> > constructor calling all remaining relevant individual constructor
> > functions (which can then still be inlined into that fn).
> 
> A short testcase is e.g.
> struct S { int a; };
> void baz (S *);
> #if __cpp_constexpr >= 200704
> constexpr
> #endif
> inline S foo () { return (S) { 2 }; }
> 
> S s = foo ();
> 
> void
> bar ()
> {
>   static S t = foo ();
>   baz (&t);
> }
> Here for -std=c++98 (or even -std=c++1z without the constexpr keyword)
> at -O0 there is both _Z41__static_initialization_and_destruction_0ii and
> the __cxa_guard*+_ZGV stuff, at -O2 we manage to inline
> _Z41__static_initialization_and_destruction_0ii but still end up with
> _GLOBAL__sub_I_s:
>         movl    $2, s(%rip)
>         ret
> registered in .ctors.

Yeah, I think this is where we could improve.

>  And the guard stuff is left too.
> Optimizing that away would be nice, but non-fun, we'd have to recognize
> something like:
> 
>   static struct S t;
>   unsigned char _1;
>   int _2;
> 
>   <bb 2>:
>   _1 = __atomic_load_1 (&_ZGVZ3barvE1t, 2);
>   if (_1 == 0)
>     goto <bb 4>;
>   else
>     goto <bb 3>;
> 
>   <bb 3>:
>   goto <bb 6>;
> 
>   <bb 4>:
>   _2 = __cxa_guard_acquire (&_ZGVZ3barvE1t);
>   if (_2 != 0)
>     goto <bb 5>;
>   else
>     goto <bb 3>;
> 
>   <bb 5>:
>   MEM[(struct S *)&t] = 2;
>   __cxa_guard_release (&_ZGVZ3barvE1t);
> 
>   <bb 6>:

Ick - I totally forgot about the local static initializer woes
with guards.  _Maybe_ we can represent this by putting the guard
into the initializer function (remember, I'd like to see all
static initializers be represented as functions, refered to
from their initializing varpool node as, say, ->initializer).
Thus we'd have the FE generate

bar ()
{
  __static_init_for_t ();
...
}

__static_init_for_t ()
{
  all the guard stuff plus the init
}

the initializers would be always-inline (at IPA time when optimizing
to be able to do the promotion to a static ctor).

> and turn those stores into the static var into a static initializer
> and throw away the _ZGV* var, plus if this is in an inline function
> (i.e. the static var and its _ZGV* var are comdat) it might be an
> ABI thing too (unfortunately the _ZGV* var and the actual var it is guarding
> aren't in the same comdat, so even a trick like keeping the runtime
> initialization in, but initialize the var to the expected values and
> preinitialize the _ZGV* guard into the "already initialized" state wouldn't
> work reliably, if the linker picks the _ZGV* var from one TU and the guarded
> var from another one, it could fail).  So, I'm afraid if we want to do this
> optimization, we'd need to limit it to the cases where the _ZGV* var isn't
> visible outside of the TU.
> 
> Ok for the 3 patches (or some subset of them) for trunk if they pass
> bootstrap/regtest?

Ok.

Richard.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Avoid some C++ local statics with constructors
  2016-09-23  9:19     ` Richard Biener
@ 2016-09-23  9:20       ` Jakub Jelinek
  0 siblings, 0 replies; 6+ messages in thread
From: Jakub Jelinek @ 2016-09-23  9:20 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jan Hubicka

On Fri, Sep 23, 2016 at 10:59:12AM +0200, Richard Biener wrote:
> On Fri, 23 Sep 2016, Jakub Jelinek wrote:
> 
> > On Fri, Sep 23, 2016 at 08:55:02AM +0200, Richard Biener wrote:
> > > While I agree with the goal to reduce the number of static constructors
> > > esp. the vNULL cases I disagree with.  This is just introducing
> > > undefined behavior (uninitialized object use), and in case we end up
> > > making vNULL not all-zeros it's bound to break.  So IMHO your patch
> > > is very much premature optimization here.
> > 
> > No, there is no undefined behavior, and we rely on the zero initialization
> > in > 100 cases, see
> > grep '\(^\|static \)vec \?<.*;' *.c *.cc */*.c ada/*/*.c | grep -v '\*'
> > or anywhere where we have vec part of a struct that we allocate cleared or
> > clear after allocation.
> > The thing is, vec is (intentionally) a POD, thus doesn't have a ctor,
> > and the = vNULL "construction" is
> >   template <typename T, typename A, typename L>
> >   operator vec<T, A, L> () { return vec<T, A, L>(); }
> > actually clearing of the vector.  If we ever wanted to initialize vec to
> > something not-all-zeros, we'd actually need to turn it into non-POD and add
> > ctor.
> > 
> > I'm attaching 3 further patches (not bootstrapped/regtested yet), which:
> > 1) adds constexpr keyword for C++11 + to the vNULL operator, this seems to
> > be enough to get rid of the runtime initialization, both in the
> > tree-ssa-sccvn.o case with _ZGV var and e.g. in passes.c get rid of
> > _Z41__static_initialization_and_destruction_0ii ctor
> 
> Really?  We have
> 
> extern vnull vNULL;

vNULL is not a vec (after all, vec is a template, so there can't be a single
object), just a dummy object with a conversion operator template which
value-initializes an unnamed vec temporary and the assignment copies it into
the variable.

Anyway, I'll file 2 PRs, one for the file scope static initialization
optimization, another for the local static.

	Jakub

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Avoid some C++ local statics with constructors
  2016-09-23  7:03 ` Richard Biener
  2016-09-23  8:58   ` Jakub Jelinek
@ 2016-09-23  9:33   ` Richard Biener
  1 sibling, 0 replies; 6+ messages in thread
From: Richard Biener @ 2016-09-23  9:33 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Fri, 23 Sep 2016, Richard Biener wrote:

> On Thu, 22 Sep 2016, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > The discovered 5 unnecessary C++ static locals with ctors prompted me to
> > look at other cases, which from looking at the optimized or non-optimized
> > code are just terrible.
> > We don't need to initialize static vectors with vNULL, because that implies
> > runtime initialization, static vars are zero initialized, which is what we
> > want for the vectors.
> > In ipa-cp, I understand the vr.min and vr.max is set just so that
> > uninitialized vars aren't copied around, but initializing static local
> > wide_int from tree and then copying over is significantly more expensive
> > than just using wi::zero.
> > The only questionable change is the sreal.h one, what it did is certainly
> > very inefficient and ugly, but what the patch does is a kind of hack to keep
> > it as efficient as possible even for -O0, at -O2 I'd say the compiler should
> > just inline sreal::normalize and fold it into nothing.
> > So, if preferred, we could go with just
> >    inline static sreal min ()
> >    {
> >      return sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP);
> >    }
> > which will be at -O2 as efficient as what the patch does - storing 2
> > integers.
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> While I agree with the goal to reduce the number of static constructors
> esp. the vNULL cases I disagree with.  This is just introducing
> undefined behavior (uninitialized object use), and in case we end up
> making vNULL not all-zeros it's bound to break.  So IMHO your patch
> is very much premature optimization here.
> 
> Maybe we can change vNULL to sth that avoids the constructor, if only
> if C++14 is available (thus in stage2+)?
> 
> For cases like this I hope we could make GCC optimize away the static
> construction, maybe you can reduce it to a simple testcase and file
> a bugreport?  It will require making static constructors "first class"
> in the cgraph so we know at some point the function body initializing
> a specific global and some later cgraph phase builds up the global
> constructor calling all remaining relevant individual constructor
> functions (which can then still be inlined into that fn).

Thus in the light of the other patch-set this patch is ok as well.

Thanks,
Richard.

> Thanks,
> Richard.
> 
> > 2016-09-22  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	* ipa-cp.c (ipcp_store_vr_results): Avoid static local
> > 	var zero.
> > 	* sreal.h (sreal::min, sreal::max): Avoid static local vars,
> > 	construct values without normalization.
> > 	* tree-ssa-sccvn.c (vn_reference_lookup_3): Don't initialize
> > 	static local lhs_ops to vNULL.
> > cp/
> > 	* name-lookup.c (store_bindings, store_class_bindings): Don't
> > 	initialize static local bindings_need_stored to vNULL.
> > 
> > --- gcc/ipa-cp.c.jj	2016-09-21 08:54:15.000000000 +0200
> > +++ gcc/ipa-cp.c	2016-09-22 13:49:57.638198975 +0200
> > @@ -5204,11 +5204,9 @@ ipcp_store_vr_results (void)
> >  	 }
> >         else
> >  	 {
> > -	   static wide_int zero = integer_zero_node;
> >  	   vr.known = false;
> >  	   vr.type = VR_VARYING;
> > -	   vr.min = zero;
> > -	   vr.max = zero;
> > +	   vr.min = vr.max = wi::zero (INT_TYPE_SIZE);
> >  	 }
> >         ts->m_vr->quick_push (vr);
> >       }
> > --- gcc/sreal.h.jj	2016-01-04 14:55:52.000000000 +0100
> > +++ gcc/sreal.h	2016-09-22 14:09:38.882930801 +0200
> > @@ -104,14 +104,20 @@ public:
> >    /* Global minimum sreal can hold.  */
> >    inline static sreal min ()
> >    {
> > -    static sreal min = sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP);
> > +    sreal min;
> > +    /* This never needs normalization.  */
> > +    min.m_sig = -SREAL_MAX_SIG;
> > +    min.m_exp = SREAL_MAX_EXP;
> >      return min;
> >    }
> >  
> >    /* Global minimum sreal can hold.  */
> >    inline static sreal max ()
> >    {
> > -    static sreal max = sreal (SREAL_MAX_SIG, SREAL_MAX_EXP);
> > +    sreal max;
> > +    /* This never needs normalization.  */
> > +    max.m_sig = SREAL_MAX_SIG;
> > +    max.m_exp = SREAL_MAX_EXP;
> >      return max;
> >    }
> >  
> > --- gcc/tree-ssa-sccvn.c.jj	2016-09-20 15:43:09.000000000 +0200
> > +++ gcc/tree-ssa-sccvn.c	2016-09-22 13:40:03.667886908 +0200
> > @@ -1786,8 +1786,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree
> >    gimple *def_stmt = SSA_NAME_DEF_STMT (vuse);
> >    tree base = ao_ref_base (ref);
> >    HOST_WIDE_INT offset, maxsize;
> > -  static vec<vn_reference_op_s>
> > -    lhs_ops = vNULL;
> > +  static vec<vn_reference_op_s> lhs_ops;
> >    ao_ref lhs_ref;
> >    bool lhs_ref_ok = false;
> >  
> > --- gcc/cp/name-lookup.c.jj	2016-09-14 23:54:25.000000000 +0200
> > +++ gcc/cp/name-lookup.c	2016-09-22 13:51:22.459102089 +0200
> > @@ -6197,7 +6197,7 @@ store_binding (tree id, vec<cxx_saved_bi
> >  static void
> >  store_bindings (tree names, vec<cxx_saved_binding, va_gc> **old_bindings)
> >  {
> > -  static vec<tree> bindings_need_stored = vNULL;
> > +  static vec<tree> bindings_need_stored;
> >    tree t, id;
> >    size_t i;
> >  
> > @@ -6233,7 +6233,7 @@ static void
> >  store_class_bindings (vec<cp_class_binding, va_gc> *names,
> >  		      vec<cxx_saved_binding, va_gc> **old_bindings)
> >  {
> > -  static vec<tree> bindings_need_stored = vNULL;
> > +  static vec<tree> bindings_need_stored;
> >    size_t i;
> >    cp_class_binding *cb;
> >  
> > 
> > 	Jakub
> > 
> > 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-09-23  9:20 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-22 20:07 [PATCH] Avoid some C++ local statics with constructors Jakub Jelinek
2016-09-23  7:03 ` Richard Biener
2016-09-23  8:58   ` Jakub Jelinek
2016-09-23  9:19     ` Richard Biener
2016-09-23  9:20       ` Jakub Jelinek
2016-09-23  9:33   ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).