From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 20763 invoked by alias); 29 May 2011 16:14:27 -0000 Received: (qmail 20747 invoked by uid 22791); 29 May 2011 16:14:23 -0000 X-SWARE-Spam-Status: No, hits=-2.3 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST,TW_CP X-Spam-Check-By: sourceware.org Received: from mail-wy0-f175.google.com (HELO mail-wy0-f175.google.com) (74.125.82.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sun, 29 May 2011 16:14:08 +0000 Received: by wye20 with SMTP id 20so2593645wye.20 for ; Sun, 29 May 2011 09:14:06 -0700 (PDT) MIME-Version: 1.0 Received: by 10.227.199.21 with SMTP id eq21mr3953083wbb.101.1306685646622; Sun, 29 May 2011 09:14:06 -0700 (PDT) Received: by 10.227.37.152 with HTTP; Sun, 29 May 2011 09:14:06 -0700 (PDT) In-Reply-To: <20110529143135.GC3577@kam.mff.cuni.cz> References: <20110529143135.GC3577@kam.mff.cuni.cz> Date: Sun, 29 May 2011 18:04:00 -0000 Message-ID: Subject: Re: Faster string streaming From: Richard Guenther To: Jan Hubicka Cc: gcc-patches@gcc.gnu.org, rguenther@suse.de Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2011-05/txt/msg02271.txt.bz2 On Sun, May 29, 2011 at 4:31 PM, Jan Hubicka wrote: > Hi, > string straming is quite wasteful by creating a copy of every string it g= ets, > looking it up in hashtable, freeing if it is present and adding otherwise= . =A0The > copy is made just to add 0 value to terminate it since htab_hash_string i= s used > and it expects 0 terminated strings that is not always the case here. > > Additionally we end up copying every string we stream twice (once for has= h, > once for output buffer) and because all the strings are persistently stor= ed in > memory anyway, we end up with every string sitting in memory three times. > > This patch avoids early copying by inlining string hashing that takes len= gth > parameter; avoids copying of the string when the string is known to stay = in > memory during whole output block duration (that is always the case for all > strings we handle at the moment). > > I also added obstack into output block for allocating objects that needs = to be > freed at the time OB is destructed that is the case of string copies and = the > datastructure used to hold hashtable entries. > > This also makes me to wonder why output streams are not obstacks, they se= ems > to be fitting the purpose quite well here and are better optimized than o= ur > implementation. Good question ... > > Bootstrapped/regtested x86_64-linux, OK? > Honza > > =A0 =A0 =A0 =A0* lto-streamer-out.c (hash_string_slot_node): Hash string = based on its > =A0 =A0 =A0 =A0length. > =A0 =A0 =A0 =A0(string_slot_free): Remove > =A0 =A0 =A0 =A0(create_output_block): Initialize obstack. > =A0 =A0 =A0 =A0(destroy_output_block): Free obstack. > =A0 =A0 =A0 =A0(lto_string_index): Add PERSISTENT parameter; do not dupli= cate > =A0 =A0 =A0 =A0the string unless it needs to be added into the hash. > =A0 =A0 =A0 =A0(lto_output_string_with_length): Add persistent attribute; > =A0 =A0 =A0 =A0handle NULL strings. > =A0 =A0 =A0 =A0(lto_output_string): Add PERSISTENT parameter. > =A0 =A0 =A0 =A0(output_string_cst, output_identifier): Simplify. > =A0 =A0 =A0 =A0(lto_output_location_bitpack): Update. > =A0 =A0 =A0 =A0(lto_output_builtin_tree): Update. > =A0 =A0 =A0 =A0* lto-streamer.h (struct output_block): Add obstack. > =A0 =A0 =A0 =A0(lto_output_string, lto_output_string_with_length): Remove= declarations; > =A0 =A0 =A0 =A0functions are static now. > > Index: lto-streamer-out.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- lto-streamer-out.c =A0(revision 174377) > +++ lto-streamer-out.c =A0(working copy) > @@ -51,13 +51,19 @@ struct string_slot > =A0}; > > > -/* Returns a hash code for P. =A0*/ > +/* Returns a hash code for P. > + =A0 Shamelessly*/ ... stolen from libiberty. ? Ok with that comment adjusted. Thanks, Richard. > > =A0static hashval_t > =A0hash_string_slot_node (const void *p) > =A0{ > =A0 const struct string_slot *ds =3D (const struct string_slot *) p; > - =A0return (hashval_t) htab_hash_string (ds->s); > + =A0hashval_t r =3D ds->len; > + =A0int i; > + > + =A0for (i =3D 0; i < ds->len; i++) > + =A0 =A0 r =3D r * 67 + (unsigned)ds->s[i] - 113; > + =A0return r; > =A0} > > > @@ -76,17 +82,6 @@ eq_string_slot_node (const void *p1, con > =A0} > > > -/* Free the string slot pointed-to by P. =A0*/ > - > -static void > -string_slot_free (void *p) > -{ > - =A0struct string_slot *slot =3D (struct string_slot *) p; > - =A0free (CONST_CAST (void *, (const void *) slot->s)); > - =A0free (slot); > -} > - > - > =A0/* Clear the line info stored in DATA_IN. =A0*/ > > =A0static void > @@ -118,7 +113,8 @@ create_output_block (enum lto_section_ty > =A0 clear_line_info (ob); > > =A0 ob->string_hash_table =3D htab_create (37, hash_string_slot_node, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0eq_string_slot_node, string_slot_free); > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0eq_string_slot_node, NULL); > + =A0gcc_obstack_init (&ob->obstack); > > =A0 return ob; > =A0} > @@ -139,26 +135,27 @@ destroy_output_block (struct output_bloc > =A0 =A0 free (ob->cfg_stream); > > =A0 lto_streamer_cache_delete (ob->writer_cache); > + =A0obstack_free (&ob->obstack, NULL); > > =A0 free (ob); > =A0} > > =A0/* Return index used to reference STRING of LEN characters in the stri= ng table > =A0 =A0in OB. =A0The string might or might not include a trailing '\0'. > - =A0 Then put the index onto the INDEX_STREAM. =A0*/ > + =A0 Then put the index onto the INDEX_STREAM. > + =A0 When PERSISTENT is set, the string S is supposed to not change duri= ng > + =A0 duration of the OB and thus OB can keep pointer into it. =A0*/ > > =A0static unsigned > =A0lto_string_index (struct output_block *ob, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0const char *s, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 unsigned int len) > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 unsigned int len, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 bool persistent) > =A0{ > =A0 struct string_slot **slot; > =A0 struct string_slot s_slot; > - =A0char *string =3D (char *) xmalloc (len + 1); > - =A0memcpy (string, s, len); > - =A0string[len] =3D '\0'; > > - =A0s_slot.s =3D string; > + =A0s_slot.s =3D s; > =A0 s_slot.len =3D len; > =A0 s_slot.slot_num =3D 0; > > @@ -169,7 +166,17 @@ lto_string_index (struct output_block *o > =A0 =A0 =A0 struct lto_output_stream *string_stream =3D ob->string_stream; > =A0 =A0 =A0 unsigned int start =3D string_stream->total_size; > =A0 =A0 =A0 struct string_slot *new_slot > - =A0 =A0 =A0 =3D (struct string_slot *) xmalloc (sizeof (struct string_s= lot)); > + =A0 =A0 =A0 =3D XOBNEW (&ob->obstack, struct string_slot); > + =A0 =A0 =A0const char *string; > + > + =A0 =A0 =A0if (!persistent) > + =A0 =A0 =A0 { > + =A0 =A0 =A0 =A0 char *tmp; > + =A0 =A0 =A0 =A0 string =3D tmp =3D XOBNEWVEC (&ob->obstack, char, len); > + =A0 =A0 =A0 =A0 =A0memcpy (tmp, s, len); > + =A0 =A0 =A0 =A0} > + =A0 =A0 =A0else > + =A0 =A0 =A0 string =3D s; > > =A0 =A0 =A0 new_slot->s =3D string; > =A0 =A0 =A0 new_slot->len =3D len; > @@ -182,7 +189,6 @@ lto_string_index (struct output_block *o > =A0 else > =A0 =A0 { > =A0 =A0 =A0 struct string_slot *old_slot =3D *slot; > - =A0 =A0 =A0free (string); > =A0 =A0 =A0 return old_slot->slot_num + 1; > =A0 =A0 } > =A0} > @@ -190,29 +196,39 @@ lto_string_index (struct output_block *o > > =A0/* Output STRING of LEN characters to the string > =A0 =A0table in OB. The string might or might not include a trailing '\0'. > - =A0 Then put the index onto the INDEX_STREAM. =A0*/ > + =A0 Then put the index onto the INDEX_STREAM. > + =A0 When PERSISTENT is set, the string S is supposed to not change duri= ng > + =A0 duration of the OB and thus OB can keep pointer into it. =A0*/ > > -void > +static void > =A0lto_output_string_with_length (struct output_block *ob, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 struct lto_ou= tput_stream *index_stream, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 const char *s, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0unsigned int= len) > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0unsigned int= len, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0bool persist= ent) > =A0{ > - =A0lto_output_uleb128_stream (index_stream, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0lto_string_index= (ob, s, len)); > + =A0if (s) > + =A0 =A0lto_output_uleb128_stream (index_stream, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0lto_string_i= ndex (ob, s, len, persistent)); > + =A0else > + =A0 =A0lto_output_1_stream (index_stream, 0); > =A0} > > =A0/* Output the '\0' terminated STRING to the string > - =A0 table in OB. =A0Then put the index onto the INDEX_STREAM. =A0*/ > + =A0 table in OB. =A0Then put the index onto the INDEX_STREAM. > + =A0 When PERSISTENT is set, the string S is supposed to not change duri= ng > + =A0 duration of the OB and thus OB can keep pointer into it. =A0*/ > > -void > +static void > =A0lto_output_string (struct output_block *ob, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 struct lto_output_stream *index_strea= m, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0const char *string) > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0const char *string, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0bool persistent) > =A0{ > =A0 if (string) > =A0 =A0 lto_output_string_with_length (ob, index_stream, string, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0strl= en (string) + 1); > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0strl= en (string) + 1, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0pers= istent); > =A0 else > =A0 =A0 lto_output_1_stream (index_stream, 0); > =A0} > @@ -226,12 +242,10 @@ output_string_cst (struct output_block * > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 struct lto_output_stream *index_strea= m, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 tree string) > =A0{ > - =A0if (string) > - =A0 =A0lto_output_string_with_length (ob, index_stream, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0TREE= _STRING_POINTER (string), > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0TREE= _STRING_LENGTH (string )); > - =A0else > - =A0 =A0lto_output_1_stream (index_stream, 0); > + =A0lto_output_string_with_length (ob, index_stream, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0TREE_STR= ING_POINTER (string), > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0TREE_STR= ING_LENGTH (string), > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0true); > =A0} > > > @@ -243,12 +257,10 @@ output_identifier (struct output_block * > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 struct lto_output_stream *index_strea= m, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 tree id) > =A0{ > - =A0if (id) > - =A0 =A0lto_output_string_with_length (ob, index_stream, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0IDEN= TIFIER_POINTER (id), > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0IDEN= TIFIER_LENGTH (id)); > - =A0else > - =A0 =A0lto_output_1_stream (index_stream, 0); > + =A0lto_output_string_with_length (ob, index_stream, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0IDENTIFI= ER_POINTER (id), > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0IDENTIFI= ER_LENGTH (id), > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0true); > =A0} > > > @@ -618,8 +630,9 @@ lto_output_location_bitpack (struct bitp > =A0 bp_pack_value (bp, ob->current_file !=3D xloc.file, 1); > =A0 if (ob->current_file !=3D xloc.file) > =A0 =A0 bp_pack_var_len_unsigned (bp, lto_string_index (ob, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 xloc.file, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 strlen (xloc.file) + 1)); > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 xloc.file, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 strlen (xloc.file) + 1, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 true)); > =A0 ob->current_file =3D xloc.file; > > =A0 bp_pack_value (bp, ob->current_line !=3D xloc.line, 1); > @@ -1184,7 +1197,8 @@ static void > =A0lto_output_ts_translation_unit_decl_tree_pointers (struct output_block= *ob, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 =A0 =A0 =A0 =A0 tree expr) > =A0{ > - =A0lto_output_string (ob, ob->main_stream, TRANSLATION_UNIT_LANGUAGE (e= xpr)); > + =A0lto_output_string (ob, ob->main_stream, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0TRANSLATION_UNIT_LANGUAGE (expr)= , true); > =A0} > > =A0/* Helper for lto_output_tree. =A0Write all pointer fields in EXPR to = output > @@ -1350,12 +1364,12 @@ lto_output_builtin_tree (struct output_b > =A0 =A0 =A0 =A0 reader side from adding a second '*', we omit it here. = =A0*/ > =A0 =A0 =A0 const char *str =3D IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (= expr)); > =A0 =A0 =A0 if (strlen (str) > 1 && str[0] =3D=3D '*') > - =A0 =A0 =A0 lto_output_string (ob, ob->main_stream, &str[1]); > + =A0 =A0 =A0 lto_output_string (ob, ob->main_stream, &str[1], true); > =A0 =A0 =A0 else > - =A0 =A0 =A0 lto_output_string (ob, ob->main_stream, NULL); > + =A0 =A0 =A0 lto_output_string (ob, ob->main_stream, NULL, true); > =A0 =A0 } > =A0 else > - =A0 =A0lto_output_string (ob, ob->main_stream, NULL); > + =A0 =A0lto_output_string (ob, ob->main_stream, NULL, true); > =A0} > > > @@ -1768,7 +1782,7 @@ output_gimple_stmt (struct output_block > =A0 =A0 =A0 lto_output_uleb128_stream (ob->main_stream, gimple_asm_noutpu= ts (stmt)); > =A0 =A0 =A0 lto_output_uleb128_stream (ob->main_stream, gimple_asm_nclobb= ers (stmt)); > =A0 =A0 =A0 lto_output_uleb128_stream (ob->main_stream, gimple_asm_nlabel= s (stmt)); > - =A0 =A0 =A0lto_output_string (ob, ob->main_stream, gimple_asm_string (s= tmt)); > + =A0 =A0 =A0lto_output_string (ob, ob->main_stream, gimple_asm_string (s= tmt), true); > =A0 =A0 =A0 /* Fallthru =A0*/ > > =A0 =A0 case GIMPLE_ASSIGN: > Index: lto-streamer.h > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- lto-streamer.h =A0 =A0 =A0(revision 174377) > +++ lto-streamer.h =A0 =A0 =A0(working copy) > @@ -700,6 +700,10 @@ struct output_block > > =A0 /* Cache of nodes written in this section. =A0*/ > =A0 struct lto_streamer_cache_d *writer_cache; > + > + =A0/* All data persistent across whole duration of output block > + =A0 =A0 can go here. =A0*/ > + =A0struct obstack obstack; > =A0}; > > > @@ -873,13 +877,6 @@ extern struct output_block *create_outpu > =A0extern void destroy_output_block (struct output_block *); > =A0extern void lto_output_tree (struct output_block *, tree, bool); > =A0extern void produce_asm (struct output_block *ob, tree fn); > -extern void lto_output_string (struct output_block *, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct lto_o= utput_stream *, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0const char *= ); > -extern void lto_output_string_with_length (struct output_block *, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0struct lto_output_stream *, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0const char *, > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0unsigned int); > =A0void lto_output_decl_state_streams (struct output_block *, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0st= ruct lto_out_decl_state *); > =A0void lto_output_decl_state_refs (struct output_block *, >