Date: Mon, 14 Sep 2020 11:02:26 +0200
From: Jan Hubicka
To: Jakub Jelinek
Cc: Richard Biener, gcc-patches@gcc.gnu.org, "Joseph S. Myers"
Subject: Re: [PATCH] options, lto: Optimize streaming of optimization nodes
Message-ID: <20200914090226.GA27871@kam.mff.cuni.cz>
References: <20200913083327.GG21814@tucnak> <20200914070048.GT21814@tucnak> <20200914084810.GU21814@tucnak>
In-Reply-To: <20200914084810.GU21814@tucnak>

> On Mon, Sep 14, 2020 at 09:31:52AM +0200, Richard Biener wrote:
> > But does it make any noticeable difference in the end?  Using
>
> Yes.
>
> > bp_pack_var_len_unsigned just causes us to [u]leb encode half-bytes
> > rather than full bytes.
> > Using hardcoded 8/16/32/64 makes it still
> > dependent on what 'int' is at maximum on the host.
> >
> > That is, I'd indeed prefer bp_pack_var_len_unsigned over hard-coding
> > 8, 16, etc., but can you share a size comparison of the bitpack?
> > I guess with bp_pack_var_len_unsigned it might shrink in half
> > compared to the current code and streaming standard -O2?
>
> So, I've tried
> --- gcc/tree-streamer-out.c.jj 2020-07-28 15:39:10.079755251 +0200
> +++ gcc/tree-streamer-out.c 2020-09-14 10:31:29.106957258 +0200
> @@ -489,7 +489,11 @@ streamer_write_tree_bitfields (struct ou
>      pack_ts_translation_unit_decl_value_fields (ob, &bp, expr);
>
>    if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
> +{
> +long ts = ob->main_stream->total_size;
>      cl_optimization_stream_out (ob, &bp, TREE_OPTIMIZATION (expr));
> +fprintf (stderr, "total_size %ld\n", (long) (ob->main_stream->total_size - ts));
> +}

You should be able to read the sizes from the streaming dump file as well.

>
>    if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
>      bp_pack_var_len_unsigned (&bp, CONSTRUCTOR_NELTS (expr));
> hack without and with the following patch on a simple small testcase with
> -O2 -flto.
> Got 574 bytes without the optc-save-gen.awk change and 454 bytes with it,
> that is ~ 21% saving on the TREE_OPTIMIZATION streaming.
>
> 2020-09-14  Jakub Jelinek
>
>     * optc-save-gen.awk: In cl_optimization_stream_out use
>     bp_pack_var_len_{int,unsigned} instead of bp_pack_value.  In
>     cl_optimization_stream_in use bp_unpack_var_len_{int,unsigned}
>     instead of bp_unpack_value.  Formatting fix.
>
> --- gcc/optc-save-gen.awk.jj 2020-09-14 09:04:35.879854156 +0200
> +++ gcc/optc-save-gen.awk 2020-09-14 10:38:47.722424942 +0200
> @@ -1257,8 +1257,10 @@ for (i = 0; i < n_opt_val; i++) {
>      otype = var_opt_val_type[i];
>      if (otype ~ "^const char \\**$")
>          print " bp_pack_string (ob, bp, ptr->" name", true);";
> +    else if (otype ~ "^unsigned")
> +        print " bp_pack_var_len_unsigned (bp, ptr->" name");";
>      else
> -        print " bp_pack_value (bp, ptr->" name", 64);";
> +        print " bp_pack_var_len_int (bp, ptr->" name");";
>  }
>  print " for (size_t i = 0; i < sizeof (ptr->explicit_mask) / sizeof (ptr->explicit_mask[0]); i++)";
>  print " bp_pack_value (bp, ptr->explicit_mask[i], 64);";
> @@ -1274,14 +1276,15 @@ print "{";
>  for (i = 0; i < n_opt_val; i++) {
>      name = var_opt_val[i]
>      otype = var_opt_val_type[i];
> -    if (otype ~ "^const char \\**$")
> -    {
> -        print " ptr->" name" = bp_unpack_string (data_in, bp);";
> -        print " if (ptr->" name")";
> -        print " ptr->" name" = xstrdup (ptr->" name");";
> +    if (otype ~ "^const char \\**$") {
> +        print " ptr->" name" = bp_unpack_string (data_in, bp);";
> +        print " if (ptr->" name")";
> +        print " ptr->" name" = xstrdup (ptr->" name");";
>      }
> +    else if (otype ~ "^unsigned")
> +        print " ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_var_len_unsigned (bp);";
>      else
> -        print " ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_value (bp, 64);";
> +        print " ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_var_len_int (bp);";

Not making a difference between signed/unsigned was my implementation
laziness at the time the code was added.  So this looks like a nice cleanup.

Especially for the new param machinery, most of the streamed values are
probably going to be the default values.  Perhaps we could somehow
stream them more effectively.

Overall we should not get much more than 1 optimize/target node per unit,
so the size should show up only when you stream a lot of very small .o
files.

Honza
>      }
>  print " for (size_t i = 0; i < sizeof (ptr->explicit_mask) / sizeof (ptr->explicit_mask[0]); i++)";
>  print " ptr->explicit_mask[i] = bp_unpack_value (bp, 64);";
>
>
> Jakub
>
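
For illustration only, here is a minimal standalone C sketch of the half-byte
variable-length scheme discussed in this thread.  It is not the GCC code: the
names nibble_encode and nibble_decode are hypothetical, each 4-bit group is
stored in its own array element rather than packed into a bitpack, and the
layout (3 payload bits per group, top bit set when more groups follow) is an
assumption based on the "[u]leb encode half-bytes" description above.  It
covers only the unsigned case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper bound on groups for a 64-bit value: ceil (64 / 3) == 22.  */
#define MAX_GROUPS 22

/* Hypothetical helper: split V into 4-bit groups, least significant first.
   Each group carries 3 payload bits; bit 0x8 means "more groups follow".
   OUT must have room for MAX_GROUPS entries.  Returns the group count.  */
static size_t
nibble_encode (uint64_t v, uint8_t *out)
{
  size_t n = 0;
  do
    {
      uint8_t group = v & 0x7;
      v >>= 3;
      if (v != 0)
        group |= 0x8;
      out[n++] = group;
    }
  while (v != 0);
  return n;
}

/* Hypothetical helper: inverse of nibble_encode.  Stores the number of
   groups consumed in *USED when USED is non-null.  */
static uint64_t
nibble_decode (const uint8_t *in, size_t *used)
{
  uint64_t v = 0;
  unsigned shift = 0;
  size_t n = 0;
  uint8_t group;
  do
    {
      /* Guard against malformed input that never clears the flag.  */
      assert (shift < 64);
      group = in[n++];
      v |= (uint64_t) (group & 0x7) << shift;
      shift += 3;
    }
  while (group & 0x8);
  if (used)
    *used = n;
  return v;
}
```

Under these assumptions a value between 0 and 7 takes a single group (4 bits)
instead of a fixed 64 bits, while a full 64-bit value takes 22 groups (88
bits), so the scheme only pays off when most streamed values are small.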