From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rguenther@suse.de>
Received: from mx2.suse.de (mx2.suse.de [195.135.220.15])
 by sourceware.org (Postfix) with ESMTPS id 1A5A2384C003
 for <gcc-patches@gcc.gnu.org>; Wed, 13 Jan 2021 15:10:03 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1A5A2384C003
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=suse.de
Authentication-Results: sourceware.org;
 spf=pass smtp.mailfrom=rguenther@suse.de
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.221.27])
 by mx2.suse.de (Postfix) with ESMTP id E3665AB92;
 Wed, 13 Jan 2021 15:10:01 +0000 (UTC)
Date: Wed, 13 Jan 2021 16:10:01 +0100 (CET)
From: Richard Biener <rguenther@suse.de>
To: Qing Zhao <QING.ZHAO@ORACLE.COM>
cc: Richard Sandiford <richard.sandiford@arm.com>, 
 Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: The performance data for two different implementation of new
 security feature -ftrivial-auto-var-init
In-Reply-To: <2C0218A8-0D9F-4C49-8293-EF0D19E00288@ORACLE.COM>
Message-ID: <nycvar.YFH.7.76.2101131608170.17979@zhemvz.fhfr.qr>
References: <EBAE1FD7-0440-42FA-AE07-7F98F9552610@ORACLE.COM>
 <CAFiYyc2fOb9TYRm8A8zTffx38eSHJ9=14pdxMbme2X=AZ8nm+A@mail.gmail.com>
 <mptczzq94mu.fsf@arm.com>
 <CAFiYyc1vroX6zZf1MKzRxy5Rnpkg_hUWpTTejspiZ-xU=HoNtg@mail.gmail.com>
 <33955130-9D2D-43D5-818D-1DCC13FC1988@ORACLE.COM>
 <CAFiYyc2fbaT7e055yfRZ3HivAEk5ysKbXJ_+MKZdcAjKnnN2Mw@mail.gmail.com>
 <89D58812-0F3E-47AE-95A5-0A07B66EED8C@ORACLE.COM>
 <CAFiYyc27C=8UCe800sb7cBNaw2iv9PQZzogH70e=e1mMOo3Q+Q@mail.gmail.com>
 <9585CBB2-0082-4B9A-AC75-250F54F0797C@ORACLE.COM>
 <CAFiYyc2=47bUM1OpD_anSGnnj-ZgGAqVffq57XyAr4iq8uLPgA@mail.gmail.com>
 <51911859-45D5-4566-B588-F828B9D7313B@ORACLE.COM>
 <CAFiYyc3G7iFPhtasJ9+NTvbNJQ++fjRqFLCe84KKdoC4hc+gtg@mail.gmail.com>
 <9127AAB9-92C8-4A1B-BAD5-2F5F8762DCF9@ORACLE.COM>
 <5A0F7219-DAFA-4EAA-B845-0E236A108738@ORACLE.COM>
 <AA43C665-DE55-4113-B879-AAC63AAC6F59@ORACLE.COM>
 <nycvar.YFH.7.76.2101130838090.17979@zhemvz.fhfr.qr>
 <2C0218A8-0D9F-4C49-8293-EF0D19E00288@ORACLE.COM>
User-Agent: Alpine 2.21 (LSU 202 2017-01-01)
MIME-Version: 1.0
X-Spam-Status: No, score=-5.2 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS,
 RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
X-Content-Filtered-By: Mailman/MimeDel 2.1.29
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 13 Jan 2021 15:10:08 -0000

On Wed, 13 Jan 2021, Qing Zhao wrote:

> 
> 
> > On Jan 13, 2021, at 1:39 AM, Richard Biener <rguenther@suse.de> wrote:
> > 
> > On Tue, 12 Jan 2021, Qing Zhao wrote:
> > 
> >> Hi, 
> >> 
> >> Just checking in to see whether you have any comments or suggestions on this:
> >> 
> >> FYI, I have been continuing with the Approach D implementation since last week:
> >> 
> >> D. Adding calls to .DEFERRED_INIT during gimplification, expanding the .DEFERRED_INIT during expand to
> >> real initialization. Adjusting the uninitialized pass for the new refs with “.DEFERRED_INIT”.
> >> 
> >> For the remaining work of Approach D:
> >> 
> >> ** complete the implementation of -ftrivial-auto-var-init=pattern;
> >> ** complete the implementation of uninitialized warnings maintenance work for D. 
> >> 
> >> I have completed the uninitialized warnings maintenance work for D,
> >> and finished part of the -ftrivial-auto-var-init=pattern implementation. 
> >> 
> >> The following work remains for Approach D:
> >> 
> >>   ** -ftrivial-auto-var-init=pattern for VLAs;
> >>   ** add a new attribute for variables:
> >> __attribute__((uninitialized))
> >> the marked variable is intentionally left uninitialized for performance reasons.
> >>   ** add complete test cases;
> >> 
> >> 
> >> Please let me know if you have any objection to my current decision to implement Approach D. 
> > 
> > Did you do any analysis on how stack usage and code size are changed 
> > with approach D?
> 
> I did the code size comparison (I will provide the data in another email). With this data, D works better than A in general. (This is actually a surprise to me.)
> 
> But I have not measured the stack usage. I am not sure how to collect the stack usage data; 
> do you have any suggestions on this?

You could use -fstack-usage, and then of course watch
the stack segment at runtime.  I'm mostly concerned about
stack-limited "processes" such as the Linux kernel, which I think
is a primary target of your work.
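
As a rough illustration of the -fstack-usage route: GCC writes one .su
file per translation unit, with one line per function giving the
function, its frame size in bytes, and a qualifier (static, dynamic or
bounded).  A minimal Python sketch for comparing two such dumps might
look like the following.  The helper names (parse_su, stack_growth) and
the sample numbers are made up for illustration; they are not
measurements from approach A or D.

```python
from collections import namedtuple

# One line of a .su file: "file:line:col:function<TAB>bytes<TAB>qualifiers".
StackEntry = namedtuple("StackEntry", "location size qualifiers")


def parse_su(text):
    """Parse the contents of a -fstack-usage .su file into StackEntry tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split from the right: the size and qualifier fields never contain
        # whitespace, while a C++ function name in the first field may.
        location, size, qualifiers = line.rsplit(None, 2)
        entries.append(StackEntry(location, int(size), qualifiers.split(",")))
    return entries


def stack_growth(baseline, candidate):
    """Return {location: (old_bytes, new_bytes)} for functions whose frame grew."""
    old = {entry.location: entry.size for entry in baseline}
    return {
        entry.location: (old[entry.location], entry.size)
        for entry in candidate
        if entry.location in old and entry.size > old[entry.location]
    }


# Hypothetical sample dumps (invented numbers, for illustration only).
SAMPLE_BASE = "demo.c:3:5:fill\t48\tstatic\ndemo.c:9:5:main\t16\tstatic\n"
SAMPLE_INIT = "demo.c:3:5:fill\t64\tstatic\ndemo.c:9:5:main\t16\tstatic\n"

if __name__ == "__main__":
    grown = stack_growth(parse_su(SAMPLE_BASE), parse_su(SAMPLE_INIT))
    for location, (before, after) in grown.items():
        print(f"{location}: {before} -> {after} bytes")
```

Feeding it the .su files from a build without the option and a build
with -ftrivial-auto-var-init=zero would list the functions whose frames
grew, which is the number that matters for stack-limited code.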

Richard.

> 
> > How does compile-time behave (we could gobble up
> > lots of .DEFERRED_INIT calls I guess)?
> I can collect this data too and report it later.
> 
> Thanks.
> 
> Qing
> > 
> > Richard.
> > 
> >> Thanks a lot for your help.
> >> 
> >> Qing
> >> 
> >> 
> >>> On Jan 5, 2021, at 1:05 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> >>> 
> >>> Hi,
> >>> 
> >>> This is an update for our previous discussion. 
> >>> 
> >>> 1. I implemented the following two approaches in the latest upstream gcc:
> >>> 
> >>> A. Adding real initialization during gimplification, without maintaining the uninitialized warnings.
> >>> 
> >>> D. Adding calls to .DEFERRED_INIT during gimplification, expanding the .DEFERRED_INIT during expand to
> >>> real initialization. Adjusting the uninitialized pass for the new refs with “.DEFERRED_INIT”.
> >>> 
> >>> Note, in this initial implementation,
> >>> 	** I ONLY implemented -ftrivial-auto-var-init=zero; the implementation of -ftrivial-auto-var-init=pattern 
> >>> 	   is not done yet.  Therefore, the performance data is only about -ftrivial-auto-var-init=zero. 
> >>> 
> >>> 	** I added a temporary option -fauto-var-init-approach=A|B|C|D to choose implementation A or D for 
> >>> 	   the runtime performance study.
> >>> 	** I didn’t finish the uninitialized warnings maintenance work for D. (That might take more time than I expected). 
> >>> 
> >>> 2. I collected runtime data for CPU2017 on an x86 machine with this new gcc for the following 3 cases:
> >>> 
> >>> no: default. (-g -O2 -march=native )
> >>> A:  default +  -ftrivial-auto-var-init=zero -fauto-var-init-approach=A 
> >>> D:  default +  -ftrivial-auto-var-init=zero -fauto-var-init-approach=D 
> >>> 
> >>> And then computed the slowdown for both A and D as follows:
> >>> 
> >>> benchmarks          A / no    D / no
> >>> 
> >>> 500.perlbench_r      1.25%     1.25%
> >>> 502.gcc_r            0.68%     1.80%
> >>> 505.mcf_r            0.68%     0.14%
> >>> 520.omnetpp_r        4.83%     4.68%
> >>> 523.xalancbmk_r      0.18%     1.96%
> >>> 525.x264_r           1.55%     2.07%
> >>> 531.deepsjeng_      11.57%    11.85%
> >>> 541.leela_r          0.64%     0.80%
> >>> 557.xz_             -0.41%    -0.41%
> >>> 
> >>> 507.cactuBSSN_r      0.44%     0.44%
> >>> 508.namd_r           0.34%     0.34%
> >>> 510.parest_r         0.17%     0.25%
> >>> 511.povray_r        56.57%    57.27%
> >>> 519.lbm_r            0.00%     0.00%
> >>> 521.wrf_r           -0.28%    -0.37%
> >>> 526.blender_r       16.96%    17.71%
> >>> 527.cam4_r           0.70%     0.53%
> >>> 538.imagick_r        2.40%     2.40%
> >>> 544.nab_r            0.00%    -0.65%
> >>> 
> >>> avg                  5.17%     5.37%
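
To make the "avg" row concrete: it appears to be the plain arithmetic
mean over all 19 benchmarks.  That is an assumption, but it reproduces
both figures.  A small Python sketch, with the percentages copied from
the table and the benchmark names spelled as they appear there:

```python
# (A / no, D / no) slowdown in percent, copied from the quoted table.
# Benchmark names are kept exactly as they appear in that table.
SLOWDOWN = {
    "500.perlbench_r": (1.25, 1.25),
    "502.gcc_r": (0.68, 1.80),
    "505.mcf_r": (0.68, 0.14),
    "520.omnetpp_r": (4.83, 4.68),
    "523.xalancbmk_r": (0.18, 1.96),
    "525.x264_r": (1.55, 2.07),
    "531.deepsjeng_": (11.57, 11.85),
    "541.leela_r": (0.64, 0.80),
    "557.xz_": (-0.41, -0.41),
    "507.cactuBSSN_r": (0.44, 0.44),
    "508.namd_r": (0.34, 0.34),
    "510.parest_r": (0.17, 0.25),
    "511.povray_r": (56.57, 57.27),
    "519.lbm_r": (0.00, 0.00),
    "521.wrf_r": (-0.28, -0.37),
    "526.blender_r": (16.96, 17.71),
    "527.cam4_r": (0.70, 0.53),
    "538.imagick_r": (2.40, 2.40),
    "544.nab_r": (0.00, -0.65),
}


def mean_slowdown(column):
    """Arithmetic mean of one column (0 = approach A, 1 = approach D)."""
    values = [row[column] for row in SLOWDOWN.values()]
    return sum(values) / len(values)


avg_a = mean_slowdown(0)
avg_d = mean_slowdown(1)

# Benchmark where D costs the most relative to A.
worst_gap = max(SLOWDOWN, key=lambda name: SLOWDOWN[name][1] - SLOWDOWN[name][0])

if __name__ == "__main__":
    print(f"avg A: {avg_a:.2f}%  avg D: {avg_d:.2f}%")
    print("largest D - A gap:", worst_gap)
```

Note how much the mean is dominated by 511.povray_r and 526.blender_r;
for most benchmarks the slowdown is below 2% with either approach.
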
> >>> 
> >>> From the above data, we can see that, in general, the runtime slowdowns for 
> >>> implementations A and D are similar for individual benchmarks.
> >>> 
> >>> Several benchmarks show a significant slowdown from the newly added initialization with both
> >>> A and D, for example 511.povray_r, 526.blender_r, and 531.deepsjeng_r. I will look further
> >>> into which new initializations cause this slowdown. 
> >>> 
> >>> From the study so far, I think that approach D should be good enough for our final implementation. 
> >>> So, I will try to finish approach D with the following remaining work:
> >>> 
> >>>     ** complete the implementation of -ftrivial-auto-var-init=pattern;
> >>>     ** complete the implementation of uninitialized warnings maintenance work for D. 
> >>> 
> >>> 
> >>> Let me know if you have any comments and suggestions on my current and future work.
> >>> 
> >>> Thanks a lot for your help.
> >>> 
> >>> Qing
> >>> 
> >>>> On Dec 9, 2020, at 10:18 AM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> >>>> 
> >>>> The following are the approaches I will implement and compare:
> >>>> 
> >>>> Our final goal is to keep the uninitialized warning and minimize the run-time performance cost.
> >>>> 
> >>>> A. Adding real initialization during gimplification, without maintaining the uninitialized warnings.
> >>>> B. Adding real initialization during gimplification, marking them with “artificial_init”. 
> >>>>   Adjusting the uninitialized pass, maintaining the annotation, making sure the real init is not
> >>>>   deleted from the fake init. 
> >>>> C. Marking the DECL for an uninitialized auto variable as “no_explicit_init” during gimplification,
> >>>>    maintaining this “no_explicit_init” bit until after pass_late_warn_uninitialized or until pass_expand, 
> >>>>    then adding real initialization for all DECLs that are marked with “no_explicit_init”.
> >>>> D. Adding .DEFERRED_INIT during gimplification, expanding the .DEFERRED_INIT during expand to
> >>>>   real initialization. Adjusting the uninitialized pass for the new refs with “.DEFERRED_INIT”.
> >>>> 
> >>>> 
> >>>> In the above, approach A will be the one that has the minimum run-time cost, and it will be the baseline for the performance
> >>>> comparison. 
> >>>> 
> >>>> I will then implement approach D. This one is expected to have the most run-time overhead among the above list, but
> >>>> its implementation should be the cleanest among B, C, and D. Let’s see how much more performance overhead this approach
> >>>> incurs. If the data is good, maybe we can avoid the effort of implementing B and C. 
> >>>> 
> >>>> If the performance of D is not good, I will implement B or C at that time.
> >>>> 
> >>>> Let me know if you have any comments or suggestions.
> >>>> 
> >>>> Thanks.
> >>>> 
> >>>> Qing
> >>> 
> >> 
> >> 
> > 
> > -- 
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)