public inbox for gcc-patches@gcc.gnu.org
From: Richard Biener <rguenther@suse.de>
To: Qing Zhao <qing.zhao@oracle.com>
Cc: Kees Cook <keescook@chromium.org>,
	Richard Sandiford <richard.sandiford@arm.com>,
	Qing Zhao via gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc
Date: Mon, 21 Jun 2021 17:35:32 +0200
Message-ID: <D317749E-ED7E-4F41-ABB9-4F8028F1343B@suse.de>
In-Reply-To: <FBE3320B-AA51-4789-81DB-7B78C2819B65@oracle.com>

On June 21, 2021 5:11:30 PM GMT+02:00, Qing Zhao <qing.zhao@oracle.com> wrote:
>Hi, Richard,
>
>> On Jun 21, 2021, at 2:53 AM, Richard Biener <rguenther@suse.de> wrote:
>> 
>>> 
>>> 
>>> This is for the compatibility with CLANG. -:). (https://reviews.llvm.org/D54604)
>> 
>> I don't care about functional 1:1 "compatibility" with CLANG.
>
>Okay.  -:)
>
>> 
>>> 1. Pattern initialization
>>> 
>>>  This is the recommended initialization approach. Pattern initialization's
>> 
>> But elsewhere you said pattern initialization is only for debugging,
>> not production …
>
>Yes. Pattern initialization is only for debugging purposes during
>the development phase.
>
>> 
>>> Use a pattern that fits them all.  I mean memory allocation hardening
>>> fills allocated storage with a repeated (byte) pattern and people are
>>> happy with that.  It also makes it easy to spot uninitialized storage
>>> from a debugger.  So please, do not over-design this, it really doesn't
>>> make any sense and the common case you are inevitably chasing here
>>> would already be fine with a random repeated pattern.
>>> 
>>> So, my question is:
>>> 
>>> If we want to pattern initialize with a single repeated pattern for all types, which one is better to use: “0xAAAAAAAA”
>>> or “0xFFFFFFFF”, or another pattern that our current glibc uses?  What’s that pattern?
>> 
>> It's set by the user.
>
>Yes, it looks like glibc uses a byte-repeated pattern that is set by
>the user through an environment variable.
>
>> 
>>> Will “0xAAAAAAAA” in a floating type auto variable crash the program?
>>> Will “0xFFFFFFFF” in a pointer type auto variable crash the program? (Might crash?)
>>> 
>>> 
>>> (thus also my suggestion to split out
>>> padding handling - now we can also split out pattern init handling,
>>> maybe somebody else feels like reviewing and approving this, who knows).
>>> 
>>> I am okay with further splitting the pattern initialization part into a separate patch. Then we will
>>> have 4 independent patches in total:
>>> 
>>> 1. -fauto-var-init=zero and all the handling in other passes of the newly added call to .DEFERRED_INIT.
>>> 2. Add -fauto-var-init=pattern
>>> 3. Add -fauto-var-init-padding
>>> 4. Add -ftrivial-auto-var-init for CLANG compatibility.
>>> 
>>> Is the above understanding correct?
>> 
>> I think we can drop -fauto-var-init=pattern and just go with block
>> initializing which will cover padding as well which means we can
>> stay with the odd -ftrivial-auto-var-init name used by CLANG and
>> add no additional options.
>
>Yes, this is a good idea. 
>
>Block initializing will cover all padding automatically.
>
>Shall we do block initializing for both “zero initialization” and
>“pattern initialization”?
>
>Currently, for zero initialization, I used the following:
>
>>>> +    case AUTO_INIT_ZERO:
>>>> +      init = build_zero_cst (TREE_TYPE (var));
>>>> +      expand_assignment (var, init, false);
>>>> +      break;
>
>It looks like the current “expand_assignment” does not initialize
>padding with zeroes.
>Shall I also use “memset” for “zero initialization”?

I'd say so, yes. 
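
To illustrate why block initialization reaches padding, here is a minimal user-level C sketch. It is not GCC-internal code and says nothing about what expand_assignment does; the struct and the helper names (padded, block_init, all_bytes_equal) are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: on most ABIs there are padding bytes
   between 'c' and 'i'. */
struct padded {
    char c;
    int  i;
};

/* Block-initialize the whole object representation with one byte
   value.  memset writes sizeof *p bytes, so padding is covered too. */
void block_init(struct padded *p, unsigned char value)
{
    memset(p, value, sizeof *p);
}

/* Return 1 if every byte of the object, padding included,
   equals 'value', and 0 otherwise. */
int all_bytes_equal(const struct padded *p, unsigned char value)
{
    const unsigned char *bytes = (const unsigned char *)p;
    for (size_t k = 0; k < sizeof *p; k++)
        if (bytes[k] != value)
            return 0;
    return 1;
}
```

The same block_init call serves for zero initialization (value 0) and for a byte-repeated pattern (for example value 0xFF), which is the simplification discussed here.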

>> 
>>> As said, block-initializing with a repeated pattern is OK and I can see
>>> that being useful.  Trying to produce "nicer" values for floats, bools
>>> and pointers on 32bit platforms is IMHO not going to fix anything and
>>> introduce as many problems as it will "fix".
>>> 
>>> Yes, I agree, if we can find a good repeated pattern for pattern
>>> initialization of all types, that will be much easier and simpler to
>>> implement, and I am happy to do that.  (Honestly, the part of the implementation
>>> that took me most of the time is pattern initialization, and I am still
>>> not very comfortable with this part of the code myself.  -:)
>> 
>> There's no "safe" pattern besides all-zero for all "undefined" uses
>> (note that uses do not necessarily use declared types).  Which is why
>> recommending pattern init is somewhat misguided.  There's maybe 
>> some useful pattern that more readily produces crashes, those that
>> produce a FP sNaN for all of the float types.
>
>So, a pattern value of 0xFF might be better than 0xAA, since 0xFFFFFFFF
>will be a NaN value for floating-point types?

I think for debugging NaNs are quite nice, yes. 
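
A small sketch of the NaN observation, assuming IEEE 754 binary32 float and binary64 double (true on common targets but not guaranteed by the C standard). The helper names float_from_byte and double_from_byte are made up for this example. Note that on most targets the all-ones bit pattern is a quiet NaN rather than a signaling one.

```c
#include <math.h>
#include <string.h>

/* Build a float whose object representation is 'value' repeated
   in every byte, as a memset-based pattern init would produce. */
float float_from_byte(unsigned char value)
{
    float f;
    memset(&f, value, sizeof f);
    return f;
}

/* Same for double. */
double double_from_byte(unsigned char value)
{
    double d;
    memset(&d, value, sizeof d);
    return d;
}
```

With these assumptions, isnan (float_from_byte (0xFF)) and isnan (double_from_byte (0xFF)) are both true because the exponent field is all ones and the mantissa is non-zero, while 0xAA repeated gives an ordinary small negative number that is much harder to notice in a debugger.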

>> 
>>> And if you block-initialize stuff you then automagically cover padding.
>>> I call this a win-win, no?
>>> 
>>> Yes, this will also initialize padding with patterns (not zeroes as CLANG did).
>>> Shall we be compatible with CLANG on this?
>> 
>> No, why?
>
>Okay.
>
>>> in my example code (untested) you then still need
>>> 
>>>  expand_assignment (var, ctor, false);
>>> 
>>> it would be the easiest way to try pattern init with a pattern that's
>>> bigger than a byte (otherwise of course the memset path is optimal).
>>> 
>>> If the pattern that is used to initialize all types is byte-repeatable, for example, 0xAA or 0xFF, then
>>> we can use memset to initialize all types. However, the potential problem is that if we later decide
>>> to change to another pattern that might not be byte-repeatable, then the memset implementation
>>> will no longer be suitable.
>>> 
>>> Is it possible that we might change the pattern later?
>> 
>> The pattern should be documented as an implementation detail unless
>> we want to expose it to the user via, say, -fpattern-init=0xdeadbeef.
>
>Not sure whether it’s necessary to expose this to the user.
>
>One question that is important to the implementation is:
>
>Shall we use “byte-repeated” or “word-repeated” pattern?
>Is “word-repeated” pattern better than “byte-repeated” pattern?
>
>For implementation, a “byte-repeated” pattern will make the whole
>implementation much simpler, since both “zero initialization”
>and “pattern initialization” can be implemented with “memset” with
>a different “value”.
>
>So, if a “word-repeated” pattern does not have much more benefit, I
>would prefer a “byte-repeated” pattern.
>
>Let me know your comments here.

I have no strong opinion and prefer byte repetition for simplicity. But I would document this as an implementation detail that can change.
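
For comparison, a rough user-level sketch of the two options, not the patch's implementation. The function names fill_byte_pattern and fill_word_pattern are hypothetical; 0xdeadbeef is just the example value mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-repeated pattern: a single memset handles any object size,
   and value 0 gives zero initialization through the same path. */
void fill_byte_pattern(void *dst, size_t size, unsigned char value)
{
    memset(dst, value, size);
}

/* Word-repeated pattern: memset cannot express it, so the bytes of
   the 32-bit pattern are copied round-robin.  This also handles
   sizes that are not a multiple of the word size. */
void fill_word_pattern(void *dst, size_t size, uint32_t pattern)
{
    unsigned char pat[sizeof pattern];
    unsigned char *out = dst;

    memcpy(pat, &pattern, sizeof pattern);
    for (size_t k = 0; k < size; k++)
        out[k] = pat[k % sizeof pattern];
}
```

A word pattern whose bytes are all equal (0xAAAAAAAA, 0xFFFFFFFF) produces the same memory contents as the byte version, so starting with byte repetition does not rule out switching to a wider pattern later.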

Richard. 

>> 
>>> 
>>> 
>>> As said, for example glibc allocator hardening with MALLOC_PERTURB_
>>> uses simple byte-init.
>>> 
>>> What’s the pattern glibc uses?
>> 
>> The value of the MALLOC_PERTURB_ environment variable truncated to a byte.
>
>Okay.
>
>thanks.
>
>Qing
>> 
>> Richard.
>> 

