From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id E124A383D80F for ; Mon, 16 Aug 2021 15:08:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E124A383D80F Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id B59ED1FE84; Mon, 16 Aug 2021 15:08:15 +0000 (UTC) Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap1.suse-dmz.suse.de (Postfix) with ESMTPS id 55CDA13B95; Mon, 16 Aug 2021 15:08:15 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap1.suse-dmz.suse.de with ESMTPSA id fWSZFF9/GmEKUAAAGKfGzw (envelope-from ); Mon, 16 Aug 2021 15:08:15 +0000 Date: Mon, 16 Aug 2021 17:08:10 +0200 From: Richard Biener To: Qing Zhao CC: Jakub Jelinek , Richard Sandiford , Nick Alcock via Gcc-patches , Kees Cook Subject: =?US-ASCII?Q?Re=3A_=5Bpatch=5D=5Bversion_6=5D_add_-ftrivial-auto-var-in?= =?US-ASCII?Q?it_and_variable_attribute_=22uninitialized=22_to_gcc?= User-Agent: K-9 Mail for Android In-Reply-To: <7CCB41E9-6807-4E6D-ABFB-C1DE73A9BEF1@oracle.com> References: <52E29277-1403-4755-901A-528116C43FB8@oracle.com> <6FD42B95-F73D-4B75-B83A-BAC4925B1714@oracle.com> <28A70D41-58C6-49FF-BDA9-661F36A2D795@oracle.com> <7CCB41E9-6807-4E6D-ABFB-C1DE73A9BEF1@oracle.com> Message-ID: <459A689F-039B-4298-881A-540A39418C13@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 16 Aug 2021 15:08:27 -0000 On August 16, 2021 4:48:16 PM GMT+02:00, Qing Zhao wrote: > > >> On Aug 16, 2021, at 2:12 AM, Richard Biener wrote= : >>=20 >> On Wed, 11 Aug 2021, Qing Zhao wrote: >>=20 >>> Hi,=20 >>>=20 >>> I finally decided to take another approach to resolve this issue, it r= esolved all the potential issues with the =E2=80=9Caddress taken=E2=80=9D a= uto variable=2E >>>=20 >>> The basic idea is to avoid generating the temporary variable in the be= ginning=2E=20 >>> As you mentioned, "The reason is that alt_reloc is memory (because it = is address taken) and that GIMPLE says that register typed stores=20 >>> need to use a is_gimple_val RHS which the call is not=2E=E2=80=9D >>> In order to avoid generating the temporary variable for =E2=80=9Caddre= ss taken=E2=80=9D auto variable, I updated the utility routine =E2=80=9Cis_= gimple_val=E2=80=9D as following: >>>=20 >>> diff --git a/gcc/gimple-expr=2Ec b/gcc/gimple-expr=2Ec >>> index a2563a45c37d=2E=2Ed5ef1aef8cea 100644 >>> --- a/gcc/gimple-expr=2Ec >>> +++ b/gcc/gimple-expr=2Ec >>> @@ -787,8 +787,20 @@ is_gimple_reg (tree t) >>> return !DECL_NOT_GIMPLE_REG_P (t); >>> } >>>=20 >>> +/* Return true if T is a call to =2EDEFERRED_INIT internal function= =2E */=20 >>> +static bool >>> +is_deferred_init_call (tree t) >>> +{ >>> + if (TREE_CODE (t) =3D=3D CALL_EXPR >>> + && CALL_EXPR_IFN (t) =3D=3D IFN_DEFERRED_INIT) >>> + return true; >>> + return false; >>> +} >>> + >>>=20 >>> -/* Return true if T is a GIMPLE rvalue, i=2Ee=2E an identifier or a c= onstant=2E */ >>> +/* Return true if T is a GIMPLE rvalue, i=2Ee=2E an identifier or a c= onstant, >>> + or a call to =2EDEFERRED_INIT internal function because the call t= o >>> + =2EDEFERRED_INIT will eventually be expanded as a constant=2E */ >>>=20 >>> bool >>> is_gimple_val (tree t) >>> @@ -799,7 +811,8 @@ is_gimple_val (tree t) >>> && !is_gimple_reg (t)) >>> return false; >>>=20 >>> - return (is_gimple_variable (t) || is_gimple_min_invariant (t)); >>> + return (is_gimple_variable (t) || is_gimple_min_invariant (t) >>> + || is_deferred_init_call (t)); >>> } >>>=20 >>> With this change, the temporary variable will not be created for =E2= =80=9Caddress taken=E2=80=9D auto variable, and uninitialized analysis does= not need any change=2E=20 >>> Everything works well=2E >>>=20 >>> And I believe that treating =E2=80=9Ccall to =2EDEFERRED_INIT=E2=80=9D= as =E2=80=9Cis_gimple_val=E2=80=9D is reasonable since this call actually = is a constant=2E >>>=20 >>> Let me know if you have any objection on this solution=2E >>=20 >> Yeah, I object to this solution=2E > >Can you explain what=E2=80=99s the major issue for this solution?=20 It punches a hole into the GIMPLE IL which is very likely incomplete and w= ill cause issues elsewhere=2E In particular the corresponding type check is= missing and not computable since the RHS of this call isn't separately ava= ilable=2E=20 If you go down this route then you should have added a new constant class = tree code instead of an internal function call=2E=20 >To me, treating =E2=80=9Ccall to =2EDEFERRED_INIT=E2=80=9D as =E2=80=9Ci= s_gimple_val=E2=80=9D is reasonable since this call actually is a constant= =2E Sure, but it's not represented as such=2E=20 Richard=2E=20 >Thanks=2E > >Qing >> Richard=2E >>=20 >>> thanks=2E >>>=20 >>> Qing >>>=20 >>>> On Aug 11, 2021, at 3:30 PM, Qing Zhao via Gcc-patches wrote: >>>>=20 >>>> Hi,=20 >>>>=20 >>>> I met another issue for =E2=80=9Caddress taken=E2=80=9D auto variable= , see below for details: >>>>=20 >>>> **** the testing case: (gcc/testsuite/gcc=2Edg/uninit-16=2Ec) >>>>=20 >>>> int foo, bar; >>>>=20 >>>> static >>>> void decode_reloc(int reloc, int *is_alt) >>>> { >>>> if (reloc >=3D 20) >>>> *is_alt =3D 1; >>>> else if (reloc >=3D 10) >>>> *is_alt =3D 0; >>>> } >>>>=20 >>>> void testfunc() >>>> { >>>> int alt_reloc; >>>>=20 >>>> decode_reloc(foo, &alt_reloc); >>>>=20 >>>> if (alt_reloc) /* { dg-warning "may be used uninitialized" } */ >>>> bar =3D 42; >>>> } >>>>=20 >>>> ****When compiled with -ftrivial-auto-var-init=3Dzero -O2 -Wuninitial= ized -fdump-tree-all: >>>>=20 >>>> =2E*************gimple dump: >>>>=20 >>>> void testfunc () >>>> {=20 >>>> int alt_reloc; >>>>=20 >>>> try >>>> { >>>> _1 =3D =2EDEFERRED_INIT (4, 2, 0); >>>> alt_reloc =3D _1; >>>> foo=2E0_2 =3D foo; >>>> decode_reloc (foo=2E0_2, &alt_reloc); >>>> alt_reloc=2E1_3 =3D alt_reloc; >>>> if (alt_reloc=2E1_3 !=3D 0) goto ; else goto = ; >>>> : >>>> bar =3D 42; >>>> : >>>> } >>>> finally >>>> { >>>> alt_reloc =3D {CLOBBER}; >>>> } >>>> } >>>>=20 >>>> **************fre1 dump: >>>>=20 >>>> void testfunc () >>>> { >>>> int alt_reloc; >>>> int _1; >>>> int foo=2E0_2; >>>>=20 >>>> : >>>> _1 =3D =2EDEFERRED_INIT (4, 2, 0); >>>> foo=2E0_2 =3D foo; >>>> if (foo=2E0_2 > 19) >>>> goto ; [50=2E00%] >>>> else >>>> goto ; [50=2E00%] >>>>=20 >>>> : >>>> goto ; [100=2E00%] >>>>=20 >>>> : >>>> if (foo=2E0_2 > 9) >>>> goto ; [50=2E00%] >>>> else >>>> goto ; [50=2E00%] >>>>=20 >>>> : >>>> goto ; [100=2E00%] >>>>=20 >>>> : >>>> if (_1 !=3D 0) >>>> goto ; [INV] >>>> else >>>> goto ; [INV] >>>>=20 >>>> : >>>> bar =3D 42; >>>>=20 >>>> : >>>> return; >>>>=20 >>>> } >>>>=20 >>>> From the above IR file after =E2=80=9CFRE=E2=80=9D, we can see that t= he major issue with this IR is: >>>>=20 >>>> The address taken auto variable =E2=80=9Calt_reloc=E2=80=9D has been = completely replaced by the temporary variable =E2=80=9C_1=E2=80=9D in all >>>> the uses of the original =E2=80=9Calt_reloc=E2=80=9D=2E=20 >>>>=20 >>>> The major problem with such IR is, during uninitialized analysis pha= se, the original use of =E2=80=9Calt_reloc=E2=80=9D disappeared completely= =2E >>>> So, the warning cannot be reported=2E >>>>=20 >>>>=20 >>>> My questions: >>>>=20 >>>> 1=2E Is it possible to get the original =E2=80=9Calt_reloc=E2=80=9D t= hrough the temporary variable =E2=80=9C_1=E2=80=9D with some available info= rmation recorded in the IR? >>>> 2=2E If not, then we have to record the relationship between =E2=80= =9Calt_reloc=E2=80=9D and =E2=80=9C_1=E2=80=9D when the original =E2=80=9Ca= lt_reloc=E2=80=9D is replaced by =E2=80=9C_1=E2=80=9D and get such relation= ship during >>>> Uninitialized analysis phase=2E Is this doable? >>>> 3=2E Looks like that for =E2=80=9Caddress taken=E2=80=9D auto variabl= e, if we have to introduce a new temporary variable and split the call to = =2EDEFERRED_INIT into two: >>>>=20 >>>> temp =3D =2EDEFERRED_INIT (4, 2, 0); >>>> alt_reloc =3D temp; >>>>=20 >>>> More issues might possible=2E >>>>=20 >>>> Any comments and suggestions on this issue? >>>>=20 >>>> Qing >>>>=20 >>>> j >>>>> On Aug 11, 2021, at 11:55 AM, Richard Biener w= rote: >>>>>=20 >>>>> On August 11, 2021 6:22:00 PM GMT+02:00, Qing Zhao wrote: >>>>>>=20 >>>>>>=20 >>>>>>> On Aug 11, 2021, at 10:53 AM, Richard Biener = wrote: >>>>>>>=20 >>>>>>> On August 11, 2021 5:30:40 PM GMT+02:00, Qing Zhao wrote: >>>>>>>> I modified the routine =E2=80=9Cgimple_add_init_for_auto_var=E2= =80=9D as the following: >>>>>>>> =3D=3D=3D=3D >>>>>>>> /* Generate initialization to automatic variable DECL based on IN= IT_TYPE=2E >>>>>>>> Build a call to internal const function DEFERRED_INIT: >>>>>>>> 1st argument: SIZE of the DECL; >>>>>>>> 2nd argument: INIT_TYPE; >>>>>>>> 3rd argument: IS_VLA, 0 NO, 1 YES; >>>>>>>>=20 >>>>>>>> as DEFERRED_INIT (SIZE of the DECL, INIT_TYPE, IS_VLA)=2E */ >>>>>>>> static void >>>>>>>> gimple_add_init_for_auto_var (tree decl, >>>>>>>> enum auto_init_type init_type, >>>>>>>> bool is_vla, >>>>>>>> gimple_seq *seq_p) >>>>>>>> { >>>>>>>> gcc_assert (VAR_P (decl) && !DECL_EXTERNAL (decl) && !TREE_STATIC= (decl)); >>>>>>>> gcc_assert (init_type > AUTO_INIT_UNINITIALIZED); >>>>>>>> tree decl_size =3D TYPE_SIZE_UNIT (TREE_TYPE (decl)); >>>>>>>>=20 >>>>>>>> tree init_type_node >>>>>>>> =3D build_int_cst (integer_type_node, (int) init_type); >>>>>>>> tree is_vla_node >>>>>>>> =3D build_int_cst (integer_type_node, (int) is_vla); >>>>>>>>=20 >>>>>>>> tree call =3D build_call_expr_internal_loc (UNKNOWN_LOCATION, IFN= _DEFERRED_INIT, >>>>>>>> TREE_TYPE (decl), 3, >>>>>>>> decl_size, init_type_node, >>>>>>>> is_vla_node); >>>>>>>>=20 >>>>>>>> /* If this DECL is a VLA, a temporary address variable for it has= been >>>>>>>> created, the replacement for DECL is recorded in DECL_VALUE_EXPR = (decl), >>>>>>>> we should use it as the LHS of the call=2E */ >>>>>>>>=20 >>>>>>>> tree lhs_call >>>>>>>> =3D is_vla ? DECL_VALUE_EXPR (decl) : decl; >>>>>>>> gimplify_assign (lhs_call, call, seq_p); >>>>>>>> } >>>>>>>>=20 >>>>>>>> With this change, the current issue is resolved, the gimple dump = now is: >>>>>>>>=20 >>>>>>>> (*arr=2E1) =3D =2EDEFERRED_INIT (D=2E1952, 2, 1); >>>>>>>>=20 >>>>>>>> However, there is another new issue: >>>>>>>>=20 >>>>>>>> For the following testing case: >>>>>>>>=20 >>>>>>>> =3D=3D=3D=3D=3D=3D >>>>>>>> [opc@qinzhao-ol8u3-x86 gcc]$ cat t=2Ec >>>>>>>> int bar; >>>>>>>>=20 >>>>>>>> extern void decode_reloc(int *); >>>>>>>>=20 >>>>>>>> void testfunc() >>>>>>>> { >>>>>>>> int alt_reloc; >>>>>>>>=20 >>>>>>>> decode_reloc(&alt_reloc); >>>>>>>>=20 >>>>>>>> if (alt_reloc) /* { dg-warning "may be used uninitialized" } */ >>>>>>>> bar =3D 42;=20 >>>>>>>> } >>>>>>>> =3D=3D=3D=3D=3D >>>>>>>>=20 >>>>>>>> In the above, the auto var =E2=80=9Calt_reloc=E2=80=9D is address= taken, then the gimple dump for it when compiled with -ftrivial-auto-var-i= nit=3Dzero is: >>>>>>>>=20 >>>>>>>> void testfunc () >>>>>>>> { >>>>>>>> int alt_reloc; >>>>>>>>=20 >>>>>>>> try >>>>>>>> { >>>>>>>> _1 =3D =2EDEFERRED_INIT (4, 2, 0); >>>>>>>> alt_reloc =3D _1; >>>>>>>> decode_reloc (&alt_reloc); >>>>>>>> alt_reloc=2E0_2 =3D alt_reloc; >>>>>>>> if (alt_reloc=2E0_2 !=3D 0) goto ; else goto ; >>>>>>>> : >>>>>>>> bar =3D 42; >>>>>>>> : >>>>>>>> } >>>>>>>> finally >>>>>>>> { >>>>>>>> alt_reloc =3D {CLOBBER}; >>>>>>>> } >>>>>>>> } >>>>>>>>=20 >>>>>>>> I=2Ee, instead of the expected IR: >>>>>>>>=20 >>>>>>>> alt_reloc =3D =2EDEFERRED_INIT (4, 2, 0); >>>>>>>>=20 >>>>>>>> We got the following: >>>>>>>>=20 >>>>>>>> _1 =3D =2EDEFERRED_INIT (4, 2, 0); >>>>>>>> alt_reloc =3D _1; >>>>>>>>=20 >>>>>>>> I guess the temp =E2=80=9C_1=E2=80=9D is created because =E2=80= =9Calt_reloc=E2=80=9D is address taken=2E=20 >>>>>>>=20 >>>>>>> Yes and no=2E The reason is that alt_reloc is memory (because it i= s address taken) and that GIMPLE says that register typed stores need to us= e a is_gimple_val RHS which the call is not=2E >>>>>>=20 >>>>>> Okay=2E >>>>>>>=20 >>>>>>>> My questions: >>>>>>>>=20 >>>>>>>> Shall we accept such IR for =2EDEFERRED_INIT purpose when the aut= o var is address taken?=20 >>>>>>>=20 >>>>>>> I think so=2E Note it doesn't necessarily need address taking but = any other reason that prevents SSA rewriting the variable suffices=2E=20 >>>>>>=20 >>>>>> You mean, in addition to =E2=80=9Caddress taken=E2=80=9D, there are= other situations that will introduce such IR: >>>>>>=20 >>>>>> temp =3D =2EDEFERRED_INIT(); >>>>>> auto_var =3D temp; >>>>>>=20 >>>>>> So, such IR is unavoidable and we have to handle it? >>>>>=20 >>>>> Yes=2E=20 >>>>>=20 >>>>>> If we have to handle it, what=E2=80=99 the best way to do it? >>>>>>=20 >>>>>> The solution in my mind is: >>>>>> 1=2E During uninitialized analysis phase, following the data flow t= o connect =2EDEFERRED_INIT to =E2=80=9Cauto_var=E2=80=9D, and then decide t= hat =E2=80=9Cauto_var=E2=80=9D is uninitialized=2E >>>>>=20 >>>>> Yes=2E Basically if there's an artificial variable auto initialized = you have to look at its uses=2E=20 >>>>>=20 >>>>>> 2=2E During RTL expansion, following the data flow to connect =2EDE= FERRED_INIT to =E2=80=9Cauto_var=E2=80=9D, and then delete =E2=80=9Ctemp=E2= =80=9D, and then expand =2EDEFERRED_INIT to auto_var=2E >>>>>=20 >>>>> That shouldn't be necessary=2E You'd initialize a temporary register= which is then copied to the real variable=2E That's good enough and should= be optimized by the RTL pipeline=2E=20 >>>>>=20 >>>>>> Let me know your comments and suggestions on this=2E >>>>>>=20 >>>>>>=20 >>>>>>>=20 >>>>>>> The only other option is to force=2E DEFERED_INIT making the LHS a= ddress taken which I think could be achieved by passing it the address as a= rgument instead of having a LHS=2E But let's not go down this route - it wi= ll have quite bad behavior on alias analysis and optimization=2E=20 >>>>>>=20 >>>>>> Okay=2E >>>>>>=20 >>>>>> Qing >>>>>>>=20 >>>>>>>> If so, =E2=80=9Cuninitialized analysis=E2=80=9D phase need to be = further adjusted to specially handle such IR=2E=20 >>>>>>>>=20 >>>>>>>> If not, what should we do when the auto var is address taken? >>>=20 >>>=20 >>=20 >> --=20 >> Richard Biener >> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg= , >> Germany; GF: Felix Imend=C3=B6rffer; HRB 36809 (AG Nuernberg) >