From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 0838938438FF for ; Tue, 17 Aug 2021 08:43:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0838938438FF Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id CD10021C30; Tue, 17 Aug 2021 08:43:00 +0000 (UTC) Received: from murzim.suse.de (murzim.suse.de [10.160.4.192]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id C4AFEA3B81; Tue, 17 Aug 2021 08:43:00 +0000 (UTC) Date: Tue, 17 Aug 2021 10:43:00 +0200 (CEST) From: Richard Biener To: Qing Zhao cc: Martin Jambor , Jakub Jelinek , Kees Cook , Nick Alcock via Gcc-patches , Richard Biener Subject: Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc In-Reply-To: Message-ID: References: <52E29277-1403-4755-901A-528116C43FB8@oracle.com> <517EA40B-9500-4090-8F03-B4A9CECC62F8@oracle.com> User-Agent: Alpine 2.21 (LSU 202 2017-01-01) MIME-Version: 1.0 X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Aug 2021 08:43:18 -0000 On Mon, 16 Aug 2021, Qing Zhao wrote: > > > > On Aug 16, 2021, at 2:40 AM, Richard Biener wrote: > > > > On Thu, 12 Aug 2021, Qing Zhao wrote: > > > >> Hi, Richard, > >> > >> For RTL expansion of call to .DEFERRED_INIT, I changed my code per your suggestions like following: > >> > >> ====================== > >> #define INIT_PATTERN_VALUE 0xFE > >> static void > >> expand_DEFERRED_INIT (internal_fn, gcall *stmt) > >> { > >> tree lhs = gimple_call_lhs (stmt); > >> tree var_size = gimple_call_arg (stmt, 0); > >> enum auto_init_type init_type > >> = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1)); > >> bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2)); > >> > >> tree var_type = TREE_TYPE (lhs); > >> gcc_assert (init_type > AUTO_INIT_UNINITIALIZED); > >> > >> if (is_vla || (!can_native_interpret_type_p (var_type))) > >> { > >> /* If this is a VLA or the type of the variable cannot be natively > >> interpreted, expand to a memset to initialize it. */ > >> if (TREE_CODE (lhs) == SSA_NAME) > >> lhs = SSA_NAME_VAR (lhs); > >> tree var_addr = NULL_TREE; > >> if (is_vla) > >> var_addr = TREE_OPERAND (lhs, 0); > >> else > >> { > >> TREE_ADDRESSABLE (lhs) = 1; > >> var_addr = build_fold_addr_expr (lhs); > >> } > >> tree value = (init_type == AUTO_INIT_PATTERN) ? > >> build_int_cst (unsigned_char_type_node, > >> INIT_PATTERN_VALUE) : > >> build_zero_cst (unsigned_char_type_node); > >> tree m_call = build_call_expr (builtin_decl_implicit (BUILT_IN_MEMSET), > >> 3, var_addr, value, var_size); > >> /* Expand this memset call. */ > >> expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type)); > >> } > >> else > >> { > >> /* If this is not a VLA and the type of the variable can be natively > >> interpreted, expand to assignment to generate better code. */ > >> tree pattern = NULL_TREE; > >> unsigned HOST_WIDE_INT total_bytes > >> = tree_to_uhwi (TYPE_SIZE_UNIT (var_type)); > >> > >> if (init_type == AUTO_INIT_PATTERN) > >> { > >> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > >> memset (buf, INIT_PATTERN_VALUE, total_bytes); > >> pattern = native_interpret_expr (var_type, buf, total_bytes); > >> gcc_assert (pattern); > >> } > >> > >> tree init = (init_type == AUTO_INIT_PATTERN) ? > >> pattern : > >> build_zero_cst (var_type); > >> expand_assignment (lhs, init, false); > >> } > >> } > >> =========================== > >> > >> Now, I used “can_native_interpret_type_p (var_type)” instead of “use_register_for_decl (lhs)” to decide > >> whether to use “memset” or use “assign” to expand this function. > >> > >> However, this exposed an bug that is very hard to be addressed: > >> > >> *******For the testing case: test suite/gcc.dg/uninit-I.c: > >> > >> /* { dg-do compile } */ > >> /* { dg-options "-O2 -Wuninitialized" } */ > >> > >> int sys_msgctl (void) > >> { > >> struct { int mode; } setbuf; > >> return setbuf.mode; /* { dg-warning "'setbuf\.mode' is used" } */ > >> == > >> > >> ******the above auto var “setbuf” has “struct” type, which “can_native_interpret_type_p(var_type)” is false, therefore, > >> Expanding this .DEFERRED_INIT call went down the “memset” expansion route. > >> > >> However, this structure type can be fitted into a register, therefore cannot be taken address anymore at this stage, even though I tried: > >> > >> TREE_ADDRESSABLE (lhs) = 1; > >> var_addr = build_fold_addr_expr (lhs); > >> > >> To create an address variable for it, the expansion still failed at expr.c: line 8412: > >> during RTL pass: expand > >> /home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-I.c:6:24: internal compiler error: in expand_expr_addr_expr_1, at expr.c:8412 > >> 0xd04104 expand_expr_addr_expr_1 > >> ../../latest-gcc/gcc/expr.c:8412 > >> 0xd04a95 expand_expr_addr_expr > >> ../../latest-gcc/gcc/expr.c:8525 > >> 0xd13592 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) > >> ../../latest-gcc/gcc/expr.c:11741 > >> 0xd05142 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) > >> ../../latest-gcc/gcc/expr.c:8713 > >> 0xaed1d3 expand_expr > >> ../../latest-gcc/gcc/expr.h:301 > >> 0xaf0d89 get_memory_rtx > >> ../../latest-gcc/gcc/builtins.c:1370 > >> 0xafb4fb expand_builtin_memset_args > >> ../../latest-gcc/gcc/builtins.c:4102 > >> 0xafacde expand_builtin_memset(tree_node*, rtx_def*, machine_mode) > >> ../../latest-gcc/gcc/builtins.c:3886 > >> 0xe97fb3 expand_DEFERRED_INIT > >> > >> ******That’s the major reason why I chose “use_register_for_decl(lhs)” to decide “memset” expansion or “assign” expansion, “memset” expansion > >> needs to take address of the variable, if the variable has been decided to fit into a register, then its address cannot taken anymore at this stage. > >> > >> ******using “can_native_interpret_type_p” did make the “pattern” generation part much cleaner and simpler, however, looks like it didn’t work correctly. > >> > >> Based on this, I’d like to keep my previous implementation by using “use_register_for_decl” to decide whether to take “memset” expansion or “assign” expansion. > >> Therefore, I might still need to keep the “UGLY” implementation of generatting “pattern” constant for different types? > >> > >> Let me know your opinion on this. > > > > Hmm, I think you can use use_register_for_decl(lhs) to decide to use an > > alternate type to generate the pattern (and feed to > > can_native_interpret_type_p) by using > > lang_hooks.type_for_mode (TYPE_MODE (TREE_TYPE (lhs))). > > Do you mean that the TYPE returned by “lang_hooks.type_for_mode(TYPE_MODE (TREE_TYPE (lhs))” will always satisfy “can_native_interpret_type_p”? > Even for big structure types? I meant that for use_register_for_decl (lhs) the structures will be always small and the structure type will have a mode that is not BLKmode (but for example DImode for struct { int i; int j; }). > i.e, > > tree var_type = TREE_TYPE(lhs); > tree pattern = NULL_TREE; > unsigned HOST_WIDE_INT total_bytes > = tree_to_uhwi (TYPE_SIZE_UNIT (var_type)); > > If (use_register_for_decl(lhs)==false) > { > tree alt_type = lang_hooks.type_for_mode(TYPE_MODE(var_type), TYPE_UNSIGNED(var_type); > If (can_native_interpret_type_p (alt_type)) > { > unsigned char *buf = (unsigned char *) xmalloc (total_bytes); > memset (buf, INIT_PATTERN_VALUE, total_bytes); > pattern = native_interpret_expr (alt_type, buf, total_bytes); > gcc_assert (pattern); > } > else > gcc_unreachable (); > } > > ? > Don’t quite understand here. Please clarify. For !use_register_for_decl you use memset already, but for use_register_for_decl not all types satisfy can_native_interpret_type_p (in particular all struct and union types). But when we use a register for the decl then we can of course directly initialize the register. As said, it would be much cleaner (and maybe also easier) to then simply expand the RTL directly rather than going through expand_assignment. It's also easier to directly see whether the LHS is a MEM_P or a REG_P but then for the MEM_P case it's a bit more "awkward" to use the easy way of the generic expand code (esp. if we eventually want to emit actual calls to memset - do we?) > > You can then > > build the assignment from the pattern as > > > > VIEW_CONVERT (lhs) = pattern_cst; > > What’s the in the above? The type “alt_type” returned by “lang_hooks.type_for_mode(TREE_TYPE(lhs))? > Do I need to build “MODIFY_EXPR” for the above? Yes, the lang_hook.type_for_mode result and no, you could go through expand_assignment. > lhs = build1 (VIEW_CONVERT_EXPR, alt_type, lhs); > tree final = build2 (MODIFY_EXPR, TREE_TYPE (alt_type), lhs, pattern); > > Then how to expand this”final"? > > > > note that more RTL-expand-ish would be to simply expand 'lhs' and > > see whether it's a REG_P or a MEM_P and decide based on that. > > You mean that the current RTL expansion will automatically expand LHS to memset route or assignment route based on whether > LHS is a REG_P or MEM_P? I don’t need to explicitly code for “expand_builtin_memset” or “expand_assign”? No. But you are inside the expander for the internal function call and this is expected to generate RTL. You can simply generate RTL directly without "faking" new GENERIC calls or assignments and expanding those. But lets not go there for now. Richard.