Subject: Re: [Patch] LTO: Force externally_visible for offload_vars/funcs (PR97179)
From: Tobias Burnus
To: Richard Biener, Tobias Burnus
CC: gcc-patches, Jakub Jelinek, Jan Hubicka
Date: Thu, 24 Sep 2020 09:47:20 +0200
Message-ID: <54a8767f-3cfe-a3ca-6149-0a6d3ee0b6d9@codesourcery.com>

On 9/24/20 9:03 AM, Richard Biener wrote:
> Hmm, but offload_vars and offload_funcs do not need to be exported
> since they get stored into tables with addresses pointing to them
> (and that table is exported).

Granted, but the x86-64 linker does not seem to be able to resolve the
symbol if the table is in a.ltrans0.ltrans.o and the variable or function
is in a.ltrans1.ltrans.o.

That's both host/x86-64 code; the linker might not see that the table is
used by a dynamic library – but still it should resolve the links,
shouldn't it?

Possibly, the 'externally_visible = 1' in my code is also a red herring;
it also works by using:

  TREE_PUBLIC (decl) = 1;
  gcc_assert (!node->offloadable);
  node->offloadable = 1;

and below

  if (node->offloadable)
    {
      node->offloadable = 0;
      validize_symbol_for_target (node);
      continue;
    }

Namely: PUBLIC + avoid calling promote_symbol.

> Note that ultimatively the desired visibility is determined by
> the linker and communicated via the resolution file to the WPA
> stage. I'm not sure whether both host and offload code participate
> in the same link and thus if the offload tables are properly
> seen as being referenced

This could be the problem.
The device part is linked by the host/x86-64 linker – but the device's
".o" files are just linked and not processed by 'ld'. (In case of nvptx,
they are host-compiled .o files which contain everything as strings, with
the nvptx code as text – to be passed to the JIT at startup.)

Note that *no* WPA/LTO is done on the device side – there, all generated
files are merely collected, without any inter-file optimizations.
(Sufficient for the code generated by the program, which is all in one
file – but it still would be useful to inline, e.g., libm functions.)

> (for a non-DSO symbols are usually _not_
> force-exported) - so, how is the offload table constructed?

First, the offload tables exist both on the host and on the device(s).
They have to be identical, as otherwise the association between variables
and functions (host vs. device) is lost.

The symbols are added to offload_vars + offload_funcs.

In lto-cgraph.c's output_offload_tables there is the last chance to remove
now unused nodes – once the tables are streamed for device usage, they
cannot be changed. Hence, there one has

  node->force_output = 1;

[Unrelated: this prevents later optimizations, which still could be done;
cf. PR95622]

The table itself is written in omp-offload.c's omp_finish_file.
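
To illustrate why the host and device tables have to be identical, here is
a minimal C sketch. It assumes that an entry is associated with its
counterpart purely by its position in the table; that is an assumption made
for illustration, not a description of the actual runtime. All names
(host_inc, dev_func_table, lookup_device_fn, ...) are hypothetical, and the
"device" functions are modelled as ordinary host functions.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_t) (int);

/* Hypothetical host functions and their "device" counterparts.  */
static int host_inc (int x) { return x + 1; }
static int host_dbl (int x) { return x * 2; }
static int dev_inc (int x) { return x + 1; }
static int dev_dbl (int x) { return x * 2; }

/* A host function that was never added to the table.  */
static int host_unlisted (int x) { return x; }

/* Both tables must list their entries in the same order
   (assumption: the association is by index).  */
static const fn_t host_func_table[] = { host_inc, host_dbl };
static const fn_t dev_func_table[] = { dev_inc, dev_dbl };

#define N_FUNCS (sizeof host_func_table / sizeof host_func_table[0])

static_assert (sizeof host_func_table == sizeof dev_func_table,
               "host and device tables must have the same length");

/* Return the device counterpart of HOST_FN, found via its position in
   the host table, or NULL if HOST_FN is not in the table.  */
static fn_t
lookup_device_fn (fn_t host_fn)
{
  for (size_t i = 0; i < N_FUNCS; i++)
    if (host_func_table[i] == host_fn)
      return dev_func_table[i];
  return NULL;
}
```

With this layout, lookup_device_fn (host_dbl) yields dev_dbl only because
both tables use the same order; dropping or reordering an entry on one side
alone would silently pair a host function with the wrong device function.
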
For the host, the constructor is built in
add_decls_addresses_to_decl_constructor, which does:

  CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE, addr);
  if (is_var)
    CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE, size);

and then in omp_finish_file:

  tree funcs_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
                                get_identifier (".offload_func_table"),
                                funcs_decl_type);
  DECL_USER_ALIGN (funcs_decl) = DECL_USER_ALIGN (vars_decl) = 1;
  SET_DECL_ALIGN (funcs_decl, TYPE_ALIGN (funcs_decl_type));
  DECL_INITIAL (funcs_decl) = ctor_f;
  set_decl_section_name (funcs_decl, OFFLOAD_FUNC_TABLE_SECTION_NAME);
  varpool_node::finalize_decl (vars_decl);

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter