* nvptx: Support global constructors/destructors via 'collect2' [not found] <878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com> @ 2022-12-02 13:35 ` Thomas Schwinge 2022-12-20 8:03 ` [PING] " Thomas Schwinge ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Thomas Schwinge @ 2022-12-02 13:35 UTC (permalink / raw) To: gcc-patches, Tom de Vries [-- Attachment #1: Type: text/plain, Size: 1141 bytes --] Hi! On 2022-12-01T22:13:38+0100, I wrote: > I'm working on support for global constructors/destructors with > GCC/nvptx See "nvptx: Support global constructors/destructors via 'collect2'" attached; OK to push? (... with 'gcc/doc/install.texi' accordingly updated once <https://github.com/MentorEmbedded/nvptx-tools/pull/40> "'nm'" and newlib <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> "nvptx: Implement '_exit' instead of 'exit'" have been merged; any comments to those?) Per my quick scanning of 'gcc/config.gcc' history, for more than two decades, there was a clear trend to remove 'use_collect2=yes' configurations; now finally a new one is being added -- making sure we're not slowly dispensing with the need for the early 1990s piece of work that 'gcc/collect2*' is... ;'-P Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-nvptx-Support-global-constructors-destructors-via-co.patch --] [-- Type: text/x-diff, Size: 10745 bytes --] From ba5f6471d39e684fb740523651138a90a1b63cf9 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Sun, 13 Nov 2022 14:19:30 +0100 Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2' The function attributes 'constructor', 'destructor', and 'init_priority' now work, as do the C++ features making use of this. Test cases with effective target 'global_constructor' and 'init_priority' now generally work, and 'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented: global constructors not supported on this target". This depends on <https://github.com/MentorEmbedded/nvptx-tools/pull/40> "'nm'" generally, and for global destructors support: newlib <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> "nvptx: Implement '_exit' instead of 'exit'". gcc/ * collect2.cc (write_c_file_glob): Allow for 'COLLECT2_MAIN_REFERENCE' override. * config.gcc <case ${target} in nvptx-*>: Set 'use_collect2=yes'. * config/nvptx/nvptx.h: Adjust. gcc/testsuite/ * gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is 'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may apper in identifiers. * lib/target-supports.exp (check_effective_target_global_constructor): Enable for nvptx. libgcc/ * config.host <case ${host} in nvptx-*>: Add 'crtbegin.o', 'crtend.o' to 'extra_parts'. * config/nvptx/crt0.c: Invoke '__do_global_ctors', '__do_global_dtors'. * config/nvptx/crtstuff.c: New. * config/nvptx/t-nvptx: Adjust. --- gcc/collect2.cc | 4 ++ gcc/config.gcc | 1 + gcc/config/nvptx/nvptx.h | 35 ++++++++++- .../no_profile_instrument_function-attr-1.c | 2 +- gcc/testsuite/lib/target-supports.exp | 3 +- libgcc/config.host | 2 +- libgcc/config/nvptx/crt0.c | 5 ++ libgcc/config/nvptx/crtstuff.c | 58 +++++++++++++++++++ libgcc/config/nvptx/t-nvptx | 15 ++++- 9 files changed, 118 insertions(+), 7 deletions(-) create mode 100644 libgcc/config/nvptx/crtstuff.c diff --git a/gcc/collect2.cc b/gcc/collect2.cc index d81c7f28f16..945a9ff86dd 100644 --- a/gcc/collect2.cc +++ b/gcc/collect2.cc @@ -2238,8 +2238,12 @@ write_c_file_glob (FILE *stream, const char *name ATTRIBUTE_UNUSED) fprintf (stream, "\tdereg_frame,\n"); fprintf (stream, "\t0\n};\n\n"); +# ifdef COLLECT2_MAIN_REFERENCE + fprintf (stream, "%s\n\n", COLLECT2_MAIN_REFERENCE); +# else fprintf (stream, "extern entry_pt %s;\n", NAME__MAIN); fprintf (stream, "entry_pt *__main_reference = %s;\n\n", NAME__MAIN); +# endif } #endif /* ! LD_INIT_SWITCH */ diff --git a/gcc/config.gcc b/gcc/config.gcc index b5eda046033..9b27efd5f51 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -2783,6 +2783,7 @@ nvptx-*) tm_file="${tm_file} newlib-stdint.h" use_gcc_stdint=wrap tmake_file="nvptx/t-nvptx" + use_collect2=yes if test x$enable_as_accelerator = xyes; then extra_programs="${extra_programs} mkoffload\$(exeext)" tm_file="${tm_file} nvptx/offload.h" diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h index b1cbe5d417b..bc1021a8031 100644 --- a/gcc/config/nvptx/nvptx.h +++ b/gcc/config/nvptx/nvptx.h @@ -35,7 +35,39 @@ '../../gcc.cc:asm_options', 'HAVE_GNU_AS'. */ #define ASM_SPEC "%{v}" -#define STARTFILE_SPEC "%{mmainkernel:crt0.o%s}" +#define STARTFILE_SPEC \ + STARTFILE_SPEC_MMAINKERNEL \ + " " STARTFILE_SPEC_CDTOR + +#define ENDFILE_SPEC \ + ENDFILE_SPEC_CDTOR + +#define STARTFILE_SPEC_MMAINKERNEL "%{mmainkernel:crt0.o%s}" + +/* Support for global constructors/destructors is implemented via + 'collect2' and the following helpers. */ + +#define STARTFILE_SPEC_CDTOR "crtbegin.o%s" + +#define ENDFILE_SPEC_CDTOR "crtend.o%s" + +/* nvptx does its own wrapping of 'main' + (see 'libgcc/config/nvptx/crt0.c:__main'). */ +#define HAS_INIT_SECTION + +/* For example with old Nvidia Tesla K20c, Driver Version: 361.93.02, the + function pointers stored in the '__CTOR_LIST__', '__DTOR_LIST__' arrays + evidently evaluate to NULL in JIT compilation. Avoiding the use of + assembler names ('write_list_with_asm') doesn't help, but defining a dummy + function next to the arrays apparently does work around this issue... + + The default '__main_reference' synthesized by 'collect2' refers to our + 'crt0.o' '__main' function with incompatible signature: + + error : Function '__main' not declared __global__ in all source files + + Address both these issues via 'COLLECT2_MAIN_REFERENCE'. */ +#define COLLECT2_MAIN_REFERENCE "__attribute__((unused)) static void dummy () {}" #define TARGET_CPU_CPP_BUILTINS() nvptx_cpu_cpp_builtins () @@ -354,7 +386,6 @@ struct GTY(()) machine_function #define MOVE_MAX 8 #define MOVE_RATIO(SPEED) 4 #define FUNCTION_MODE QImode -#define HAS_INIT_SECTION 1 /* The C++ front end insists to link against libstdc++ -- which we don't build. Tell it to instead link against the innocuous libgcc. */ diff --git a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c index 909f8a68479..5b4101cf596 100644 --- a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c +++ b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c @@ -18,7 +18,7 @@ int main () return foo (); } -/* { dg-final { scan-tree-dump-times "__gcov0\[._\]main.* = PROF_edge_counter" 1 "optimized"} } */ +/* { dg-final { scan-tree-dump-times "__gcov0\[$._\]main.* = PROF_edge_counter" 1 "optimized"} } */ /* { dg-final { scan-tree-dump-times "__gcov_indirect_call_profiler_v" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_time_profiler_counter = " 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_init" 1 "optimized" } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index d2de761adb5..ba9b045ebd4 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -907,8 +907,7 @@ proc check_effective_target_nonlocal_goto {} { # Return 1 if global constructors are supported, 0 otherwise. proc check_effective_target_global_constructor {} { - if { [istarget nvptx-*-*] - || [istarget bpf-*-*] } { + if { [istarget bpf-*-*] } { return 0 } return 1 diff --git a/libgcc/config.host b/libgcc/config.host index eb23abe89f5..25072f41860 100644 --- a/libgcc/config.host +++ b/libgcc/config.host @@ -1499,7 +1499,7 @@ m32c-*-elf*) ;; nvptx-*) tmake_file="$tmake_file nvptx/t-nvptx" - extra_parts="crt0.o" + extra_parts="crt0.o crtbegin.o crtend.o" ;; *) echo "*** Configuration ${host} not supported" 1>&2 diff --git a/libgcc/config/nvptx/crt0.c b/libgcc/config/nvptx/crt0.c index abf047327ae..6a49790f479 100644 --- a/libgcc/config/nvptx/crt0.c +++ b/libgcc/config/nvptx/crt0.c @@ -19,6 +19,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see <http://www.gnu.org/licenses/>. */ +#include "gbl-ctors.h" + int *__exitval_ptr; extern void __attribute__((noreturn)) exit (int status); @@ -47,5 +49,8 @@ __main (int *rval_ptr, int argc, void **argv) __nvptx_stacks[0] = stack + sizeof stack; __nvptx_uni[0] = 0; + __do_global_ctors (); + atexit (__do_global_dtors); + exit (main (argc, argv)); } diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c new file mode 100644 index 00000000000..0823fc49901 --- /dev/null +++ b/libgcc/config/nvptx/crtstuff.c @@ -0,0 +1,58 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. */ + +#include "gbl-ctors.h" + +/* The common 'crtstuff.c' doesn't quite provide what we need, so we roll our + own. + + There's no technical reason in this configuration here to split the two + functions '__do_global_ctors' and '__do_global_ctors' into two separate + files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we + do so anyway, for symmetry with other configurations. */ + +#ifdef CRT_BEGIN + +void +__do_global_ctors (void) +{ + DO_GLOBAL_CTORS_BODY; +} + +#elif defined(CRT_END) /* ! CRT_BEGIN */ + +void +__do_global_dtors (void) +{ + /* In this configuration here, there's no way that "this routine is run more + than once [...] when exit is called recursively": for nvptx target, the + call to '__do_global_dtors' is registered via 'atexit', which doesn't + re-enter a function already run. + Therefore, we do *not* "arrange to remember where in the list we left off + processing". */ + func_ptr *p; + for (p = __DTOR_LIST__ + 1; *p; ) + (*p++) (); +} + +#else /* ! CRT_BEGIN && ! CRT_END */ +#error "One of CRT_BEGIN or CRT_END must be defined." +#endif diff --git a/libgcc/config/nvptx/t-nvptx b/libgcc/config/nvptx/t-nvptx index ede0bf0f87d..9a0454c3a4d 100644 --- a/libgcc/config/nvptx/t-nvptx +++ b/libgcc/config/nvptx/t-nvptx @@ -3,7 +3,7 @@ LIB2ADD=$(srcdir)/config/nvptx/reduction.c \ $(srcdir)/config/nvptx/atomic.c LIB2ADDEH= -LIB2FUNCS_EXCLUDE=__main +LIB2FUNCS_EXCLUDE= crt0.o: $(srcdir)/config/nvptx/crt0.c $(crt_compile) -c $< @@ -12,3 +12,16 @@ crt0.o: $(srcdir)/config/nvptx/crt0.c # support it, and it may cause the build to fail, because of alloca usage, for # example. INHIBIT_LIBC_CFLAGS = -Dinhibit_libc + +# Support for global constructors/destructors is implemented via +# 'collect2' and the following helpers. + +LIB2FUNCS_EXCLUDE += __main + +CUSTOM_CRTSTUFF = yes + +crtbegin.o: $(srcdir)/config/nvptx/crtstuff.c + $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN + +crtend.o: $(srcdir)/config/nvptx/crtstuff.c + $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END -- 2.25.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PING] nvptx: Support global constructors/destructors via 'collect2' 2022-12-02 13:35 ` nvptx: Support global constructors/destructors via 'collect2' Thomas Schwinge @ 2022-12-20 8:03 ` Thomas Schwinge 2023-01-11 11:48 ` [PING^2] " Thomas Schwinge 2023-01-24 9:01 ` Make 'libgcc/config/nvptx/crt0.c' build '--without-headers' (was: [PING] nvptx: Support global constructors/destructors via 'collect2') Thomas Schwinge 2022-12-23 13:35 ` nvptx: Support global constructors/destructors via 'collect2' for offloading (was: " Thomas Schwinge 2023-01-20 20:41 ` [og12] nvptx: Support global constructors/destructors via 'collect2' Thomas Schwinge 2 siblings, 2 replies; 10+ messages in thread From: Thomas Schwinge @ 2022-12-20 8:03 UTC (permalink / raw) To: gcc-patches, Tom de Vries [-- Attachment #1: Type: text/plain, Size: 1433 bytes --] Hi! Ping. Minor change in the attached "nvptx: Support global constructors/destructors via 'collect2'": for 'atexit', add '#include <stdlib.h>' to 'libgcc/config/nvptx/crt0.c'. Grüße Thomas On 2022-12-02T14:35:35+0100, I wrote: > Hi! > > On 2022-12-01T22:13:38+0100, I wrote: >> I'm working on support for global constructors/destructors with >> GCC/nvptx > > See "nvptx: Support global constructors/destructors via 'collect2'" > attached; OK to push? (... with 'gcc/doc/install.texi' accordingly > updated once <https://github.com/MentorEmbedded/nvptx-tools/pull/40> > "'nm'" and newlib > <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> > "nvptx: Implement '_exit' instead of 'exit'" have been merged; any > comments to those?) > > Per my quick scanning of 'gcc/config.gcc' history, for more than two > decades, there was a clear trend to remove 'use_collect2=yes' > configurations; now finally a new one is being added -- making sure we're > not slowly dispensing with the need for the early 1990s piece of work > that 'gcc/collect2*' is... ;'-P > > > Grüße > Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-nvptx-Support-global-constructors-destructors-via-co.patch --] [-- Type: text/x-diff, Size: 10784 bytes --] From 0e7cf5a9f83c3a82eafa126886e5d92651bfbb30 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Sun, 13 Nov 2022 14:19:30 +0100 Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2' The function attributes 'constructor', 'destructor', and 'init_priority' now work, as do the C++ features making use of this. Test cases with effective target 'global_constructor' and 'init_priority' now generally work, and 'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented: global constructors not supported on this target". This depends on <https://github.com/MentorEmbedded/nvptx-tools/pull/40> "'nm'" generally, and for global destructors support: newlib <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> "nvptx: Implement '_exit' instead of 'exit'". gcc/ * collect2.cc (write_c_file_glob): Allow for 'COLLECT2_MAIN_REFERENCE' override. * config.gcc <case ${target} in nvptx-*>: Set 'use_collect2=yes'. * config/nvptx/nvptx.h: Adjust. gcc/testsuite/ * gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is 'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may apper in identifiers. * lib/target-supports.exp (check_effective_target_global_constructor): Enable for nvptx. libgcc/ * config.host <case ${host} in nvptx-*>: Add 'crtbegin.o', 'crtend.o' to 'extra_parts'. * config/nvptx/crt0.c: Invoke '__do_global_ctors', '__do_global_dtors'. * config/nvptx/crtstuff.c: New. * config/nvptx/t-nvptx: Adjust. --- gcc/collect2.cc | 4 ++ gcc/config.gcc | 1 + gcc/config/nvptx/nvptx.h | 35 ++++++++++- .../no_profile_instrument_function-attr-1.c | 2 +- gcc/testsuite/lib/target-supports.exp | 3 +- libgcc/config.host | 2 +- libgcc/config/nvptx/crt0.c | 6 ++ libgcc/config/nvptx/crtstuff.c | 58 +++++++++++++++++++ libgcc/config/nvptx/t-nvptx | 15 ++++- 9 files changed, 119 insertions(+), 7 deletions(-) create mode 100644 libgcc/config/nvptx/crtstuff.c diff --git a/gcc/collect2.cc b/gcc/collect2.cc index d81c7f28f16a..945a9ff86dda 100644 --- a/gcc/collect2.cc +++ b/gcc/collect2.cc @@ -2238,8 +2238,12 @@ write_c_file_glob (FILE *stream, const char *name ATTRIBUTE_UNUSED) fprintf (stream, "\tdereg_frame,\n"); fprintf (stream, "\t0\n};\n\n"); +# ifdef COLLECT2_MAIN_REFERENCE + fprintf (stream, "%s\n\n", COLLECT2_MAIN_REFERENCE); +# else fprintf (stream, "extern entry_pt %s;\n", NAME__MAIN); fprintf (stream, "entry_pt *__main_reference = %s;\n\n", NAME__MAIN); +# endif } #endif /* ! LD_INIT_SWITCH */ diff --git a/gcc/config.gcc b/gcc/config.gcc index 951902338205..fec67d7b6e40 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -2784,6 +2784,7 @@ nvptx-*) tm_file="${tm_file} newlib-stdint.h" use_gcc_stdint=wrap tmake_file="nvptx/t-nvptx" + use_collect2=yes if test x$enable_as_accelerator = xyes; then extra_programs="${extra_programs} mkoffload\$(exeext)" tm_file="${tm_file} nvptx/offload.h" diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h index dc676dcb5fc5..235c1e4d99d5 100644 --- a/gcc/config/nvptx/nvptx.h +++ b/gcc/config/nvptx/nvptx.h @@ -35,7 +35,39 @@ '../../gcc.cc:asm_options', 'HAVE_GNU_AS'. */ #define ASM_SPEC "%{v}" -#define STARTFILE_SPEC "%{mmainkernel:crt0.o%s}" +#define STARTFILE_SPEC \ + STARTFILE_SPEC_MMAINKERNEL \ + " " STARTFILE_SPEC_CDTOR + +#define ENDFILE_SPEC \ + ENDFILE_SPEC_CDTOR + +#define STARTFILE_SPEC_MMAINKERNEL "%{mmainkernel:crt0.o%s}" + +/* Support for global constructors/destructors is implemented via + 'collect2' and the following helpers. */ + +#define STARTFILE_SPEC_CDTOR "crtbegin.o%s" + +#define ENDFILE_SPEC_CDTOR "crtend.o%s" + +/* nvptx does its own wrapping of 'main' + (see 'libgcc/config/nvptx/crt0.c:__main'). */ +#define HAS_INIT_SECTION + +/* For example with old Nvidia Tesla K20c, Driver Version: 361.93.02, the + function pointers stored in the '__CTOR_LIST__', '__DTOR_LIST__' arrays + evidently evaluate to NULL in JIT compilation. Avoiding the use of + assembler names ('write_list_with_asm') doesn't help, but defining a dummy + function next to the arrays apparently does work around this issue... + + The default '__main_reference' synthesized by 'collect2' refers to our + 'crt0.o' '__main' function with incompatible signature: + + error : Function '__main' not declared __global__ in all source files + + Address both these issues via 'COLLECT2_MAIN_REFERENCE'. */ +#define COLLECT2_MAIN_REFERENCE "__attribute__((unused)) static void dummy () {}" #define TARGET_CPU_CPP_BUILTINS() nvptx_cpu_cpp_builtins () @@ -348,7 +380,6 @@ struct GTY(()) machine_function #define MOVE_MAX 8 #define MOVE_RATIO(SPEED) 4 #define FUNCTION_MODE QImode -#define HAS_INIT_SECTION 1 /* The C++ front end insists to link against libstdc++ -- which we don't build. Tell it to instead link against the innocuous libgcc. */ diff --git a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c index 909f8a684791..5b4101cf596d 100644 --- a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c +++ b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c @@ -18,7 +18,7 @@ int main () return foo (); } -/* { dg-final { scan-tree-dump-times "__gcov0\[._\]main.* = PROF_edge_counter" 1 "optimized"} } */ +/* { dg-final { scan-tree-dump-times "__gcov0\[$._\]main.* = PROF_edge_counter" 1 "optimized"} } */ /* { dg-final { scan-tree-dump-times "__gcov_indirect_call_profiler_v" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_time_profiler_counter = " 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_init" 1 "optimized" } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index ea06e21c3a14..b1b1c5b36bc2 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -907,8 +907,7 @@ proc check_effective_target_nonlocal_goto {} { # Return 1 if global constructors are supported, 0 otherwise. proc check_effective_target_global_constructor {} { - if { [istarget nvptx-*-*] - || [istarget bpf-*-*] } { + if { [istarget bpf-*-*] } { return 0 } return 1 diff --git a/libgcc/config.host b/libgcc/config.host index eb23abe89f5e..25072f41860c 100644 --- a/libgcc/config.host +++ b/libgcc/config.host @@ -1499,7 +1499,7 @@ m32c-*-elf*) ;; nvptx-*) tmake_file="$tmake_file nvptx/t-nvptx" - extra_parts="crt0.o" + extra_parts="crt0.o crtbegin.o crtend.o" ;; *) echo "*** Configuration ${host} not supported" 1>&2 diff --git a/libgcc/config/nvptx/crt0.c b/libgcc/config/nvptx/crt0.c index abf047327ae7..860e2bfacadd 100644 --- a/libgcc/config/nvptx/crt0.c +++ b/libgcc/config/nvptx/crt0.c @@ -19,6 +19,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see <http://www.gnu.org/licenses/>. */ +#include <stdlib.h> +#include "gbl-ctors.h" + int *__exitval_ptr; extern void __attribute__((noreturn)) exit (int status); @@ -47,5 +50,8 @@ __main (int *rval_ptr, int argc, void **argv) __nvptx_stacks[0] = stack + sizeof stack; __nvptx_uni[0] = 0; + __do_global_ctors (); + atexit (__do_global_dtors); + exit (main (argc, argv)); } diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c new file mode 100644 index 000000000000..0823fc499019 --- /dev/null +++ b/libgcc/config/nvptx/crtstuff.c @@ -0,0 +1,58 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. */ + +#include "gbl-ctors.h" + +/* The common 'crtstuff.c' doesn't quite provide what we need, so we roll our + own. + + There's no technical reason in this configuration here to split the two + functions '__do_global_ctors' and '__do_global_ctors' into two separate + files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we + do so anyway, for symmetry with other configurations. */ + +#ifdef CRT_BEGIN + +void +__do_global_ctors (void) +{ + DO_GLOBAL_CTORS_BODY; +} + +#elif defined(CRT_END) /* ! CRT_BEGIN */ + +void +__do_global_dtors (void) +{ + /* In this configuration here, there's no way that "this routine is run more + than once [...] when exit is called recursively": for nvptx target, the + call to '__do_global_dtors' is registered via 'atexit', which doesn't + re-enter a function already run. + Therefore, we do *not* "arrange to remember where in the list we left off + processing". */ + func_ptr *p; + for (p = __DTOR_LIST__ + 1; *p; ) + (*p++) (); +} + +#else /* ! CRT_BEGIN && ! CRT_END */ +#error "One of CRT_BEGIN or CRT_END must be defined." +#endif diff --git a/libgcc/config/nvptx/t-nvptx b/libgcc/config/nvptx/t-nvptx index ede0bf0f87dd..9a0454c3a4d0 100644 --- a/libgcc/config/nvptx/t-nvptx +++ b/libgcc/config/nvptx/t-nvptx @@ -3,7 +3,7 @@ LIB2ADD=$(srcdir)/config/nvptx/reduction.c \ $(srcdir)/config/nvptx/atomic.c LIB2ADDEH= -LIB2FUNCS_EXCLUDE=__main +LIB2FUNCS_EXCLUDE= crt0.o: $(srcdir)/config/nvptx/crt0.c $(crt_compile) -c $< @@ -12,3 +12,16 @@ crt0.o: $(srcdir)/config/nvptx/crt0.c # support it, and it may cause the build to fail, because of alloca usage, for # example. INHIBIT_LIBC_CFLAGS = -Dinhibit_libc + +# Support for global constructors/destructors is implemented via +# 'collect2' and the following helpers. + +LIB2FUNCS_EXCLUDE += __main + +CUSTOM_CRTSTUFF = yes + +crtbegin.o: $(srcdir)/config/nvptx/crtstuff.c + $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN + +crtend.o: $(srcdir)/config/nvptx/crtstuff.c + $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END -- 2.35.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PING^2] nvptx: Support global constructors/destructors via 'collect2' 2022-12-20 8:03 ` [PING] " Thomas Schwinge @ 2023-01-11 11:48 ` Thomas Schwinge 2023-01-24 9:01 ` Make 'libgcc/config/nvptx/crt0.c' build '--without-headers' (was: [PING] nvptx: Support global constructors/destructors via 'collect2') Thomas Schwinge 1 sibling, 0 replies; 10+ messages in thread From: Thomas Schwinge @ 2023-01-11 11:48 UTC (permalink / raw) To: gcc-patches, Tom de Vries [-- Attachment #1: Type: text/plain, Size: 1558 bytes --] Hi! Ping. Grüße Thomas On 2022-12-20T09:03:51+0100, I wrote: > Hi! > > Ping. > > > Minor change in the attached > "nvptx: Support global constructors/destructors via 'collect2'": for > 'atexit', add '#include <stdlib.h>' to 'libgcc/config/nvptx/crt0.c'. > > > Grüße > Thomas > > > On 2022-12-02T14:35:35+0100, I wrote: >> Hi! >> >> On 2022-12-01T22:13:38+0100, I wrote: >>> I'm working on support for global constructors/destructors with >>> GCC/nvptx >> >> See "nvptx: Support global constructors/destructors via 'collect2'" >> attached; OK to push? (... with 'gcc/doc/install.texi' accordingly >> updated once <https://github.com/MentorEmbedded/nvptx-tools/pull/40> >> "'nm'" and newlib >> <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> >> "nvptx: Implement '_exit' instead of 'exit'" have been merged; any >> comments to those?) >> >> Per my quick scanning of 'gcc/config.gcc' history, for more than two >> decades, there was a clear trend to remove 'use_collect2=yes' >> configurations; now finally a new one is being added -- making sure we're >> not slowly dispensing with the need for the early 1990s piece of work >> that 'gcc/collect2*' is... ;'-P >> >> >> Grüße >> Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-nvptx-Support-global-constructors-destructors-via-co.patch --] [-- Type: text/x-diff, Size: 10784 bytes --] From 0e7cf5a9f83c3a82eafa126886e5d92651bfbb30 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Sun, 13 Nov 2022 14:19:30 +0100 Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2' The function attributes 'constructor', 'destructor', and 'init_priority' now work, as do the C++ features making use of this. Test cases with effective target 'global_constructor' and 'init_priority' now generally work, and 'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented: global constructors not supported on this target". This depends on <https://github.com/MentorEmbedded/nvptx-tools/pull/40> "'nm'" generally, and for global destructors support: newlib <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> "nvptx: Implement '_exit' instead of 'exit'". gcc/ * collect2.cc (write_c_file_glob): Allow for 'COLLECT2_MAIN_REFERENCE' override. * config.gcc <case ${target} in nvptx-*>: Set 'use_collect2=yes'. * config/nvptx/nvptx.h: Adjust. gcc/testsuite/ * gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is 'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may apper in identifiers. * lib/target-supports.exp (check_effective_target_global_constructor): Enable for nvptx. libgcc/ * config.host <case ${host} in nvptx-*>: Add 'crtbegin.o', 'crtend.o' to 'extra_parts'. * config/nvptx/crt0.c: Invoke '__do_global_ctors', '__do_global_dtors'. * config/nvptx/crtstuff.c: New. * config/nvptx/t-nvptx: Adjust. --- gcc/collect2.cc | 4 ++ gcc/config.gcc | 1 + gcc/config/nvptx/nvptx.h | 35 ++++++++++- .../no_profile_instrument_function-attr-1.c | 2 +- gcc/testsuite/lib/target-supports.exp | 3 +- libgcc/config.host | 2 +- libgcc/config/nvptx/crt0.c | 6 ++ libgcc/config/nvptx/crtstuff.c | 58 +++++++++++++++++++ libgcc/config/nvptx/t-nvptx | 15 ++++- 9 files changed, 119 insertions(+), 7 deletions(-) create mode 100644 libgcc/config/nvptx/crtstuff.c diff --git a/gcc/collect2.cc b/gcc/collect2.cc index d81c7f28f16a..945a9ff86dda 100644 --- a/gcc/collect2.cc +++ b/gcc/collect2.cc @@ -2238,8 +2238,12 @@ write_c_file_glob (FILE *stream, const char *name ATTRIBUTE_UNUSED) fprintf (stream, "\tdereg_frame,\n"); fprintf (stream, "\t0\n};\n\n"); +# ifdef COLLECT2_MAIN_REFERENCE + fprintf (stream, "%s\n\n", COLLECT2_MAIN_REFERENCE); +# else fprintf (stream, "extern entry_pt %s;\n", NAME__MAIN); fprintf (stream, "entry_pt *__main_reference = %s;\n\n", NAME__MAIN); +# endif } #endif /* ! LD_INIT_SWITCH */ diff --git a/gcc/config.gcc b/gcc/config.gcc index 951902338205..fec67d7b6e40 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -2784,6 +2784,7 @@ nvptx-*) tm_file="${tm_file} newlib-stdint.h" use_gcc_stdint=wrap tmake_file="nvptx/t-nvptx" + use_collect2=yes if test x$enable_as_accelerator = xyes; then extra_programs="${extra_programs} mkoffload\$(exeext)" tm_file="${tm_file} nvptx/offload.h" diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h index dc676dcb5fc5..235c1e4d99d5 100644 --- a/gcc/config/nvptx/nvptx.h +++ b/gcc/config/nvptx/nvptx.h @@ -35,7 +35,39 @@ '../../gcc.cc:asm_options', 'HAVE_GNU_AS'. */ #define ASM_SPEC "%{v}" -#define STARTFILE_SPEC "%{mmainkernel:crt0.o%s}" +#define STARTFILE_SPEC \ + STARTFILE_SPEC_MMAINKERNEL \ + " " STARTFILE_SPEC_CDTOR + +#define ENDFILE_SPEC \ + ENDFILE_SPEC_CDTOR + +#define STARTFILE_SPEC_MMAINKERNEL "%{mmainkernel:crt0.o%s}" + +/* Support for global constructors/destructors is implemented via + 'collect2' and the following helpers. */ + +#define STARTFILE_SPEC_CDTOR "crtbegin.o%s" + +#define ENDFILE_SPEC_CDTOR "crtend.o%s" + +/* nvptx does its own wrapping of 'main' + (see 'libgcc/config/nvptx/crt0.c:__main'). */ +#define HAS_INIT_SECTION + +/* For example with old Nvidia Tesla K20c, Driver Version: 361.93.02, the + function pointers stored in the '__CTOR_LIST__', '__DTOR_LIST__' arrays + evidently evaluate to NULL in JIT compilation. Avoiding the use of + assembler names ('write_list_with_asm') doesn't help, but defining a dummy + function next to the arrays apparently does work around this issue... + + The default '__main_reference' synthesized by 'collect2' refers to our + 'crt0.o' '__main' function with incompatible signature: + + error : Function '__main' not declared __global__ in all source files + + Address both these issues via 'COLLECT2_MAIN_REFERENCE'. */ +#define COLLECT2_MAIN_REFERENCE "__attribute__((unused)) static void dummy () {}" #define TARGET_CPU_CPP_BUILTINS() nvptx_cpu_cpp_builtins () @@ -348,7 +380,6 @@ struct GTY(()) machine_function #define MOVE_MAX 8 #define MOVE_RATIO(SPEED) 4 #define FUNCTION_MODE QImode -#define HAS_INIT_SECTION 1 /* The C++ front end insists to link against libstdc++ -- which we don't build. Tell it to instead link against the innocuous libgcc. */ diff --git a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c index 909f8a684791..5b4101cf596d 100644 --- a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c +++ b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c @@ -18,7 +18,7 @@ int main () return foo (); } -/* { dg-final { scan-tree-dump-times "__gcov0\[._\]main.* = PROF_edge_counter" 1 "optimized"} } */ +/* { dg-final { scan-tree-dump-times "__gcov0\[$._\]main.* = PROF_edge_counter" 1 "optimized"} } */ /* { dg-final { scan-tree-dump-times "__gcov_indirect_call_profiler_v" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_time_profiler_counter = " 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_init" 1 "optimized" } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index ea06e21c3a14..b1b1c5b36bc2 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -907,8 +907,7 @@ proc check_effective_target_nonlocal_goto {} { # Return 1 if global constructors are supported, 0 otherwise. proc check_effective_target_global_constructor {} { - if { [istarget nvptx-*-*] - || [istarget bpf-*-*] } { + if { [istarget bpf-*-*] } { return 0 } return 1 diff --git a/libgcc/config.host b/libgcc/config.host index eb23abe89f5e..25072f41860c 100644 --- a/libgcc/config.host +++ b/libgcc/config.host @@ -1499,7 +1499,7 @@ m32c-*-elf*) ;; nvptx-*) tmake_file="$tmake_file nvptx/t-nvptx" - extra_parts="crt0.o" + extra_parts="crt0.o crtbegin.o crtend.o" ;; *) echo "*** Configuration ${host} not supported" 1>&2 diff --git a/libgcc/config/nvptx/crt0.c b/libgcc/config/nvptx/crt0.c index abf047327ae7..860e2bfacadd 100644 --- a/libgcc/config/nvptx/crt0.c +++ b/libgcc/config/nvptx/crt0.c @@ -19,6 +19,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see <http://www.gnu.org/licenses/>. */ +#include <stdlib.h> +#include "gbl-ctors.h" + int *__exitval_ptr; extern void __attribute__((noreturn)) exit (int status); @@ -47,5 +50,8 @@ __main (int *rval_ptr, int argc, void **argv) __nvptx_stacks[0] = stack + sizeof stack; __nvptx_uni[0] = 0; + __do_global_ctors (); + atexit (__do_global_dtors); + exit (main (argc, argv)); } diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c new file mode 100644 index 000000000000..0823fc499019 --- /dev/null +++ b/libgcc/config/nvptx/crtstuff.c @@ -0,0 +1,58 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. */ + +#include "gbl-ctors.h" + +/* The common 'crtstuff.c' doesn't quite provide what we need, so we roll our + own. + + There's no technical reason in this configuration here to split the two + functions '__do_global_ctors' and '__do_global_ctors' into two separate + files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we + do so anyway, for symmetry with other configurations. */ + +#ifdef CRT_BEGIN + +void +__do_global_ctors (void) +{ + DO_GLOBAL_CTORS_BODY; +} + +#elif defined(CRT_END) /* ! CRT_BEGIN */ + +void +__do_global_dtors (void) +{ + /* In this configuration here, there's no way that "this routine is run more + than once [...] when exit is called recursively": for nvptx target, the + call to '__do_global_dtors' is registered via 'atexit', which doesn't + re-enter a function already run. + Therefore, we do *not* "arrange to remember where in the list we left off + processing". */ + func_ptr *p; + for (p = __DTOR_LIST__ + 1; *p; ) + (*p++) (); +} + +#else /* ! CRT_BEGIN && ! CRT_END */ +#error "One of CRT_BEGIN or CRT_END must be defined." +#endif diff --git a/libgcc/config/nvptx/t-nvptx b/libgcc/config/nvptx/t-nvptx index ede0bf0f87dd..9a0454c3a4d0 100644 --- a/libgcc/config/nvptx/t-nvptx +++ b/libgcc/config/nvptx/t-nvptx @@ -3,7 +3,7 @@ LIB2ADD=$(srcdir)/config/nvptx/reduction.c \ $(srcdir)/config/nvptx/atomic.c LIB2ADDEH= -LIB2FUNCS_EXCLUDE=__main +LIB2FUNCS_EXCLUDE= crt0.o: $(srcdir)/config/nvptx/crt0.c $(crt_compile) -c $< @@ -12,3 +12,16 @@ crt0.o: $(srcdir)/config/nvptx/crt0.c # support it, and it may cause the build to fail, because of alloca usage, for # example. INHIBIT_LIBC_CFLAGS = -Dinhibit_libc + +# Support for global constructors/destructors is implemented via +# 'collect2' and the following helpers. + +LIB2FUNCS_EXCLUDE += __main + +CUSTOM_CRTSTUFF = yes + +crtbegin.o: $(srcdir)/config/nvptx/crtstuff.c + $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN + +crtend.o: $(srcdir)/config/nvptx/crtstuff.c + $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END -- 2.35.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Make 'libgcc/config/nvptx/crt0.c' build '--without-headers' (was: [PING] nvptx: Support global constructors/destructors via 'collect2') 2022-12-20 8:03 ` [PING] " Thomas Schwinge 2023-01-11 11:48 ` [PING^2] " Thomas Schwinge @ 2023-01-24 9:01 ` Thomas Schwinge 1 sibling, 0 replies; 10+ messages in thread From: Thomas Schwinge @ 2023-01-24 9:01 UTC (permalink / raw) To: gcc-patches; +Cc: Tom de Vries [-- Attachment #1: Type: text/plain, Size: 1408 bytes --] Hi! On 2022-12-20T09:03:51+0100, I wrote: > Minor change in the attached > "nvptx: Support global constructors/destructors via 'collect2'": for > 'atexit', add '#include <stdlib.h>' to 'libgcc/config/nvptx/crt0.c'. Turns out, it's not that easy. ;-) Pushed to devel/omp/gcc-12 branch commit d90a8a5685c8bd3657892feac01739fe87a457a5 "Make 'libgcc/config/nvptx/crt0.c' build '--without-headers'", see attached. Please consider that one 'fixup'ed into the GCC master branch submission. Grüße Thomas > --- a/libgcc/config/nvptx/crt0.c > +++ b/libgcc/config/nvptx/crt0.c > @@ -19,6 +19,9 @@ > see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > <http://www.gnu.org/licenses/>. */ > > +#include <stdlib.h> > +#include "gbl-ctors.h" > + > int *__exitval_ptr; > > extern void __attribute__((noreturn)) exit (int status); > @@ -47,5 +50,8 @@ __main (int *rval_ptr, int argc, void **argv) > __nvptx_stacks[0] = stack + sizeof stack; > __nvptx_uni[0] = 0; > > + __do_global_ctors (); > + atexit (__do_global_dtors); > + > exit (main (argc, argv)); > } ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Make-libgcc-config-nvptx-crt0.c-build-without-header.patch --] [-- Type: text/x-diff, Size: 1767 bytes --] From d90a8a5685c8bd3657892feac01739fe87a457a5 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Tue, 24 Jan 2023 09:49:34 +0100 Subject: [PATCH] Make 'libgcc/config/nvptx/crt0.c' build '--without-headers' ..., where it currently fails: [...]/libgcc/config/nvptx/crt0.c:22:10: fatal error: stdlib.h: No such file or directory 22 | #include <stdlib.h> | ^~~~~~~~~~ Fix-up for "nvptx: Support global constructors/destructors via 'collect2'". libgcc/ * config/nvptx/crt0.c [!HAVE_STDLIB_H]: Don't '#include <stdlib.h>'. (atexit): Prototype. --- libgcc/ChangeLog.omp | 5 +++++ libgcc/config/nvptx/crt0.c | 7 ++++++- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/libgcc/ChangeLog.omp b/libgcc/ChangeLog.omp index c46f49bf5b7..cf509a70d61 100644 --- a/libgcc/ChangeLog.omp +++ b/libgcc/ChangeLog.omp @@ -1,3 +1,8 @@ +2023-01-24 Thomas Schwinge <thomas@codesourcery.com> + + * config/nvptx/crt0.c [!HAVE_STDLIB_H]: Don't '#include <stdlib.h>'. + (atexit): Prototype. + 2023-01-20 Thomas Schwinge <thomas@codesourcery.com> Andrew Stubbs <ams@codesourcery.com> diff --git a/libgcc/config/nvptx/crt0.c b/libgcc/config/nvptx/crt0.c index 860e2bfacad..02648bef84b 100644 --- a/libgcc/config/nvptx/crt0.c +++ b/libgcc/config/nvptx/crt0.c @@ -19,11 +19,16 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see <http://www.gnu.org/licenses/>. */ -#include <stdlib.h> +#include "auto-target.h" + +#ifdef HAVE_STDLIB_H +# include <stdlib.h> +#endif #include "gbl-ctors.h" int *__exitval_ptr; +extern int atexit (void (*function) (void)); extern void __attribute__((noreturn)) exit (int status); extern int main (int, void **); -- 2.25.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* nvptx: Support global constructors/destructors via 'collect2' for offloading (was: nvptx: Support global constructors/destructors via 'collect2') 2022-12-02 13:35 ` nvptx: Support global constructors/destructors via 'collect2' Thomas Schwinge 2022-12-20 8:03 ` [PING] " Thomas Schwinge @ 2022-12-23 13:35 ` Thomas Schwinge 2022-12-23 13:37 ` Thomas Schwinge 2023-01-20 20:46 ` [og12] " Thomas Schwinge 2023-01-20 20:41 ` [og12] nvptx: Support global constructors/destructors via 'collect2' Thomas Schwinge 2 siblings, 2 replies; 10+ messages in thread From: Thomas Schwinge @ 2022-12-23 13:35 UTC (permalink / raw) To: gcc-patches, Tom de Vries Hi! On 2022-12-02T14:35:35+0100, I wrote: > On 2022-12-01T22:13:38+0100, I wrote: >> I'm working on support for global constructors/destructors with >> GCC/nvptx > > See "nvptx: Support global constructors/destructors via 'collect2'" > [posted before] Building on that, attached is now the additional "for offloading" piece: "nvptx: Support global constructors/destructors via 'collect2' for offloading". OK to push? I did manually test this (by putting a few constructors/destructors into 'libgomp/config/nvptx/oacc-parallel.c', and observing them be executed), and also in my WIP development tree with standard libgfortran constructors (with 'LIBGFOR_MINIMAL' disabled). Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 ^ permalink raw reply [flat|nested] 10+ messages in thread
* nvptx: Support global constructors/destructors via 'collect2' for offloading (was: nvptx: Support global constructors/destructors via 'collect2') 2022-12-23 13:35 ` nvptx: Support global constructors/destructors via 'collect2' for offloading (was: " Thomas Schwinge @ 2022-12-23 13:37 ` Thomas Schwinge 2023-01-11 11:49 ` [PING] " Thomas Schwinge 2023-01-20 20:46 ` [og12] " Thomas Schwinge 1 sibling, 1 reply; 10+ messages in thread From: Thomas Schwinge @ 2022-12-23 13:37 UTC (permalink / raw) To: gcc-patches, Tom de Vries [-- Attachment #1: Type: text/plain, Size: 1076 bytes --] Hi! On 2022-12-23T14:35:16+0100, I wrote: > On 2022-12-02T14:35:35+0100, I wrote: >> On 2022-12-01T22:13:38+0100, I wrote: >>> I'm working on support for global constructors/destructors with >>> GCC/nvptx >> >> See "nvptx: Support global constructors/destructors via 'collect2'" >> [posted before] > > Building on that, attached is now the additional "for offloading" piece: > "nvptx: Support global constructors/destructors via 'collect2' for offloading". > OK to push? Now really attached. > I did manually test this (by putting a few constructors/destructors into > 'libgomp/config/nvptx/oacc-parallel.c', and observing them be executed), > and also in my WIP development tree with standard libgfortran > constructors (with 'LIBGFOR_MINIMAL' disabled). Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-nvptx-Support-global-constructors-destructors-via-co.patch --] [-- Type: text/x-diff, Size: 8131 bytes --] From fb67006eeca0c8e2bfdf86576ed3109dacaf6868 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Wed, 30 Nov 2022 22:09:35 +0100 Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2' for offloading This extends "nvptx: Support global constructors/destructors via 'collect2'" for offloading. libgcc/ * config/nvptx/crtstuff.c ["mgomp"] (__do_global_ctors__entry__mgomp) (__do_global_dtors__entry__mgomp): New. [!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry): New. libgomp/ * plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New. (nvptx_close_device, GOMP_OFFLOAD_load_image) (GOMP_OFFLOAD_unload_image): Call it. --- libgcc/config/nvptx/crtstuff.c | 64 ++++++++++++++++++- libgomp/plugin/plugin-nvptx.c | 113 ++++++++++++++++++++++++++++++++- 2 files changed, 175 insertions(+), 2 deletions(-) diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c index 0823fc49901..8dc80687e0a 100644 --- a/libgcc/config/nvptx/crtstuff.c +++ b/libgcc/config/nvptx/crtstuff.c @@ -29,6 +29,14 @@ files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we do so anyway, for symmetry with other configurations. */ + +/* See 'crt0.c', 'mgomp.c'. */ +#if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) +extern void *__nvptx_stacks[32] __attribute__((shared,nocommon)); +extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon)); +#endif + + #ifdef CRT_BEGIN void @@ -37,6 +45,33 @@ __do_global_ctors (void) DO_GLOBAL_CTORS_BODY; } +/* Need '.entry' wrapper for offloading. */ + +# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) + +__attribute__((kernel)) void __do_global_ctors__entry__mgomp (void *); + +void +__do_global_ctors__entry__mgomp (void *nvptx_stacks_0) +{ + __nvptx_stacks[0] = nvptx_stacks_0; + __nvptx_uni[0] = 0; + + __do_global_ctors (); +} + +# else + +__attribute__((kernel)) void __do_global_ctors__entry (void); + +void +__do_global_ctors__entry (void) +{ + __do_global_ctors (); +} + +# endif + #elif defined(CRT_END) /* ! CRT_BEGIN */ void @@ -45,7 +80,7 @@ __do_global_dtors (void) /* In this configuration here, there's no way that "this routine is run more than once [...] when exit is called recursively": for nvptx target, the call to '__do_global_dtors' is registered via 'atexit', which doesn't - re-enter a function already run. + re-enter a function already run, and neither does nvptx offload target. Therefore, we do *not* "arrange to remember where in the list we left off processing". */ func_ptr *p; @@ -53,6 +88,33 @@ __do_global_dtors (void) (*p++) (); } +/* Need '.entry' wrapper for offloading. */ + +# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) + +__attribute__((kernel)) void __do_global_dtors__entry__mgomp (void *); + +void +__do_global_dtors__entry__mgomp (void *nvptx_stacks_0) +{ + __nvptx_stacks[0] = nvptx_stacks_0; + __nvptx_uni[0] = 0; + + __do_global_dtors (); +} + +# else + +__attribute__((kernel)) void __do_global_dtors__entry (void); + +void +__do_global_dtors__entry (void) +{ + __do_global_dtors (); +} + +# endif + #else /* ! CRT_BEGIN && ! CRT_END */ #error "One of CRT_BEGIN or CRT_END must be defined." #endif diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index fcc97c6e0d5..395639537e8 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -338,6 +338,11 @@ struct ptx_device static struct ptx_device **ptx_devices; +static bool nvptx_do_global_cdtors (CUmodule, struct ptx_device *, + const char *); +static size_t nvptx_stacks_size (); +static void *nvptx_stacks_acquire (struct ptx_device *, size_t, int); + static inline struct nvptx_thread * nvptx_thread (void) { @@ -557,6 +562,17 @@ nvptx_close_device (struct ptx_device *ptx_dev) if (!ptx_dev) return true; + bool ret = true; + + for (struct ptx_image_data *image = ptx_dev->images; + image != NULL; + image = image->next) + { + if (!nvptx_do_global_cdtors (image->module, ptx_dev, + "__do_global_dtors__entry")) + ret = false; + } + for (struct ptx_free_block *b = ptx_dev->free_blocks; b;) { struct ptx_free_block *b_next = b->next; @@ -577,7 +593,8 @@ nvptx_close_device (struct ptx_device *ptx_dev) CUDA_CALL (cuCtxDestroy, ptx_dev->ctx); free (ptx_dev); - return true; + + return ret; } static int @@ -1280,6 +1297,93 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev) GOMP_PLUGIN_fatal ("cuMemcpyHtoD error: %s", cuda_error (r)); } +/* Invoke MODULE's global constructors/destructors. */ + +static bool +nvptx_do_global_cdtors (CUmodule module, struct ptx_device *ptx_dev, + const char *funcname) +{ + bool ret = true; + char *funcname_mgomp = NULL; + CUresult r; + CUfunction funcptr; + r = CUDA_CALL_NOCHECK (cuModuleGetFunction, + &funcptr, module, funcname); + GOMP_PLUGIN_debug (0, "cuModuleGetFunction (%s): %s\n", + funcname, cuda_error (r)); + if (r == CUDA_ERROR_NOT_FOUND) + { + /* Try '[funcname]__mgomp'. */ + + size_t funcname_len = strlen (funcname); + const char *mgomp_suffix = "__mgomp"; + size_t mgomp_suffix_len = strlen (mgomp_suffix); + funcname_mgomp + = GOMP_PLUGIN_malloc (funcname_len + mgomp_suffix_len + 1); + memcpy (funcname_mgomp, funcname, funcname_len); + memcpy (funcname_mgomp + funcname_len, + mgomp_suffix, mgomp_suffix_len + 1); + funcname = funcname_mgomp; + + r = CUDA_CALL_NOCHECK (cuModuleGetFunction, + &funcptr, module, funcname); + GOMP_PLUGIN_debug (0, "cuModuleGetFunction (%s): %s\n", + funcname, cuda_error (r)); + } + if (r == CUDA_ERROR_NOT_FOUND) + ; + else if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuModuleGetFunction (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + else + { + /* If necessary, set up soft stack. */ + void *nvptx_stacks_0; + void *kargs[1]; + if (funcname_mgomp) + { + size_t stack_size = nvptx_stacks_size (); + pthread_mutex_lock (&ptx_dev->omp_stacks.lock); + nvptx_stacks_0 = nvptx_stacks_acquire (ptx_dev, stack_size, 1); + nvptx_stacks_0 += stack_size; + kargs[0] = &nvptx_stacks_0; + } + r = CUDA_CALL_NOCHECK (cuLaunchKernel, + funcptr, + 1, 1, 1, 1, 1, 1, + /* sharedMemBytes */ 0, + /* hStream */ NULL, + /* kernelParams */ funcname_mgomp ? kargs : NULL, + /* extra */ NULL); + if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuLaunchKernel (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + + r = CUDA_CALL_NOCHECK (cuStreamSynchronize, + NULL); + if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuStreamSynchronize (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + + if (funcname_mgomp) + pthread_mutex_unlock (&ptx_dev->omp_stacks.lock); + } + + if (funcname_mgomp) + free (funcname_mgomp); + + return ret; +} + /* Load the (partial) program described by TARGET_DATA to device number ORD. Allocate and return TARGET_TABLE. If not NULL, REV_FN_TABLE will contain the on-device addresses of the functions for reverse offload. @@ -1452,6 +1556,9 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data, nvptx_set_clocktick (module, dev); + if (!nvptx_do_global_cdtors (module, dev, "__do_global_ctors__entry")) + return -1; + return fn_entries + var_entries + other_entries; } @@ -1477,6 +1584,10 @@ GOMP_OFFLOAD_unload_image (int ord, unsigned version, const void *target_data) for (prev_p = &dev->images; (image = *prev_p) != 0; prev_p = &image->next) if (image->target_data == target_data) { + if (!nvptx_do_global_cdtors (image->module, dev, + "__do_global_dtors__entry")) + ret = false; + *prev_p = image->next; if (CUDA_CALL_NOCHECK (cuModuleUnload, image->module) != CUDA_SUCCESS) ret = false; -- 2.25.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PING] nvptx: Support global constructors/destructors via 'collect2' for offloading (was: nvptx: Support global constructors/destructors via 'collect2') 2022-12-23 13:37 ` Thomas Schwinge @ 2023-01-11 11:49 ` Thomas Schwinge 0 siblings, 0 replies; 10+ messages in thread From: Thomas Schwinge @ 2023-01-11 11:49 UTC (permalink / raw) To: gcc-patches, Tom de Vries [-- Attachment #1: Type: text/plain, Size: 1185 bytes --] Hi! Ping. Grüße Thomas On 2022-12-23T14:37:47+0100, I wrote: > Hi! > > On 2022-12-23T14:35:16+0100, I wrote: >> On 2022-12-02T14:35:35+0100, I wrote: >>> On 2022-12-01T22:13:38+0100, I wrote: >>>> I'm working on support for global constructors/destructors with >>>> GCC/nvptx >>> >>> See "nvptx: Support global constructors/destructors via 'collect2'" >>> [posted before] >> >> Building on that, attached is now the additional "for offloading" piece: >> "nvptx: Support global constructors/destructors via 'collect2' for offloading". >> OK to push? > > Now really attached. > >> I did manually test this (by putting a few constructors/destructors into >> 'libgomp/config/nvptx/oacc-parallel.c', and observing them be executed), >> and also in my WIP development tree with standard libgfortran >> constructors (with 'LIBGFOR_MINIMAL' disabled). > > > Grüße > Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-nvptx-Support-global-constructors-destructors-via-co.patch --] [-- Type: text/x-diff, Size: 8131 bytes --] From fb67006eeca0c8e2bfdf86576ed3109dacaf6868 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Wed, 30 Nov 2022 22:09:35 +0100 Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2' for offloading This extends "nvptx: Support global constructors/destructors via 'collect2'" for offloading. libgcc/ * config/nvptx/crtstuff.c ["mgomp"] (__do_global_ctors__entry__mgomp) (__do_global_dtors__entry__mgomp): New. [!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry): New. libgomp/ * plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New. (nvptx_close_device, GOMP_OFFLOAD_load_image) (GOMP_OFFLOAD_unload_image): Call it. --- libgcc/config/nvptx/crtstuff.c | 64 ++++++++++++++++++- libgomp/plugin/plugin-nvptx.c | 113 ++++++++++++++++++++++++++++++++- 2 files changed, 175 insertions(+), 2 deletions(-) diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c index 0823fc49901..8dc80687e0a 100644 --- a/libgcc/config/nvptx/crtstuff.c +++ b/libgcc/config/nvptx/crtstuff.c @@ -29,6 +29,14 @@ files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we do so anyway, for symmetry with other configurations. */ + +/* See 'crt0.c', 'mgomp.c'. */ +#if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) +extern void *__nvptx_stacks[32] __attribute__((shared,nocommon)); +extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon)); +#endif + + #ifdef CRT_BEGIN void @@ -37,6 +45,33 @@ __do_global_ctors (void) DO_GLOBAL_CTORS_BODY; } +/* Need '.entry' wrapper for offloading. */ + +# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) + +__attribute__((kernel)) void __do_global_ctors__entry__mgomp (void *); + +void +__do_global_ctors__entry__mgomp (void *nvptx_stacks_0) +{ + __nvptx_stacks[0] = nvptx_stacks_0; + __nvptx_uni[0] = 0; + + __do_global_ctors (); +} + +# else + +__attribute__((kernel)) void __do_global_ctors__entry (void); + +void +__do_global_ctors__entry (void) +{ + __do_global_ctors (); +} + +# endif + #elif defined(CRT_END) /* ! CRT_BEGIN */ void @@ -45,7 +80,7 @@ __do_global_dtors (void) /* In this configuration here, there's no way that "this routine is run more than once [...] when exit is called recursively": for nvptx target, the call to '__do_global_dtors' is registered via 'atexit', which doesn't - re-enter a function already run. + re-enter a function already run, and neither does nvptx offload target. Therefore, we do *not* "arrange to remember where in the list we left off processing". */ func_ptr *p; @@ -53,6 +88,33 @@ __do_global_dtors (void) (*p++) (); } +/* Need '.entry' wrapper for offloading. */ + +# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) + +__attribute__((kernel)) void __do_global_dtors__entry__mgomp (void *); + +void +__do_global_dtors__entry__mgomp (void *nvptx_stacks_0) +{ + __nvptx_stacks[0] = nvptx_stacks_0; + __nvptx_uni[0] = 0; + + __do_global_dtors (); +} + +# else + +__attribute__((kernel)) void __do_global_dtors__entry (void); + +void +__do_global_dtors__entry (void) +{ + __do_global_dtors (); +} + +# endif + #else /* ! CRT_BEGIN && ! CRT_END */ #error "One of CRT_BEGIN or CRT_END must be defined." #endif diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index fcc97c6e0d5..395639537e8 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -338,6 +338,11 @@ struct ptx_device static struct ptx_device **ptx_devices; +static bool nvptx_do_global_cdtors (CUmodule, struct ptx_device *, + const char *); +static size_t nvptx_stacks_size (); +static void *nvptx_stacks_acquire (struct ptx_device *, size_t, int); + static inline struct nvptx_thread * nvptx_thread (void) { @@ -557,6 +562,17 @@ nvptx_close_device (struct ptx_device *ptx_dev) if (!ptx_dev) return true; + bool ret = true; + + for (struct ptx_image_data *image = ptx_dev->images; + image != NULL; + image = image->next) + { + if (!nvptx_do_global_cdtors (image->module, ptx_dev, + "__do_global_dtors__entry")) + ret = false; + } + for (struct ptx_free_block *b = ptx_dev->free_blocks; b;) { struct ptx_free_block *b_next = b->next; @@ -577,7 +593,8 @@ nvptx_close_device (struct ptx_device *ptx_dev) CUDA_CALL (cuCtxDestroy, ptx_dev->ctx); free (ptx_dev); - return true; + + return ret; } static int @@ -1280,6 +1297,93 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev) GOMP_PLUGIN_fatal ("cuMemcpyHtoD error: %s", cuda_error (r)); } +/* Invoke MODULE's global constructors/destructors. */ + +static bool +nvptx_do_global_cdtors (CUmodule module, struct ptx_device *ptx_dev, + const char *funcname) +{ + bool ret = true; + char *funcname_mgomp = NULL; + CUresult r; + CUfunction funcptr; + r = CUDA_CALL_NOCHECK (cuModuleGetFunction, + &funcptr, module, funcname); + GOMP_PLUGIN_debug (0, "cuModuleGetFunction (%s): %s\n", + funcname, cuda_error (r)); + if (r == CUDA_ERROR_NOT_FOUND) + { + /* Try '[funcname]__mgomp'. */ + + size_t funcname_len = strlen (funcname); + const char *mgomp_suffix = "__mgomp"; + size_t mgomp_suffix_len = strlen (mgomp_suffix); + funcname_mgomp + = GOMP_PLUGIN_malloc (funcname_len + mgomp_suffix_len + 1); + memcpy (funcname_mgomp, funcname, funcname_len); + memcpy (funcname_mgomp + funcname_len, + mgomp_suffix, mgomp_suffix_len + 1); + funcname = funcname_mgomp; + + r = CUDA_CALL_NOCHECK (cuModuleGetFunction, + &funcptr, module, funcname); + GOMP_PLUGIN_debug (0, "cuModuleGetFunction (%s): %s\n", + funcname, cuda_error (r)); + } + if (r == CUDA_ERROR_NOT_FOUND) + ; + else if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuModuleGetFunction (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + else + { + /* If necessary, set up soft stack. */ + void *nvptx_stacks_0; + void *kargs[1]; + if (funcname_mgomp) + { + size_t stack_size = nvptx_stacks_size (); + pthread_mutex_lock (&ptx_dev->omp_stacks.lock); + nvptx_stacks_0 = nvptx_stacks_acquire (ptx_dev, stack_size, 1); + nvptx_stacks_0 += stack_size; + kargs[0] = &nvptx_stacks_0; + } + r = CUDA_CALL_NOCHECK (cuLaunchKernel, + funcptr, + 1, 1, 1, 1, 1, 1, + /* sharedMemBytes */ 0, + /* hStream */ NULL, + /* kernelParams */ funcname_mgomp ? kargs : NULL, + /* extra */ NULL); + if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuLaunchKernel (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + + r = CUDA_CALL_NOCHECK (cuStreamSynchronize, + NULL); + if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuStreamSynchronize (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + + if (funcname_mgomp) + pthread_mutex_unlock (&ptx_dev->omp_stacks.lock); + } + + if (funcname_mgomp) + free (funcname_mgomp); + + return ret; +} + /* Load the (partial) program described by TARGET_DATA to device number ORD. Allocate and return TARGET_TABLE. If not NULL, REV_FN_TABLE will contain the on-device addresses of the functions for reverse offload. @@ -1452,6 +1556,9 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data, nvptx_set_clocktick (module, dev); + if (!nvptx_do_global_cdtors (module, dev, "__do_global_ctors__entry")) + return -1; + return fn_entries + var_entries + other_entries; } @@ -1477,6 +1584,10 @@ GOMP_OFFLOAD_unload_image (int ord, unsigned version, const void *target_data) for (prev_p = &dev->images; (image = *prev_p) != 0; prev_p = &image->next) if (image->target_data == target_data) { + if (!nvptx_do_global_cdtors (image->module, dev, + "__do_global_dtors__entry")) + ret = false; + *prev_p = image->next; if (CUDA_CALL_NOCHECK (cuModuleUnload, image->module) != CUDA_SUCCESS) ret = false; -- 2.25.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [og12] nvptx: Support global constructors/destructors via 'collect2' for offloading (was: nvptx: Support global constructors/destructors via 'collect2') 2022-12-23 13:35 ` nvptx: Support global constructors/destructors via 'collect2' for offloading (was: " Thomas Schwinge 2022-12-23 13:37 ` Thomas Schwinge @ 2023-01-20 20:46 ` Thomas Schwinge 1 sibling, 0 replies; 10+ messages in thread From: Thomas Schwinge @ 2023-01-20 20:46 UTC (permalink / raw) To: gcc-patches; +Cc: Tom de Vries [-- Attachment #1: Type: text/plain, Size: 1245 bytes --] Hi! On 2022-12-23T14:35:16+0100, I wrote: > On 2022-12-02T14:35:35+0100, I wrote: >> On 2022-12-01T22:13:38+0100, I wrote: >>> I'm working on support for global constructors/destructors with >>> GCC/nvptx >> >> See "nvptx: Support global constructors/destructors via 'collect2'" >> [posted before] > > Building on that, attached is now the additional "for offloading" piece: > "nvptx: Support global constructors/destructors via 'collect2' for offloading". > OK to push? For now pushed to devel/omp/gcc-12 branch in commit 689a5340c7e4286b451f1bc600342550c7c94da2 "nvptx: Support global constructors/destructors via 'collect2' for offloading", see attached. > I did manually test this (by putting a few constructors/destructors into > 'libgomp/config/nvptx/oacc-parallel.c', and observing them be executed), > and also in my WIP development tree with standard libgfortran > constructors (with 'LIBGFOR_MINIMAL' disabled). Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-nvptx-Support-global-constructors-destructors-via-co.patch --] [-- Type: text/x-diff, Size: 9283 bytes --] From 689a5340c7e4286b451f1bc600342550c7c94da2 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Wed, 30 Nov 2022 22:09:35 +0100 Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2' for offloading This extends "nvptx: Support global constructors/destructors via 'collect2'" for offloading. libgcc/ * config/nvptx/crtstuff.c ["mgomp"] (__do_global_ctors__entry__mgomp) (__do_global_dtors__entry__mgomp): New. [!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry): New. libgomp/ * plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New. (nvptx_close_device, GOMP_OFFLOAD_load_image) (GOMP_OFFLOAD_unload_image): Call it. --- libgcc/ChangeLog.omp | 6 ++ libgcc/config/nvptx/crtstuff.c | 64 ++++++++++++++++++- libgomp/ChangeLog.omp | 4 ++ libgomp/plugin/plugin-nvptx.c | 113 ++++++++++++++++++++++++++++++++- 4 files changed, 185 insertions(+), 2 deletions(-) diff --git a/libgcc/ChangeLog.omp b/libgcc/ChangeLog.omp index 68a99cbe427..2e7bf5cc029 100644 --- a/libgcc/ChangeLog.omp +++ b/libgcc/ChangeLog.omp @@ -1,5 +1,11 @@ 2023-01-20 Thomas Schwinge <thomas@codesourcery.com> + * config/nvptx/crtstuff.c ["mgomp"] + (__do_global_ctors__entry__mgomp) + (__do_global_dtors__entry__mgomp): New. + [!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry): + New. + * config.host <case ${host} in nvptx-*>: Add 'crtbegin.o', 'crtend.o' to 'extra_parts'. * config/nvptx/crt0.c: Invoke '__do_global_ctors', diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c index 0823fc49901..8dc80687e0a 100644 --- a/libgcc/config/nvptx/crtstuff.c +++ b/libgcc/config/nvptx/crtstuff.c @@ -29,6 +29,14 @@ files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we do so anyway, for symmetry with other configurations. */ + +/* See 'crt0.c', 'mgomp.c'. */ +#if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) +extern void *__nvptx_stacks[32] __attribute__((shared,nocommon)); +extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon)); +#endif + + #ifdef CRT_BEGIN void @@ -37,6 +45,33 @@ __do_global_ctors (void) DO_GLOBAL_CTORS_BODY; } +/* Need '.entry' wrapper for offloading. */ + +# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) + +__attribute__((kernel)) void __do_global_ctors__entry__mgomp (void *); + +void +__do_global_ctors__entry__mgomp (void *nvptx_stacks_0) +{ + __nvptx_stacks[0] = nvptx_stacks_0; + __nvptx_uni[0] = 0; + + __do_global_ctors (); +} + +# else + +__attribute__((kernel)) void __do_global_ctors__entry (void); + +void +__do_global_ctors__entry (void) +{ + __do_global_ctors (); +} + +# endif + #elif defined(CRT_END) /* ! CRT_BEGIN */ void @@ -45,7 +80,7 @@ __do_global_dtors (void) /* In this configuration here, there's no way that "this routine is run more than once [...] when exit is called recursively": for nvptx target, the call to '__do_global_dtors' is registered via 'atexit', which doesn't - re-enter a function already run. + re-enter a function already run, and neither does nvptx offload target. Therefore, we do *not* "arrange to remember where in the list we left off processing". */ func_ptr *p; @@ -53,6 +88,33 @@ __do_global_dtors (void) (*p++) (); } +/* Need '.entry' wrapper for offloading. */ + +# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__) + +__attribute__((kernel)) void __do_global_dtors__entry__mgomp (void *); + +void +__do_global_dtors__entry__mgomp (void *nvptx_stacks_0) +{ + __nvptx_stacks[0] = nvptx_stacks_0; + __nvptx_uni[0] = 0; + + __do_global_dtors (); +} + +# else + +__attribute__((kernel)) void __do_global_dtors__entry (void); + +void +__do_global_dtors__entry (void) +{ + __do_global_dtors (); +} + +# endif + #else /* ! CRT_BEGIN && ! CRT_END */ #error "One of CRT_BEGIN or CRT_END must be defined." #endif diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp index 4447b74a2ab..32aa9705296 100644 --- a/libgomp/ChangeLog.omp +++ b/libgomp/ChangeLog.omp @@ -1,5 +1,9 @@ 2023-01-20 Thomas Schwinge <thomas@codesourcery.com> + * plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New. + (nvptx_close_device, GOMP_OFFLOAD_load_image) + (GOMP_OFFLOAD_unload_image): Call it. + * plugin/plugin-nvptx.c (nvptx_exec): Assert what we know about 'blockDimX'. diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index b2fabc61cc8..8e7b63bd637 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -344,6 +344,11 @@ static struct ptx_device **ptx_devices; default is set here. */ static unsigned lowlat_pool_size = 8*1024; +static bool nvptx_do_global_cdtors (CUmodule, struct ptx_device *, + const char *); +static size_t nvptx_stacks_size (); +static void *nvptx_stacks_acquire (struct ptx_device *, size_t, int); + static inline struct nvptx_thread * nvptx_thread (void) { @@ -571,6 +576,17 @@ nvptx_close_device (struct ptx_device *ptx_dev) if (!ptx_dev) return true; + bool ret = true; + + for (struct ptx_image_data *image = ptx_dev->images; + image != NULL; + image = image->next) + { + if (!nvptx_do_global_cdtors (image->module, ptx_dev, + "__do_global_dtors__entry")) + ret = false; + } + for (struct ptx_free_block *b = ptx_dev->free_blocks; b;) { struct ptx_free_block *b_next = b->next; @@ -591,7 +607,8 @@ nvptx_close_device (struct ptx_device *ptx_dev) CUDA_CALL (cuCtxDestroy, ptx_dev->ctx); free (ptx_dev); - return true; + + return ret; } static int @@ -1313,6 +1330,93 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev) GOMP_PLUGIN_fatal ("cuMemcpyHtoD error: %s", cuda_error (r)); } +/* Invoke MODULE's global constructors/destructors. */ + +static bool +nvptx_do_global_cdtors (CUmodule module, struct ptx_device *ptx_dev, + const char *funcname) +{ + bool ret = true; + char *funcname_mgomp = NULL; + CUresult r; + CUfunction funcptr; + r = CUDA_CALL_NOCHECK (cuModuleGetFunction, + &funcptr, module, funcname); + GOMP_PLUGIN_debug (0, "cuModuleGetFunction (%s): %s\n", + funcname, cuda_error (r)); + if (r == CUDA_ERROR_NOT_FOUND) + { + /* Try '[funcname]__mgomp'. */ + + size_t funcname_len = strlen (funcname); + const char *mgomp_suffix = "__mgomp"; + size_t mgomp_suffix_len = strlen (mgomp_suffix); + funcname_mgomp + = GOMP_PLUGIN_malloc (funcname_len + mgomp_suffix_len + 1); + memcpy (funcname_mgomp, funcname, funcname_len); + memcpy (funcname_mgomp + funcname_len, + mgomp_suffix, mgomp_suffix_len + 1); + funcname = funcname_mgomp; + + r = CUDA_CALL_NOCHECK (cuModuleGetFunction, + &funcptr, module, funcname); + GOMP_PLUGIN_debug (0, "cuModuleGetFunction (%s): %s\n", + funcname, cuda_error (r)); + } + if (r == CUDA_ERROR_NOT_FOUND) + ; + else if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuModuleGetFunction (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + else + { + /* If necessary, set up soft stack. */ + void *nvptx_stacks_0; + void *kargs[1]; + if (funcname_mgomp) + { + size_t stack_size = nvptx_stacks_size (); + pthread_mutex_lock (&ptx_dev->omp_stacks.lock); + nvptx_stacks_0 = nvptx_stacks_acquire (ptx_dev, stack_size, 1); + nvptx_stacks_0 += stack_size; + kargs[0] = &nvptx_stacks_0; + } + r = CUDA_CALL_NOCHECK (cuLaunchKernel, + funcptr, + 1, 1, 1, 1, 1, 1, + /* sharedMemBytes */ 0, + /* hStream */ NULL, + /* kernelParams */ funcname_mgomp ? kargs : NULL, + /* extra */ NULL); + if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuLaunchKernel (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + + r = CUDA_CALL_NOCHECK (cuStreamSynchronize, + NULL); + if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("cuStreamSynchronize (%s) error: %s", + funcname, cuda_error (r)); + ret = false; + } + + if (funcname_mgomp) + pthread_mutex_unlock (&ptx_dev->omp_stacks.lock); + } + + if (funcname_mgomp) + free (funcname_mgomp); + + return ret; +} + /* Load the (partial) program described by TARGET_DATA to device number ORD. Allocate and return TARGET_TABLE. If not NULL, REV_FN_TABLE will contain the on-device addresses of the functions for reverse offload. @@ -1485,6 +1589,9 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data, nvptx_set_clocktick (module, dev); + if (!nvptx_do_global_cdtors (module, dev, "__do_global_ctors__entry")) + return -1; + return fn_entries + var_entries + other_entries; } @@ -1510,6 +1617,10 @@ GOMP_OFFLOAD_unload_image (int ord, unsigned version, const void *target_data) for (prev_p = &dev->images; (image = *prev_p) != 0; prev_p = &image->next) if (image->target_data == target_data) { + if (!nvptx_do_global_cdtors (image->module, dev, + "__do_global_dtors__entry")) + ret = false; + *prev_p = image->next; if (CUDA_CALL_NOCHECK (cuModuleUnload, image->module) != CUDA_SUCCESS) ret = false; -- 2.25.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [og12] nvptx: Support global constructors/destructors via 'collect2' 2022-12-02 13:35 ` nvptx: Support global constructors/destructors via 'collect2' Thomas Schwinge 2022-12-20 8:03 ` [PING] " Thomas Schwinge 2022-12-23 13:35 ` nvptx: Support global constructors/destructors via 'collect2' for offloading (was: " Thomas Schwinge @ 2023-01-20 20:41 ` Thomas Schwinge 2023-01-20 20:45 ` Thomas Schwinge 2 siblings, 1 reply; 10+ messages in thread From: Thomas Schwinge @ 2023-01-20 20:41 UTC (permalink / raw) To: gcc-patches; +Cc: Tom de Vries Hi! On 2022-12-02T14:35:35+0100, I wrote: > On 2022-12-01T22:13:38+0100, I wrote: >> I'm working on support for global constructors/destructors with >> GCC/nvptx > > See "nvptx: Support global constructors/destructors via 'collect2'" > attached; OK to push? (... with 'gcc/doc/install.texi' accordingly > updated once <https://github.com/MentorEmbedded/nvptx-tools/pull/40> > "'nm'" and newlib > <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> > "nvptx: Implement '_exit' instead of 'exit'" have been merged; any > comments to those?) For now pushed to devel/omp/gcc-12 branch in commit fe07b0003bb2092bc34d4bed504be1868b88782d "nvptx: Support global constructors/destructors via 'collect2'", see attached. > Per my quick scanning of 'gcc/config.gcc' history, for more than two > decades, there was a clear trend to remove 'use_collect2=yes' > configurations; now finally a new one is being added -- making sure we're > not slowly dispensing with the need for the early 1990s piece of work > that 'gcc/collect2*' is... ;'-P (I still find that "notable" and "funny" in a certain way.) ;-* Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [og12] nvptx: Support global constructors/destructors via 'collect2' 2023-01-20 20:41 ` [og12] nvptx: Support global constructors/destructors via 'collect2' Thomas Schwinge @ 2023-01-20 20:45 ` Thomas Schwinge 0 siblings, 0 replies; 10+ messages in thread From: Thomas Schwinge @ 2023-01-20 20:45 UTC (permalink / raw) To: gcc-patches; +Cc: Tom de Vries [-- Attachment #1: Type: text/plain, Size: 1548 bytes --] Hi! On 2023-01-20T21:41:26+0100, I wrote: > On 2022-12-02T14:35:35+0100, I wrote: >> On 2022-12-01T22:13:38+0100, I wrote: >>> I'm working on support for global constructors/destructors with >>> GCC/nvptx >> >> See "nvptx: Support global constructors/destructors via 'collect2'" >> attached; OK to push? (... with 'gcc/doc/install.texi' accordingly >> updated once <https://github.com/MentorEmbedded/nvptx-tools/pull/40> >> "'nm'" and newlib >> <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> >> "nvptx: Implement '_exit' instead of 'exit'" have been merged; any >> comments to those?) > > For now pushed to devel/omp/gcc-12 branch in > commit fe07b0003bb2092bc34d4bed504be1868b88782d > "nvptx: Support global constructors/destructors via 'collect2'", > see attached. Now really attached. >> Per my quick scanning of 'gcc/config.gcc' history, for more than two >> decades, there was a clear trend to remove 'use_collect2=yes' >> configurations; now finally a new one is being added -- making sure we're >> not slowly dispensing with the need for the early 1990s piece of work >> that 'gcc/collect2*' is... ;'-P > > (I still find that "notable" and "funny" in a certain way.) ;-* Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-nvptx-Support-global-constructors-destructors-via-co.patch --] [-- Type: text/x-diff, Size: 12741 bytes --] From fe07b0003bb2092bc34d4bed504be1868b88782d Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Sun, 13 Nov 2022 14:19:30 +0100 Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2' The function attributes 'constructor', 'destructor', and 'init_priority' now work, as do the C++ features making use of this. Test cases with effective target 'global_constructor' and 'init_priority' now generally work, and 'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented: global constructors not supported on this target". This depends on <https://github.com/MentorEmbedded/nvptx-tools/pull/40> "'nm'" generally, and for global destructors support: newlib <https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/> "nvptx: Implement '_exit' instead of 'exit'". gcc/ * collect2.cc (write_c_file_glob): Allow for 'COLLECT2_MAIN_REFERENCE' override. * config.gcc <case ${target} in nvptx-*>: Set 'use_collect2=yes'. * config/nvptx/nvptx.h: Adjust. gcc/testsuite/ * gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is 'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may apper in identifiers. * lib/target-supports.exp (check_effective_target_global_constructor): Enable for nvptx. libgcc/ * config.host <case ${host} in nvptx-*>: Add 'crtbegin.o', 'crtend.o' to 'extra_parts'. * config/nvptx/crt0.c: Invoke '__do_global_ctors', '__do_global_dtors'. * config/nvptx/crtstuff.c: New. * config/nvptx/t-nvptx: Adjust. --- gcc/ChangeLog.omp | 5 ++ gcc/collect2.cc | 4 ++ gcc/config.gcc | 1 + gcc/config/nvptx/nvptx.h | 35 ++++++++++- gcc/testsuite/ChangeLog.omp | 6 ++ .../no_profile_instrument_function-attr-1.c | 2 +- gcc/testsuite/lib/target-supports.exp | 3 +- libgcc/ChangeLog.omp | 9 +++ libgcc/config.host | 2 +- libgcc/config/nvptx/crt0.c | 6 ++ libgcc/config/nvptx/crtstuff.c | 58 +++++++++++++++++++ libgcc/config/nvptx/t-nvptx | 15 ++++- 12 files changed, 139 insertions(+), 7 deletions(-) create mode 100644 libgcc/config/nvptx/crtstuff.c diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp index 127b450644b..ca00bfb48f9 100644 --- a/gcc/ChangeLog.omp +++ b/gcc/ChangeLog.omp @@ -1,5 +1,10 @@ 2023-01-20 Thomas Schwinge <thomas@codesourcery.com> + * collect2.cc (write_c_file_glob): Allow for + 'COLLECT2_MAIN_REFERENCE' override. + * config.gcc <case ${target} in nvptx-*>: Set 'use_collect2=yes'. + * config/nvptx/nvptx.h: Adjust. + * config/nvptx/nvptx.cc (nvptx_assemble_undefined_decl): Notice '__nvptx_stacks', '__nvptx_uni' declarations. (nvptx_file_end): Don't emit duplicate declarations for those. diff --git a/gcc/collect2.cc b/gcc/collect2.cc index d81c7f28f16..945a9ff86dd 100644 --- a/gcc/collect2.cc +++ b/gcc/collect2.cc @@ -2238,8 +2238,12 @@ write_c_file_glob (FILE *stream, const char *name ATTRIBUTE_UNUSED) fprintf (stream, "\tdereg_frame,\n"); fprintf (stream, "\t0\n};\n\n"); +# ifdef COLLECT2_MAIN_REFERENCE + fprintf (stream, "%s\n\n", COLLECT2_MAIN_REFERENCE); +# else fprintf (stream, "extern entry_pt %s;\n", NAME__MAIN); fprintf (stream, "entry_pt *__main_reference = %s;\n\n", NAME__MAIN); +# endif } #endif /* ! LD_INIT_SWITCH */ diff --git a/gcc/config.gcc b/gcc/config.gcc index e6b9c864b0d..9c9365886cf 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -2835,6 +2835,7 @@ nvptx-*) tm_file="${tm_file} newlib-stdint.h" use_gcc_stdint=wrap tmake_file="nvptx/t-nvptx" + use_collect2=yes if test x$enable_as_accelerator = xyes; then extra_programs="${extra_programs} mkoffload\$(exeext)" tm_file="${tm_file} nvptx/offload.h" diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h index b81f9f42cd3..d815081147e 100644 --- a/gcc/config/nvptx/nvptx.h +++ b/gcc/config/nvptx/nvptx.h @@ -35,7 +35,39 @@ '../../gcc.cc:asm_options', 'HAVE_GNU_AS'. */ #define ASM_SPEC "%{v}" -#define STARTFILE_SPEC "%{mmainkernel:crt0.o%s}" +#define STARTFILE_SPEC \ + STARTFILE_SPEC_MMAINKERNEL \ + " " STARTFILE_SPEC_CDTOR + +#define ENDFILE_SPEC \ + ENDFILE_SPEC_CDTOR + +#define STARTFILE_SPEC_MMAINKERNEL "%{mmainkernel:crt0.o%s}" + +/* Support for global constructors/destructors is implemented via + 'collect2' and the following helpers. */ + +#define STARTFILE_SPEC_CDTOR "crtbegin.o%s" + +#define ENDFILE_SPEC_CDTOR "crtend.o%s" + +/* nvptx does its own wrapping of 'main' + (see 'libgcc/config/nvptx/crt0.c:__main'). */ +#define HAS_INIT_SECTION + +/* For example with old Nvidia Tesla K20c, Driver Version: 361.93.02, the + function pointers stored in the '__CTOR_LIST__', '__DTOR_LIST__' arrays + evidently evaluate to NULL in JIT compilation. Avoiding the use of + assembler names ('write_list_with_asm') doesn't help, but defining a dummy + function next to the arrays apparently does work around this issue... + + The default '__main_reference' synthesized by 'collect2' refers to our + 'crt0.o' '__main' function with incompatible signature: + + error : Function '__main' not declared __global__ in all source files + + Address both these issues via 'COLLECT2_MAIN_REFERENCE'. */ +#define COLLECT2_MAIN_REFERENCE "__attribute__((unused)) static void dummy () {}" #define TARGET_CPU_CPP_BUILTINS() nvptx_cpu_cpp_builtins () @@ -348,7 +380,6 @@ struct GTY(()) machine_function #define MOVE_MAX 8 #define MOVE_RATIO(SPEED) 4 #define FUNCTION_MODE QImode -#define HAS_INIT_SECTION 1 /* The C++ front end insists to link against libstdc++ -- which we don't build. Tell it to instead link against the innocuous libgcc. */ diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp index c942c34dc70..f35568d83a9 100644 --- a/gcc/testsuite/ChangeLog.omp +++ b/gcc/testsuite/ChangeLog.omp @@ -1,5 +1,11 @@ 2023-01-20 Thomas Schwinge <thomas@codesourcery.com> + * gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is + 'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may apper + in identifiers. + * lib/target-supports.exp + (check_effective_target_global_constructor): Enable for nvptx. + * gcc.target/nvptx/softstack-decl-1.c: Make 'dg-do assemble', adjust. * gcc.target/nvptx/uniform-simt-decl-1.c: Likewise. diff --git a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c index 909f8a68479..5b4101cf596 100644 --- a/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c +++ b/gcc/testsuite/gcc.dg/no_profile_instrument_function-attr-1.c @@ -18,7 +18,7 @@ int main () return foo (); } -/* { dg-final { scan-tree-dump-times "__gcov0\[._\]main.* = PROF_edge_counter" 1 "optimized"} } */ +/* { dg-final { scan-tree-dump-times "__gcov0\[$._\]main.* = PROF_edge_counter" 1 "optimized"} } */ /* { dg-final { scan-tree-dump-times "__gcov_indirect_call_profiler_v" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_time_profiler_counter = " 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "__gcov_init" 1 "optimized" } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index e911b298f4c..efd02a61a6b 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -850,8 +850,7 @@ proc check_effective_target_nonlocal_goto {} { # Return 1 if global constructors are supported, 0 otherwise. proc check_effective_target_global_constructor {} { - if { [istarget nvptx-*-*] - || [istarget bpf-*-*] } { + if { [istarget bpf-*-*] } { return 0 } return 1 diff --git a/libgcc/ChangeLog.omp b/libgcc/ChangeLog.omp index f41bbbb339a..68a99cbe427 100644 --- a/libgcc/ChangeLog.omp +++ b/libgcc/ChangeLog.omp @@ -1,3 +1,12 @@ +2023-01-20 Thomas Schwinge <thomas@codesourcery.com> + + * config.host <case ${host} in nvptx-*>: Add 'crtbegin.o', + 'crtend.o' to 'extra_parts'. + * config/nvptx/crt0.c: Invoke '__do_global_ctors', + '__do_global_dtors'. + * config/nvptx/crtstuff.c: New. + * config/nvptx/t-nvptx: Adjust. + 2022-11-07 Kwok Cheung Yeung <kcy@codesourcery.com> * config/gcn/simd-math/amdgcnmach.h (VECTOR_RETURN): Store value of diff --git a/libgcc/config.host b/libgcc/config.host index 072dd26a276..29b5a2bd7cf 100644 --- a/libgcc/config.host +++ b/libgcc/config.host @@ -1535,7 +1535,7 @@ m32c-*-elf*|m32c-*-rtems*) ;; nvptx-*) tmake_file="$tmake_file nvptx/t-nvptx" - extra_parts="crt0.o" + extra_parts="crt0.o crtbegin.o crtend.o" ;; *) echo "*** Configuration ${host} not supported" 1>&2 diff --git a/libgcc/config/nvptx/crt0.c b/libgcc/config/nvptx/crt0.c index abf047327ae..860e2bfacad 100644 --- a/libgcc/config/nvptx/crt0.c +++ b/libgcc/config/nvptx/crt0.c @@ -19,6 +19,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see <http://www.gnu.org/licenses/>. */ +#include <stdlib.h> +#include "gbl-ctors.h" + int *__exitval_ptr; extern void __attribute__((noreturn)) exit (int status); @@ -47,5 +50,8 @@ __main (int *rval_ptr, int argc, void **argv) __nvptx_stacks[0] = stack + sizeof stack; __nvptx_uni[0] = 0; + __do_global_ctors (); + atexit (__do_global_dtors); + exit (main (argc, argv)); } diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c new file mode 100644 index 00000000000..0823fc49901 --- /dev/null +++ b/libgcc/config/nvptx/crtstuff.c @@ -0,0 +1,58 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 3, or (at your option) any + later version. + + This file is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. */ + +#include "gbl-ctors.h" + +/* The common 'crtstuff.c' doesn't quite provide what we need, so we roll our + own. + + There's no technical reason in this configuration here to split the two + functions '__do_global_ctors' and '__do_global_ctors' into two separate + files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we + do so anyway, for symmetry with other configurations. */ + +#ifdef CRT_BEGIN + +void +__do_global_ctors (void) +{ + DO_GLOBAL_CTORS_BODY; +} + +#elif defined(CRT_END) /* ! CRT_BEGIN */ + +void +__do_global_dtors (void) +{ + /* In this configuration here, there's no way that "this routine is run more + than once [...] when exit is called recursively": for nvptx target, the + call to '__do_global_dtors' is registered via 'atexit', which doesn't + re-enter a function already run. + Therefore, we do *not* "arrange to remember where in the list we left off + processing". */ + func_ptr *p; + for (p = __DTOR_LIST__ + 1; *p; ) + (*p++) (); +} + +#else /* ! CRT_BEGIN && ! CRT_END */ +#error "One of CRT_BEGIN or CRT_END must be defined." +#endif diff --git a/libgcc/config/nvptx/t-nvptx b/libgcc/config/nvptx/t-nvptx index ede0bf0f87d..9a0454c3a4d 100644 --- a/libgcc/config/nvptx/t-nvptx +++ b/libgcc/config/nvptx/t-nvptx @@ -3,7 +3,7 @@ LIB2ADD=$(srcdir)/config/nvptx/reduction.c \ $(srcdir)/config/nvptx/atomic.c LIB2ADDEH= -LIB2FUNCS_EXCLUDE=__main +LIB2FUNCS_EXCLUDE= crt0.o: $(srcdir)/config/nvptx/crt0.c $(crt_compile) -c $< @@ -12,3 +12,16 @@ crt0.o: $(srcdir)/config/nvptx/crt0.c # support it, and it may cause the build to fail, because of alloca usage, for # example. INHIBIT_LIBC_CFLAGS = -Dinhibit_libc + +# Support for global constructors/destructors is implemented via +# 'collect2' and the following helpers. + +LIB2FUNCS_EXCLUDE += __main + +CUSTOM_CRTSTUFF = yes + +crtbegin.o: $(srcdir)/config/nvptx/crtstuff.c + $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN + +crtend.o: $(srcdir)/config/nvptx/crtstuff.c + $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END -- 2.25.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-01-24 9:01 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com> 2022-12-02 13:35 ` nvptx: Support global constructors/destructors via 'collect2' Thomas Schwinge 2022-12-20 8:03 ` [PING] " Thomas Schwinge 2023-01-11 11:48 ` [PING^2] " Thomas Schwinge 2023-01-24 9:01 ` Make 'libgcc/config/nvptx/crt0.c' build '--without-headers' (was: [PING] nvptx: Support global constructors/destructors via 'collect2') Thomas Schwinge 2022-12-23 13:35 ` nvptx: Support global constructors/destructors via 'collect2' for offloading (was: " Thomas Schwinge 2022-12-23 13:37 ` Thomas Schwinge 2023-01-11 11:49 ` [PING] " Thomas Schwinge 2023-01-20 20:46 ` [og12] " Thomas Schwinge 2023-01-20 20:41 ` [og12] nvptx: Support global constructors/destructors via 'collect2' Thomas Schwinge 2023-01-20 20:45 ` Thomas Schwinge
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).