* [PATCH] Various fixes for DWARF register size computation @ 2023-01-03 11:15 Florian Weimer 2023-01-03 12:05 ` Jakub Jelinek 0 siblings, 1 reply; 8+ messages in thread From: Florian Weimer @ 2023-01-03 11:15 UTC (permalink / raw) To: gcc-patches; +Cc: Andre Vieira, Andrew Pinski, Jakub Jelinek, Jeff Law The previous code had several issues. 1. XALLOCAVEC does not create any objects, so invocating the non-POD poly_uint16 assignment operator is undefined. 2. The default constructor of poly-ints does not create a zero poly-int object (unlike what happens with regular ints). 3. The register size array must have DWARF_FRAME_REGISTERS + 1 elements. The extra element can be DWARF_FRAME_RETURN_COLUMN or DWARF_ALT_FRAME_RETURN_COLUMN. To fix problem 3, merely increasing the array size is sufficient, but it inhibits the x86-64 register size optimization in libgcc because it does not use the extra register, so it has size zero. To re-enable the optimization, expose the maximum used register to libgcc. This is sufficient for the optimizers to figure out that the memcpy call in uw_install_context_1 has a fixed size argument on x86-64. This restores bootstrap on aarch64-linux-gnu and powerpc64-linux-gnu. Not sure about test suite results yet, I need to check the baseline. gcc/ * debug.h (dwarf_reg_sizes_constant): Remove declaration. (dwarf_single_register_size): New struct. * dwarf2cfi.cc (generate_dwarf_reg_sizes): Initialize extra register size. Use in-place new for initialization. Remove unnecessary memset. (dwarf_reg_sizes_constant): Remove. (dwarf_single_register_size::dwarf_single_register_size): New constructor based on removed dwarf_reg_sizes_constant function. Allocate extra size element. (expand_builtin_init_dwarf_reg_sizes): Allocate extra size element. * target.def (init_dwarf_reg_sizes_extra): Mention extra size element. * doc/tm.texi: Update. gcc/c-family/ * c-cppbuiltin.cc (c_cpp_builtins): Switch to dwarf_single_register_size for obtaining DWARF register sizes. Define __LIBGCC_DWARF_REG_MAXIMUM__. libgcc/ * unwind-dw2.c (dwarf_reg_size): Use __LIBGCC_DWARF_REG_MAXIMUM__. --- gcc/c-family/c-cppbuiltin.cc | 12 ++++++++---- gcc/debug.h | 13 ++++++++++++- gcc/doc/tm.texi | 2 +- gcc/dwarf2cfi.cc | 45 +++++++++++++++++++++++++++----------------- gcc/target.def | 2 +- libgcc/unwind-dw2.c | 7 +++++-- 6 files changed, 55 insertions(+), 26 deletions(-) diff --git a/gcc/c-family/c-cppbuiltin.cc b/gcc/c-family/c-cppbuiltin.cc index ddfd63b8eb9..8098aca41e8 100644 --- a/gcc/c-family/c-cppbuiltin.cc +++ b/gcc/c-family/c-cppbuiltin.cc @@ -1522,10 +1522,14 @@ c_cpp_builtins (cpp_reader *pfile) builtin_define_with_int_value ("__LIBGCC_DWARF_FRAME_REGISTERS__", DWARF_FRAME_REGISTERS); { - int value = dwarf_reg_sizes_constant (); - if (value > 0) - builtin_define_with_int_value ("__LIBGCC_DWARF_REG_SIZES_CONSTANT__", - value); + dwarf_single_register_size srs; + if (srs.common_size > 0) + { + builtin_define_with_int_value ("__LIBGCC_DWARF_REG_SIZES_CONSTANT__", + srs.common_size); + builtin_define_with_int_value ("__LIBGCC_DWARF_REG_MAXIMUM__", + srs.maximum_register); + } } builtin_define_with_int_value ("__LIBGCC_DWARF_CIE_DATA_ALIGNMENT__", DWARF_CIE_DATA_ALIGNMENT); diff --git a/gcc/debug.h b/gcc/debug.h index 4fe9f3570ac..2e843da8b41 100644 --- a/gcc/debug.h +++ b/gcc/debug.h @@ -245,7 +245,18 @@ extern const struct gcc_debug_hooks vmsdbg_debug_hooks; /* Dwarf2 frame information. */ -extern int dwarf_reg_sizes_constant (); +/* Query size information about DWARF registers. */ +struct dwarf_single_register_size +{ + dwarf_single_register_size(); + + /* The common register size, or 0 if the register size varies. */ + unsigned int common_size; + + /* The maximum register number that is actually present. Registers + above the maximum are size zero even if common_size is positive. */ + unsigned int maximum_register; +}; extern void dwarf2out_begin_prologue (unsigned int, unsigned int, const char *); diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index b6d7900f212..eb29cfb95aa 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -9847,7 +9847,7 @@ sizes of those pieces in the table used by the unwinder at runtime. It will be called by @code{generate_dwarf_reg_sizes} after filling in a single size corresponding to each hard register; @var{sizes} is the address of the table. It will contain -@code{DWARF_FRAME_REGISTERS} elements when this hook is called. +@code{DWARF_FRAME_REGISTERS + 1} elements when this hook is called. @end deftypefn @deftypefn {Target Hook} bool TARGET_ASM_TTYPE (rtx @var{sym}) diff --git a/gcc/dwarf2cfi.cc b/gcc/dwarf2cfi.cc index d5a27dc36c5..5bd12e070b3 100644 --- a/gcc/dwarf2cfi.cc +++ b/gcc/dwarf2cfi.cc @@ -291,11 +291,10 @@ init_one_dwarf_reg_size (int regno, machine_mode regmode, static void generate_dwarf_reg_sizes (poly_uint16 *sizes) { - for (unsigned int i = 0; i < DWARF_FRAME_REGISTERS; i++) - sizes[i] = poly_uint16{}; + for (unsigned int i = 0; i <= DWARF_FRAME_REGISTERS; i++) + new (&sizes[i]) poly_uint16(0); init_one_dwarf_reg_state init_state{}; - memset ((char *)&init_state, 0, sizeof (init_state)); for (unsigned int i = 0; i < FIRST_PSEUDO_REGISTER; i++) { @@ -334,27 +333,39 @@ generate_dwarf_reg_sizes (poly_uint16 *sizes) targetm.init_dwarf_reg_sizes_extra (sizes); } -/* Return 0 if the DWARF register sizes are not constant, otherwise - return the size constant. */ - -int -dwarf_reg_sizes_constant () +dwarf_single_register_size::dwarf_single_register_size() { - poly_uint16 *sizes = XALLOCAVEC (poly_uint16, DWARF_FRAME_REGISTERS); + poly_uint16 *sizes = XALLOCAVEC (poly_uint16, DWARF_FRAME_REGISTERS + 1); generate_dwarf_reg_sizes (sizes); - int result; - for (unsigned int i = 0; i < DWARF_FRAME_REGISTERS; i++) + /* Find the last register number actually in use. */ + for (int i = DWARF_FRAME_REGISTERS; i >= 0; --i) + { + unsigned short value; + if (!sizes[i].is_constant (&value) || value != 0) + { + maximum_register = i; + break; + } + } + + /* Check for a common register size among the used registers. */ + for (unsigned int i = 0; i <= maximum_register; ++i) { unsigned short value; if (!sizes[i].is_constant (&value)) - return 0; + { + common_size = 0; + break; + } if (i == 0) - result = value; - else if (result != value) - return 0; + common_size = value; + else if (common_size != value) + { + common_size = 0; + break; + } } - return result; } /* Generate code to initialize the dwarf register size table located @@ -363,7 +374,7 @@ dwarf_reg_sizes_constant () void expand_builtin_init_dwarf_reg_sizes (tree address) { - poly_uint16 *sizes = XALLOCAVEC (poly_uint16, DWARF_FRAME_REGISTERS); + poly_uint16 *sizes = XALLOCAVEC (poly_uint16, DWARF_FRAME_REGISTERS + 1); generate_dwarf_reg_sizes (sizes); scalar_int_mode mode = SCALAR_INT_TYPE_MODE (char_type_node); diff --git a/gcc/target.def b/gcc/target.def index da5dd31d7a4..dfffc534f83 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -4042,7 +4042,7 @@ sizes of those pieces in the table used by the unwinder at runtime.\n\ It will be called by @code{generate_dwarf_reg_sizes} after\n\ filling in a single size corresponding to each hard register;\n\ @var{sizes} is the address of the table. It will contain\n\ -@code{DWARF_FRAME_REGISTERS} elements when this hook is called.", +@code{DWARF_FRAME_REGISTERS + 1} elements when this hook is called.", void, (poly_uint16 *sizes), nullptr) /* Fetch the fixed register(s) which hold condition codes, for diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c index 792338c1e38..7a173487061 100644 --- a/libgcc/unwind-dw2.c +++ b/libgcc/unwind-dw2.c @@ -150,9 +150,12 @@ struct _Unwind_Context #ifdef __LIBGCC_DWARF_REG_SIZES_CONSTANT__ static inline unsigned char -dwarf_reg_size (int index __attribute__ ((__unused__))) +dwarf_reg_size (unsigned int index) { - return __LIBGCC_DWARF_REG_SIZES_CONSTANT__; + if (index <= __LIBGCC_DWARF_REG_MAXIMUM__) + return __LIBGCC_DWARF_REG_SIZES_CONSTANT__; + else + return 0; } #else /* Byte size of every register managed by these routines. */ base-commit: 201c21b0e847679645df1af3dd13459274f41047 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Various fixes for DWARF register size computation 2023-01-03 11:15 [PATCH] Various fixes for DWARF register size computation Florian Weimer @ 2023-01-03 12:05 ` Jakub Jelinek 2023-01-03 13:25 ` Florian Weimer 0 siblings, 1 reply; 8+ messages in thread From: Jakub Jelinek @ 2023-01-03 12:05 UTC (permalink / raw) To: Florian Weimer; +Cc: gcc-patches, Andre Vieira, Andrew Pinski, Jeff Law On Tue, Jan 03, 2023 at 12:15:23PM +0100, Florian Weimer wrote: > --- a/gcc/debug.h > +++ b/gcc/debug.h > @@ -245,7 +245,18 @@ extern const struct gcc_debug_hooks vmsdbg_debug_hooks; > > /* Dwarf2 frame information. */ > > -extern int dwarf_reg_sizes_constant (); > +/* Query size information about DWARF registers. */ > +struct dwarf_single_register_size > +{ > + dwarf_single_register_size(); Formatting, space before ( > @@ -334,27 +333,39 @@ generate_dwarf_reg_sizes (poly_uint16 *sizes) > targetm.init_dwarf_reg_sizes_extra (sizes); > } > > -/* Return 0 if the DWARF register sizes are not constant, otherwise > - return the size constant. */ > - > -int > -dwarf_reg_sizes_constant () > +dwarf_single_register_size::dwarf_single_register_size() Likewise. > + for (int i = DWARF_FRAME_REGISTERS; i >= 0; --i) > + { > + unsigned short value; > + if (!sizes[i].is_constant (&value) || value != 0) if (!known_eq (sizes[i], 0)) ? Though, I still wonder, because all of this is a hack for a single target - x86_64-linux -m64 - I think no other target has similar constant sizes, whether it wouldn't be better to revert all this compiler side stuff and handle it purely on the libgcc side - allow target headers to specify a simple expression how to compute dwarf_reg_size + don't define dwarf_reg_size_table array in that case and instead in a testcase verify that __builtin_init_dwarf_reg_size_table initializes an array to the exact same values as the libgcc/config/**/*.h overridden dwarf_reg_size version. That way, for x86_64-linux we can use ((index) <= __LIBGCC_DWARF_FRAME_REGISTERS__ ? 8 : 0) but could provide something reasonable even for other targets if it improves the unwinder sufficiently. Say s390x-linux -m64 is ((index) <= 32 ? 8 : (index) == 33 ? 4 : 0) etc. Or, if you want to do it on the compiler side, instead of predefining __LIBGCC_DWARF_REG_SIZES_CONSTANT__ and __LIBGCC_DWARF_REG_MAXIMUM__ register conditionally a new builtin, __builtin_dwarf_reg_size, which would be defined only if -fbuilding-libgcc and the compiler determines dwarf_reg_size is desirable to be computed inline without a table and would fold the builtin to say that index <= 16U ? 8 : 0 on x86_64 -m64, index <= 9U ? 4 : index - 11U <= 5U ? 12 : 0 on x86_64 -m32 etc. and if the expression is too large/complex, wouldn't predefine the builtin. Then you can #if __has_builtin(__builtin_dwarf_reg_size) use the builtin and don't provide the table + initialize it, otherwise initialize + use the table. Or, is it actually the use of table that is bad on the unwinder side, or lack of a small upper bound for what you get from the table? In that case you could predefine upper bound on the sizes instead (if constant) and simply add if (size > __LIBGCC_DWARF_REG_SIZE_MAX__) __builtin_unreachable ()). Jakub ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Various fixes for DWARF register size computation 2023-01-03 12:05 ` Jakub Jelinek @ 2023-01-03 13:25 ` Florian Weimer 2023-01-03 13:54 ` Jakub Jelinek 0 siblings, 1 reply; 8+ messages in thread From: Florian Weimer @ 2023-01-03 13:25 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches, Andre Vieira, Andrew Pinski, Jeff Law * Jakub Jelinek: > On Tue, Jan 03, 2023 at 12:15:23PM +0100, Florian Weimer wrote: >> --- a/gcc/debug.h >> +++ b/gcc/debug.h >> @@ -245,7 +245,18 @@ extern const struct gcc_debug_hooks vmsdbg_debug_hooks; >> >> /* Dwarf2 frame information. */ >> >> -extern int dwarf_reg_sizes_constant (); >> +/* Query size information about DWARF registers. */ >> +struct dwarf_single_register_size >> +{ >> + dwarf_single_register_size(); > > Formatting, space before ( > >> @@ -334,27 +333,39 @@ generate_dwarf_reg_sizes (poly_uint16 *sizes) >> targetm.init_dwarf_reg_sizes_extra (sizes); >> } >> >> -/* Return 0 if the DWARF register sizes are not constant, otherwise >> - return the size constant. */ >> - >> -int >> -dwarf_reg_sizes_constant () >> +dwarf_single_register_size::dwarf_single_register_size() > > Likewise. > >> + for (int i = DWARF_FRAME_REGISTERS; i >= 0; --i) >> + { >> + unsigned short value; >> + if (!sizes[i].is_constant (&value) || value != 0) > > if (!known_eq (sizes[i], 0)) > ? Right. > Though, I still wonder, because all of this is a hack for a single target > - x86_64-linux -m64 - I think no other target has similar constant > sizes, Really? That's odd. Is it because other architectures track callee-saved vector registers through unwinding? > whether > it wouldn't be better to revert all this compiler side stuff and handle it > purely on the libgcc side - allow target headers to specify a simple > expression how to compute dwarf_reg_size + don't define dwarf_reg_size_table > array in that case and instead in a testcase verify that > __builtin_init_dwarf_reg_size_table initializes an array to the exact same > values as the libgcc/config/**/*.h overridden dwarf_reg_size version. > That way, for x86_64-linux we can use > ((index) <= __LIBGCC_DWARF_FRAME_REGISTERS__ ? 8 : 0) > but could provide something reasonable even for other targets if it improves > the unwinder sufficiently. > Say s390x-linux -m64 is > ((index) <= 32 ? 8 : (index) == 33 ? 4 : 0) > etc. > Or, if you want to do it on the compiler side, instead of predefining > __LIBGCC_DWARF_REG_SIZES_CONSTANT__ and __LIBGCC_DWARF_REG_MAXIMUM__ > register conditionally a new builtin, __builtin_dwarf_reg_size, > which would be defined only if -fbuilding-libgcc and the compiler determines > dwarf_reg_size is desirable to be computed inline without a table and > would fold the builtin to say that > index <= 16U ? 8 : 0 on x86_64 -m64, > index <= 9U ? 4 : index - 11U <= 5U ? 12 : 0 on x86_64 -m32 etc. > and if the expression is too large/complex, wouldn't predefine the builtin. I think the pre-computation of the size array is useful even for targets where the expression is not so simple, but the array elements are still constants. A builtin like __builtin_dwarf_reg_size could use a reference to a constant static array, so that we can get rid of the array initialization code in libgcc. Before we can do that, we need to figure out if the fully dynamic register sizes on AArch64 with SVE are actually correct—and if we need to fix the non-SVE unwinder to work properly for SVE programs. So I don't want to revert the size array computation just yet. > Or, is it actually the use of table that is bad on the unwinder side, > or lack of a small upper bound for what you get from the table? > In that case you could predefine upper bound on the sizes instead (if > constant) and simply add if (size > __LIBGCC_DWARF_REG_SIZE_MAX__) > __builtin_unreachable ()). It also matters what kind of register sizes are used in practice. Looking at the FDE for _Unwind_RaiseException on i686, we only save 4-byte registers there, I think. Perhaps only non-GCC-generated code may exercise the other register sizes? That's different on AArch64, where the vector registers are saved as well. Should I repost this patch with the three nits fixed? Or should I revert two of the three patches I committed instead? Thanks, Florian ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Various fixes for DWARF register size computation 2023-01-03 13:25 ` Florian Weimer @ 2023-01-03 13:54 ` Jakub Jelinek 2023-01-12 16:50 ` Richard Sandiford 0 siblings, 1 reply; 8+ messages in thread From: Jakub Jelinek @ 2023-01-03 13:54 UTC (permalink / raw) To: Florian Weimer; +Cc: gcc-patches, Andre Vieira, Andrew Pinski, Jeff Law On Tue, Jan 03, 2023 at 02:25:21PM +0100, Florian Weimer wrote: > > Though, I still wonder, because all of this is a hack for a single target > > - x86_64-linux -m64 - I think no other target has similar constant > > sizes, > > Really? That's odd. I've tried about 30 cross compilers I had around, I admit it isn't exhaustive. > Is it because other architectures track callee-saved vector registers > through unwinding? I think it is far more than just vector registers, it can be floating point registers, or just one integral or special register of a different size etc. > > Or, if you want to do it on the compiler side, instead of predefining > > __LIBGCC_DWARF_REG_SIZES_CONSTANT__ and __LIBGCC_DWARF_REG_MAXIMUM__ > > register conditionally a new builtin, __builtin_dwarf_reg_size, > > which would be defined only if -fbuilding-libgcc and the compiler determines > > dwarf_reg_size is desirable to be computed inline without a table and > > would fold the builtin to say that > > index <= 16U ? 8 : 0 on x86_64 -m64, > > index <= 9U ? 4 : index - 11U <= 5U ? 12 : 0 on x86_64 -m32 etc. > > and if the expression is too large/complex, wouldn't predefine the builtin. > > I think the pre-computation of the size array is useful even for targets > where the expression is not so simple, but the array elements are still > constants. A builtin like __builtin_dwarf_reg_size could use a > reference to a constant static array, so that we can get rid of the > array initialization code in libgcc. Before we can do that, we need to I think constant static array might be sometimes an option too, but not always, not just because of AArch64, or other potential poly-int arches (RISCV?), but also if something could depend on some runtime check. It is true that most of the time it is constant and depends just on the target or more often on target + options combo (mostly ABI related options). > figure out if the fully dynamic register sizes on AArch64 with SVE are > actually correct—and if we need to fix the non-SVE unwinder to work > properly for SVE programs. > > So I don't want to revert the size array computation just yet. > > > Or, is it actually the use of table that is bad on the unwinder side, > > or lack of a small upper bound for what you get from the table? > > In that case you could predefine upper bound on the sizes instead (if > > constant) and simply add if (size > __LIBGCC_DWARF_REG_SIZE_MAX__) > > __builtin_unreachable ()). > > It also matters what kind of register sizes are used in practice. Yes, but that is hard to find out easily. E.g. some registers might be only saved/restored in signal frames and nowhere else, others only rarely touched but still we'd need their sizes if they are ever used. > Should I repost this patch with the three nits fixed? Or should I > revert two of the three patches I committed instead? I lean towards reversion and trying to figure out something that works on more than one arch. It doesn't have to improve all arches (say AArch64 is clearly a nightmare that isn't handled correctly even without any changes - as noted in the PRs, if libgcc is built without SVE, it will hardcode 8, while if it is built with SVE, it will be runtime dependent and will be wrong in theory when some HW has 2048 bit SVE vectors - when it is 256 bytes), but still watching into what we compile -O2 -nostdinc -fbuilding-libgcc static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1]; void foo (void) { __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table); } on at least 10 most common arches would be useful and optimizing what is easily possible would be nice. Jakub ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Various fixes for DWARF register size computation 2023-01-03 13:54 ` Jakub Jelinek @ 2023-01-12 16:50 ` Richard Sandiford 2023-01-12 17:07 ` Jakub Jelinek 0 siblings, 1 reply; 8+ messages in thread From: Richard Sandiford @ 2023-01-12 16:50 UTC (permalink / raw) To: Jakub Jelinek via Gcc-patches Cc: Florian Weimer, Jakub Jelinek, Andre Vieira, Andrew Pinski, Jeff Law Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > On Tue, Jan 03, 2023 at 02:25:21PM +0100, Florian Weimer wrote: >> > Though, I still wonder, because all of this is a hack for a single target >> > - x86_64-linux -m64 - I think no other target has similar constant >> > sizes, >> >> Really? That's odd. > > I've tried about 30 cross compilers I had around, I admit it isn't > exhaustive. > >> Is it because other architectures track callee-saved vector registers >> through unwinding? > > I think it is far more than just vector registers, it can be floating point > registers, or just one integral or special register of a different size etc. > >> > Or, if you want to do it on the compiler side, instead of predefining >> > __LIBGCC_DWARF_REG_SIZES_CONSTANT__ and __LIBGCC_DWARF_REG_MAXIMUM__ >> > register conditionally a new builtin, __builtin_dwarf_reg_size, >> > which would be defined only if -fbuilding-libgcc and the compiler determines >> > dwarf_reg_size is desirable to be computed inline without a table and >> > would fold the builtin to say that >> > index <= 16U ? 8 : 0 on x86_64 -m64, >> > index <= 9U ? 4 : index - 11U <= 5U ? 12 : 0 on x86_64 -m32 etc. >> > and if the expression is too large/complex, wouldn't predefine the builtin. >> >> I think the pre-computation of the size array is useful even for targets >> where the expression is not so simple, but the array elements are still >> constants. A builtin like __builtin_dwarf_reg_size could use a >> reference to a constant static array, so that we can get rid of the >> array initialization code in libgcc. Before we can do that, we need to > > I think constant static array might be sometimes an option too, but not > always, not just because of AArch64, or other potential poly-int arches > (RISCV?), but also if something could depend on some runtime check. > It is true that most of the time it is constant and depends just on the > target or more often on target + options combo (mostly ABI related options). > >> figure out if the fully dynamic register sizes on AArch64 with SVE are >> actually correct—and if we need to fix the non-SVE unwinder to work >> properly for SVE programs. >> >> So I don't want to revert the size array computation just yet. >> >> > Or, is it actually the use of table that is bad on the unwinder side, >> > or lack of a small upper bound for what you get from the table? >> > In that case you could predefine upper bound on the sizes instead (if >> > constant) and simply add if (size > __LIBGCC_DWARF_REG_SIZE_MAX__) >> > __builtin_unreachable ()). >> >> It also matters what kind of register sizes are used in practice. > > Yes, but that is hard to find out easily. E.g. some registers might be > only saved/restored in signal frames and nowhere else, others only rarely > touched but still we'd need their sizes if they are ever used. > >> Should I repost this patch with the three nits fixed? Or should I >> revert two of the three patches I committed instead? > > I lean towards reversion and trying to figure out something that works > on more than one arch. It doesn't have to improve all arches (say > AArch64 is clearly a nightmare that isn't handled correctly even without > any changes - as noted in the PRs, if libgcc is built without SVE, it will > hardcode 8, while if it is built with SVE, it will be runtime dependent and > will be wrong in theory when some HW has 2048 bit SVE vectors - when it is > 256 bytes), but still watching into what we compile -O2 -nostdinc -fbuilding-libgcc I'm jumping in here without fully understanding the context, so maybe this is exactly your point, but: the SIMD/FP DWARF registers are supposed to be size 8 regardless of which features are enabled. That's already only half of the hardware register size for base Armv8-A, since Advanced SIMD registers are 16 bytes in size. So yeah, if we're using the hardware register size then something is wrong. Thanks, Richard > static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1]; > > void > foo (void) > { > __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table); > } > > on at least 10 most common arches would be useful and optimizing what is > easily possible would be nice. > > Jakub ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Various fixes for DWARF register size computation 2023-01-12 16:50 ` Richard Sandiford @ 2023-01-12 17:07 ` Jakub Jelinek 2023-01-12 21:03 ` Richard Sandiford 0 siblings, 1 reply; 8+ messages in thread From: Jakub Jelinek @ 2023-01-12 17:07 UTC (permalink / raw) To: gcc-patches, Florian Weimer, Andre Vieira, Andrew Pinski, Jeff Law, richard.sandiford [-- Attachment #1: Type: text/plain, Size: 1095 bytes --] On Thu, Jan 12, 2023 at 04:50:07PM +0000, Richard Sandiford wrote: > I'm jumping in here without fully understanding the context, so maybe this > is exactly your point, but: the SIMD/FP DWARF registers are supposed to be > size 8 regardless of which features are enabled. That's already only half > of the hardware register size for base Armv8-A, since Advanced SIMD registers > are 16 bytes in size. > > So yeah, if we're using the hardware register size then something is wrong. I'm talking about what the following compiles to static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1]; void foo (void) { __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table); } (and therefore what libgcc/unwind-dw2.c (init_dwarf_reg_size_table) as well) with -O2 -fbuilding-libgcc -march=armv8-a vs. -O2 -fbuilding-libgcc -march=armv8-a+sve The former is setting I think [0..31, 46, 48..63, 72..79, 96]=8, [64..71, 80..95]=0 (and leaving others untouched, which keeps them 0). While the latter is setting [0..31, 46, 72..79, 96]=8, [64..71, 80..95]=0 and [48..63]=cntd Jakub [-- Attachment #2: 8.s1 --] [-- Type: text/plain, Size: 2365 bytes --] .arch armv8-a .file "8.c" .text .align 2 .align 4 .global foo .type foo, %function foo: .LFB0: adrp x2, .LANCHOR0 add x0, x2, :lo12:.LANCHOR0 mov w1, 8 strb w1, [x2, #:lo12:.LANCHOR0] strb w1, [x0, 1] strb w1, [x0, 2] strb w1, [x0, 3] strb w1, [x0, 4] strb w1, [x0, 5] strb w1, [x0, 6] strb w1, [x0, 7] strb w1, [x0, 8] strb w1, [x0, 9] strb w1, [x0, 10] strb w1, [x0, 11] strb w1, [x0, 12] strb w1, [x0, 13] strb w1, [x0, 14] strb w1, [x0, 15] strb w1, [x0, 16] strb w1, [x0, 17] strb w1, [x0, 18] strb w1, [x0, 19] strb w1, [x0, 20] strb w1, [x0, 21] strb w1, [x0, 22] strb w1, [x0, 23] strb w1, [x0, 24] strb w1, [x0, 25] strb w1, [x0, 26] strb w1, [x0, 27] strb w1, [x0, 28] strb w1, [x0, 29] strb w1, [x0, 30] strb w1, [x0, 31] strb wzr, [x0, 64] strb w1, [x0, 46] strb wzr, [x0, 65] strb wzr, [x0, 66] strb wzr, [x0, 67] strb wzr, [x0, 68] strb wzr, [x0, 69] strb wzr, [x0, 70] strb wzr, [x0, 71] strb w1, [x0, 72] strb w1, [x0, 73] strb w1, [x0, 74] strb w1, [x0, 75] strb w1, [x0, 76] strb w1, [x0, 77] strb w1, [x0, 78] strb w1, [x0, 79] strb wzr, [x0, 80] strb wzr, [x0, 81] strb wzr, [x0, 82] strb wzr, [x0, 83] strb wzr, [x0, 84] strb wzr, [x0, 85] strb wzr, [x0, 86] strb wzr, [x0, 87] strb wzr, [x0, 88] strb wzr, [x0, 89] strb wzr, [x0, 90] strb wzr, [x0, 91] strb wzr, [x0, 92] strb wzr, [x0, 93] strb wzr, [x0, 94] strb wzr, [x0, 95] strb w1, [x0, 48] strb w1, [x0, 49] strb w1, [x0, 50] strb w1, [x0, 51] strb w1, [x0, 52] strb w1, [x0, 53] strb w1, [x0, 54] strb w1, [x0, 55] strb w1, [x0, 56] strb w1, [x0, 57] strb w1, [x0, 58] strb w1, [x0, 59] strb w1, [x0, 60] strb w1, [x0, 61] strb w1, [x0, 62] strb w1, [x0, 63] strb w1, [x0, 96] ret .LFE0: .size foo, .-foo .bss .align 4 .set .LANCHOR0,. + 0 .type dwarf_reg_size_table, %object .size dwarf_reg_size_table, 98 dwarf_reg_size_table: .zero 98 .section .eh_frame,"aw",@progbits .Lframe1: .4byte .LECIE1-.LSCIE1 .LSCIE1: .4byte 0 .byte 0x3 .string "zR" .byte 0x1 .byte 0x78 .byte 0x1e .byte 0x1 .byte 0x1b .byte 0xc .byte 0x1f .byte 0 .align 3 .LECIE1: .LSFDE1: .4byte .LEFDE1-.LASFDE1 .LASFDE1: .4byte .LASFDE1-.Lframe1 .4byte .LFB0-. .4byte .LFE0-.LFB0 .byte 0 .align 3 .LEFDE1: .ident "GCC: (GNU) 13.0.0 20230111 (experimental)" .section .note.GNU-stack,"",@progbits [-- Attachment #3: 8.s2 --] [-- Type: text/plain, Size: 2378 bytes --] .arch armv8-a+sve .file "8.c" .text .align 2 .align 4 .global foo .type foo, %function foo: .LFB0: adrp x3, .LANCHOR0 add x0, x3, :lo12:.LANCHOR0 mov w1, 8 cntd x2 strb w1, [x3, #:lo12:.LANCHOR0] strb w1, [x0, 1] strb w1, [x0, 2] strb w1, [x0, 3] strb w1, [x0, 4] strb w1, [x0, 5] strb w1, [x0, 6] strb w1, [x0, 7] strb w1, [x0, 8] strb w1, [x0, 9] strb w1, [x0, 10] strb w1, [x0, 11] strb w1, [x0, 12] strb w1, [x0, 13] strb w1, [x0, 14] strb w1, [x0, 15] strb w1, [x0, 16] strb w1, [x0, 17] strb w1, [x0, 18] strb w1, [x0, 19] strb w1, [x0, 20] strb w1, [x0, 21] strb w1, [x0, 22] strb w1, [x0, 23] strb w1, [x0, 24] strb w1, [x0, 25] strb w1, [x0, 26] strb w1, [x0, 27] strb w1, [x0, 28] strb w1, [x0, 29] strb w1, [x0, 30] strb w1, [x0, 31] strb wzr, [x0, 64] strb w1, [x0, 46] strb wzr, [x0, 65] strb wzr, [x0, 66] strb wzr, [x0, 67] strb wzr, [x0, 68] strb wzr, [x0, 69] strb wzr, [x0, 70] strb wzr, [x0, 71] strb w1, [x0, 72] strb w1, [x0, 73] strb w1, [x0, 74] strb w1, [x0, 75] strb w1, [x0, 76] strb w1, [x0, 77] strb w1, [x0, 78] strb w1, [x0, 79] strb wzr, [x0, 80] strb wzr, [x0, 81] strb wzr, [x0, 82] strb wzr, [x0, 83] strb wzr, [x0, 84] strb wzr, [x0, 85] strb wzr, [x0, 86] strb wzr, [x0, 87] strb wzr, [x0, 88] strb wzr, [x0, 89] strb wzr, [x0, 90] strb wzr, [x0, 91] strb wzr, [x0, 92] strb wzr, [x0, 93] strb wzr, [x0, 94] strb wzr, [x0, 95] strb w2, [x0, 48] strb w2, [x0, 49] strb w2, [x0, 50] strb w2, [x0, 51] strb w2, [x0, 52] strb w2, [x0, 53] strb w2, [x0, 54] strb w2, [x0, 55] strb w2, [x0, 56] strb w2, [x0, 57] strb w2, [x0, 58] strb w2, [x0, 59] strb w2, [x0, 60] strb w2, [x0, 61] strb w2, [x0, 62] strb w2, [x0, 63] strb w1, [x0, 96] ret .LFE0: .size foo, .-foo .bss .align 3 .set .LANCHOR0,. + 0 .type dwarf_reg_size_table, %object .size dwarf_reg_size_table, 98 dwarf_reg_size_table: .zero 98 .section .eh_frame,"aw",@progbits .Lframe1: .4byte .LECIE1-.LSCIE1 .LSCIE1: .4byte 0 .byte 0x3 .string "zR" .byte 0x1 .byte 0x78 .byte 0x1e .byte 0x1 .byte 0x1b .byte 0xc .byte 0x1f .byte 0 .align 3 .LECIE1: .LSFDE1: .4byte .LEFDE1-.LASFDE1 .LASFDE1: .4byte .LASFDE1-.Lframe1 .4byte .LFB0-. .4byte .LFE0-.LFB0 .byte 0 .align 3 .LEFDE1: .ident "GCC: (GNU) 13.0.0 20230111 (experimental)" .section .note.GNU-stack,"",@progbits ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Various fixes for DWARF register size computation 2023-01-12 17:07 ` Jakub Jelinek @ 2023-01-12 21:03 ` Richard Sandiford 2023-01-13 10:05 ` [pushed] aarch64: Fix DWARF frame register sizes for predicates Richard Sandiford 0 siblings, 1 reply; 8+ messages in thread From: Richard Sandiford @ 2023-01-12 21:03 UTC (permalink / raw) To: Jakub Jelinek Cc: gcc-patches, Florian Weimer, Andre Vieira, Andrew Pinski, Jeff Law Jakub Jelinek <jakub@redhat.com> writes: > On Thu, Jan 12, 2023 at 04:50:07PM +0000, Richard Sandiford wrote: >> I'm jumping in here without fully understanding the context, so maybe this >> is exactly your point, but: the SIMD/FP DWARF registers are supposed to be >> size 8 regardless of which features are enabled. That's already only half >> of the hardware register size for base Armv8-A, since Advanced SIMD registers >> are 16 bytes in size. >> >> So yeah, if we're using the hardware register size then something is wrong. > > I'm talking about what the following compiles to > static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1]; > > void > foo (void) > { > __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table); > } > (and therefore what libgcc/unwind-dw2.c (init_dwarf_reg_size_table) as well) > with -O2 -fbuilding-libgcc -march=armv8-a vs. -O2 -fbuilding-libgcc -march=armv8-a+sve > The former is setting I think [0..31, 46, 48..63, 72..79, 96]=8, [64..71, 80..95]=0 > (and leaving others untouched, which keeps them 0). > While the latter is setting [0..31, 46, 72..79, 96]=8, [64..71, 80..95]=0 > and [48..63]=cntd Ah, interesting. So the SIMD/FP registers are OK, but the predicate registers are causing a problem. I think we should set the predicates to size 0 too, like we do for call-clobbered FP registers. Predicate registers should never need to be represented in CFI. Thanks, Richard ^ permalink raw reply [flat|nested] 8+ messages in thread
* [pushed] aarch64: Fix DWARF frame register sizes for predicates 2023-01-12 21:03 ` Richard Sandiford @ 2023-01-13 10:05 ` Richard Sandiford 0 siblings, 0 replies; 8+ messages in thread From: Richard Sandiford @ 2023-01-13 10:05 UTC (permalink / raw) To: Jakub Jelinek Cc: gcc-patches, Florian Weimer, Andre Vieira, Andrew Pinski, Jeff Law Richard Sandiford <richard.sandiford@arm.com> writes: > Jakub Jelinek <jakub@redhat.com> writes: >> On Thu, Jan 12, 2023 at 04:50:07PM +0000, Richard Sandiford wrote: >>> I'm jumping in here without fully understanding the context, so maybe this >>> is exactly your point, but: the SIMD/FP DWARF registers are supposed to be >>> size 8 regardless of which features are enabled. That's already only half >>> of the hardware register size for base Armv8-A, since Advanced SIMD registers >>> are 16 bytes in size. >>> >>> So yeah, if we're using the hardware register size then something is wrong. >> >> I'm talking about what the following compiles to >> static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1]; >> >> void >> foo (void) >> { >> __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table); >> } >> (and therefore what libgcc/unwind-dw2.c (init_dwarf_reg_size_table) as well) >> with -O2 -fbuilding-libgcc -march=armv8-a vs. -O2 -fbuilding-libgcc -march=armv8-a+sve >> The former is setting I think [0..31, 46, 48..63, 72..79, 96]=8, [64..71, 80..95]=0 >> (and leaving others untouched, which keeps them 0). >> While the latter is setting [0..31, 46, 72..79, 96]=8, [64..71, 80..95]=0 >> and [48..63]=cntd > > Ah, interesting. So the SIMD/FP registers are OK, but the predicate > registers are causing a problem. > > I think we should set the predicates to size 0 too, like we do for > call-clobbered FP registers. Predicate registers should never need > to be represented in CFI. Done with the patch below. Tested on aarch64-linux-gnu & pushed. Thanks Jakub for pointing this out. Richard gcc/ * config/aarch64/aarch64.cc (aarch64_dwarf_frame_reg_mode): New function. (TARGET_DWARF_FRAME_REG_MODE): Define. gcc/testsuite/ * gcc.target/aarch64/dwarf_reg_size_1.c: New test. * gcc.target/aarch64/dwarf_reg_size_2.c: Likewise. --- gcc/config/aarch64/aarch64.cc | 17 ++++++++++++ .../gcc.target/aarch64/dwarf_reg_size_1.c | 27 +++++++++++++++++++ .../gcc.target/aarch64/dwarf_reg_size_2.c | 6 +++++ 3 files changed, 50 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/dwarf_reg_size_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/dwarf_reg_size_2.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 80b71a7b612..2821368756b 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -3443,6 +3443,20 @@ aarch64_debugger_regno (unsigned regno) return DWARF_FRAME_REGISTERS; } +/* Implement TARGET_DWARF_FRAME_REG_MODE. */ +static machine_mode +aarch64_dwarf_frame_reg_mode (int regno) +{ + /* Predicate registers are call-clobbered in the EH ABI (which is + ARM_PCS_AAPCS64), so they should not be described by CFI. + Their size changes as VL changes, so any values computed by + __builtin_init_dwarf_reg_size_table might not be valid for + all frames. */ + if (PR_REGNUM_P (regno)) + return VOIDmode; + return default_dwarf_frame_reg_mode (regno); +} + /* If X is a CONST_DOUBLE, return its bit representation as a constant integer, otherwise return X unmodified. */ static rtx @@ -27900,6 +27914,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_SCHED_REASSOCIATION_WIDTH #define TARGET_SCHED_REASSOCIATION_WIDTH aarch64_reassociation_width +#undef TARGET_DWARF_FRAME_REG_MODE +#define TARGET_DWARF_FRAME_REG_MODE aarch64_dwarf_frame_reg_mode + #undef TARGET_PROMOTED_TYPE #define TARGET_PROMOTED_TYPE aarch64_promoted_type diff --git a/gcc/testsuite/gcc.target/aarch64/dwarf_reg_size_1.c b/gcc/testsuite/gcc.target/aarch64/dwarf_reg_size_1.c new file mode 100644 index 00000000000..cb7666ddaa8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/dwarf_reg_size_1.c @@ -0,0 +1,27 @@ +/* { dg-do run } */ +/* { dg-options "-fbuilding-libgcc" } */ + +static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1]; + +int +main (void) +{ + __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table); + /* X0-X31 and SP. */ + for (int i = 0; i < 32; ++i) + if (dwarf_reg_size_table[i] != 8) + __builtin_abort (); + /* Q0-Q31/Z0-Z31, of which only the low 64 bits of register 8-15 + are saved. */ + for (int i = 64; i < 96; ++i) + if (dwarf_reg_size_table[i] != (i >= 72 && i < 80 ? 8 : 0)) + __builtin_abort (); + /* P0-P15, which are never saved. */ + for (int i = 48; i < 63; ++i) + if (dwarf_reg_size_table[i] != 0) + __builtin_abort (); + /* VG */ + if (dwarf_reg_size_table[46] != 8) + __builtin_abort (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/dwarf_reg_size_2.c b/gcc/testsuite/gcc.target/aarch64/dwarf_reg_size_2.c new file mode 100644 index 00000000000..8b7e6d4a717 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/dwarf_reg_size_2.c @@ -0,0 +1,6 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-fbuilding-libgcc" } */ + +#pragma GCC target "+sve" + +#include "dwarf_reg_size_1.c" -- 2.25.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-01-13 10:05 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-01-03 11:15 [PATCH] Various fixes for DWARF register size computation Florian Weimer 2023-01-03 12:05 ` Jakub Jelinek 2023-01-03 13:25 ` Florian Weimer 2023-01-03 13:54 ` Jakub Jelinek 2023-01-12 16:50 ` Richard Sandiford 2023-01-12 17:07 ` Jakub Jelinek 2023-01-12 21:03 ` Richard Sandiford 2023-01-13 10:05 ` [pushed] aarch64: Fix DWARF frame register sizes for predicates Richard Sandiford
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).