* [PATCH] Avoid default-initializing auto_vec<T, N> storage @ 2023-02-23 12:54 Richard Biener 2023-02-23 13:56 ` Jakub Jelinek 0 siblings, 1 reply; 15+ messages in thread From: Richard Biener @ 2023-02-23 12:54 UTC (permalink / raw) To: gcc-patches; +Cc: Jakub Jelinek The following avoids default-initializing auto_vec storage for non-POD T since that's not what the allocated storage fallback will do and it's also not expected for existing cases like auto_vec<std::pair<unsigned, unsigned>, 64> elts; which exist to optimize the allocation. Bootstrapped and tested on x86_64-unknown-linux-gnu. It saves around 1kB of text size for cc1. OK for trunk? Thanks, Richard. * vec.h (auto_vec<T, N>): Turn m_data storage into uninitialized unsigned char. --- gcc/vec.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/vec.h b/gcc/vec.h index a536b68732d..cee96254a31 100644 --- a/gcc/vec.h +++ b/gcc/vec.h @@ -1584,7 +1584,7 @@ public: private: vec<T, va_heap, vl_embed> m_auto; - T m_data[MAX (N - 1, 1)]; + alignas(T) unsigned char m_data[sizeof (T) * MAX (N - 1, 1)]; }; /* auto_vec is a sub class of vec whose storage is released when it is -- 2.35.3 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-23 12:54 [PATCH] Avoid default-initializing auto_vec<T, N> storage Richard Biener @ 2023-02-23 13:56 ` Jakub Jelinek 2023-02-23 15:02 ` Richard Biener 0 siblings, 1 reply; 15+ messages in thread From: Jakub Jelinek @ 2023-02-23 13:56 UTC (permalink / raw) To: Richard Biener, Jonathan Wakely; +Cc: gcc-patches On Thu, Feb 23, 2023 at 01:54:27PM +0100, Richard Biener wrote: > The following avoids default-initializing auto_vec storage for > non-POD T since that's not what the allocated storage fallback > will do and it's also not expected for existing cases like > > auto_vec<std::pair<unsigned, unsigned>, 64> elts; > > which exist to optimize the allocation. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > It saves around 1kB of text size for cc1. > > OK for trunk? > > Thanks, > Richard. > > * vec.h (auto_vec<T, N>): Turn m_data storage into > uninitialized unsigned char. Given that we actually never reference the m_data array anywhere, it is just to reserve space, I think even the alignas(T) there is useless. The point is that m_auto has as data members: vec_prefix m_vecpfx; T m_vecdata[1]; and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or -fsanitize=bound-sstrict incompatible) being treated as flexible array member flowing into the m_data storage after it. Though, this shows we still have work to do. Because 1) T m_vecdata[1]; still has element type T in m_auto and so is actually constructed/destructed uselessly 2) the non-POD support is certainly incomplete; we have vec_default_construct and vec_copy_construct and use that for say grow_cleared etc. or vector copies, but don't have anything that would invoke the destructors and don't invoke vec_copy_construct (aka placement new) when we just push stuff into the vector; so, I bet what we do is invalid and kind of works for non-PODs which have assignment operator which overwrites everything, doesn't use anything from the overwritten object and has the same overall effect as copy constructor, and only support trivial dtors For full non-POD support, I bet we'd need to vec_copy_construct (..., 1) on quick_push rather than invoke operator= there, we'd need to destruct stuff on pop/truncate etc., and quick_grow would need to be effectively the same as quick_grow_cleared for non-PODs. As for 1), if even m_vecdata was alignas(T) unsigned char m_vecdata[sizeof (T)]; we'll need casts in various spots and it will printing vectors by hand in gdb when the .gdbinit printers don't work properly. Oh, and perhaps we should start marking such spots in GCC with strict_flex_array attribute to make it clear where we rely on the non-strict behavior. > --- a/gcc/vec.h > +++ b/gcc/vec.h > @@ -1584,7 +1584,7 @@ public: > > private: > vec<T, va_heap, vl_embed> m_auto; > - T m_data[MAX (N - 1, 1)]; > + alignas(T) unsigned char m_data[sizeof (T) * MAX (N - 1, 1)]; > }; > > /* auto_vec is a sub class of vec whose storage is released when it is Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-23 13:56 ` Jakub Jelinek @ 2023-02-23 15:02 ` Richard Biener 2023-02-23 15:20 ` Jakub Jelinek 0 siblings, 1 reply; 15+ messages in thread From: Richard Biener @ 2023-02-23 15:02 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jonathan Wakely, gcc-patches On Thu, 23 Feb 2023, Jakub Jelinek wrote: > On Thu, Feb 23, 2023 at 01:54:27PM +0100, Richard Biener wrote: > > The following avoids default-initializing auto_vec storage for > > non-POD T since that's not what the allocated storage fallback > > will do and it's also not expected for existing cases like > > > > auto_vec<std::pair<unsigned, unsigned>, 64> elts; > > > > which exist to optimize the allocation. > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > > > It saves around 1kB of text size for cc1. > > > > OK for trunk? > > > > Thanks, > > Richard. > > > > * vec.h (auto_vec<T, N>): Turn m_data storage into > > uninitialized unsigned char. > > Given that we actually never reference the m_data array anywhere, > it is just to reserve space, I think even the alignas(T) there is > useless. The point is that m_auto has as data members: > vec_prefix m_vecpfx; > T m_vecdata[1]; > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or > -fsanitize=bound-sstrict incompatible) being treated as > flexible array member flowing into the m_data storage after it. Doesn't the array otherwise eventually overlap with tail padding in m_auto? Or does an array of T never produce tail padding? > Though, this shows we still have work to do. Because > 1) T m_vecdata[1]; still has element type T in m_auto and > so is actually constructed/destructed uselessly True. > 2) the non-POD support is certainly incomplete; we have > vec_default_construct and vec_copy_construct and use that > for say grow_cleared etc. or vector copies, but don't have > anything that would invoke the destructors and don't invoke > vec_copy_construct (aka placement new) when we just push > stuff into the vector; so, I bet what we do is invalid > and kind of works for non-PODs which have assignment operator > which overwrites everything, doesn't use anything from the > overwritten object and has the same overall effect > as copy constructor, and only support trivial dtors > For full non-POD support, I bet we'd need to vec_copy_construct (..., 1) > on quick_push rather than invoke operator= there, we'd need > to destruct stuff on pop/truncate etc., and quick_grow would need > to be effectively the same as quick_grow_cleared for non-PODs. > As for 1), if even m_vecdata was > alignas(T) unsigned char m_vecdata[sizeof (T)]; > we'll need casts in various spots and it will printing vectors > by hand in gdb when the .gdbinit printers don't work properly. Yes, I'm not proposing to fix non-POD support. I want to make as-if-POD stuff like std::pair to work like it was intended. > Oh, and perhaps we should start marking such spots in GCC with > strict_flex_array attribute to make it clear where we rely on the > non-strict behavior. I think we never access the array directly as array, do we? Richard. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-23 15:02 ` Richard Biener @ 2023-02-23 15:20 ` Jakub Jelinek 2023-02-24 9:02 ` Richard Biener 0 siblings, 1 reply; 15+ messages in thread From: Jakub Jelinek @ 2023-02-23 15:20 UTC (permalink / raw) To: Richard Biener; +Cc: Jonathan Wakely, gcc-patches On Thu, Feb 23, 2023 at 03:02:01PM +0000, Richard Biener wrote: > > > * vec.h (auto_vec<T, N>): Turn m_data storage into > > > uninitialized unsigned char. > > > > Given that we actually never reference the m_data array anywhere, > > it is just to reserve space, I think even the alignas(T) there is > > useless. The point is that m_auto has as data members: > > vec_prefix m_vecpfx; > > T m_vecdata[1]; > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or > > -fsanitize=bound-sstrict incompatible) being treated as > > flexible array member flowing into the m_data storage after it. > > Doesn't the array otherwise eventually overlap with tail padding > in m_auto? Or does an array of T never produce tail padding? The array can certainly overlap with tail padding in m_auto if any. But whether m_data is aligned to alignof (T) or not doesn't change anything on it. m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, m_num; }, so I think there is on most arches tail padding if T has smaller alignment than int, so typically char/short or structs with the same size/alignments. If that happens, alignof (auto_vec_x.m_auto) will be alignof (int), there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto) is 3 * sizeof (int), m_data will have offset always aligned to alignof (T). If alignof (T) >= alignof (int), then there won't be any tail padding at the end of m_auto, there could be padding between m_vecpfx and m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and so m_data will be again already properly aligned. So, I think your patch is fine without alignas(T), the rest is just that there is more work to do incrementally, even for the case you want to deal with (the point 1) in particular). > Yes, I'm not proposing to fix non-POD support. I want to make > as-if-POD stuff like std::pair to work like it was intended. > > > Oh, and perhaps we should start marking such spots in GCC with > > strict_flex_array attribute to make it clear where we rely on the > > non-strict behavior. > > I think we never access the array directly as array, do we? Sure, the attribute should go to m_vecdata array, not to m_data. And to op array in gimple_statement_with_ops, operands array in operands, ops array in tree_omp_clause, val in tree_int_cst, str in tree_string, elts in tree_vector, a in tree_vec, elem in rtvec_def, elem in hwivec_def, u.{fld,hwint} in rtx_def and various others, we use this stuff everywhere. Also often use GTY length similarly to the proposed element_count... Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-23 15:20 ` Jakub Jelinek @ 2023-02-24 9:02 ` Richard Biener 2023-02-24 9:34 ` Richard Biener 0 siblings, 1 reply; 15+ messages in thread From: Richard Biener @ 2023-02-24 9:02 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jonathan Wakely, gcc-patches On Thu, 23 Feb 2023, Jakub Jelinek wrote: > On Thu, Feb 23, 2023 at 03:02:01PM +0000, Richard Biener wrote: > > > > * vec.h (auto_vec<T, N>): Turn m_data storage into > > > > uninitialized unsigned char. > > > > > > Given that we actually never reference the m_data array anywhere, > > > it is just to reserve space, I think even the alignas(T) there is > > > useless. The point is that m_auto has as data members: > > > vec_prefix m_vecpfx; > > > T m_vecdata[1]; > > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or > > > -fsanitize=bound-sstrict incompatible) being treated as > > > flexible array member flowing into the m_data storage after it. > > > > Doesn't the array otherwise eventually overlap with tail padding > > in m_auto? Or does an array of T never produce tail padding? > > The array can certainly overlap with tail padding in m_auto if any. > But whether m_data is aligned to alignof (T) or not doesn't change anything > on it. > m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, m_num; }, > so I think there is on most arches tail padding if T has smaller alignment > than int, so typically char/short or structs with the same size/alignments. > If that happens, alignof (auto_vec_x.m_auto) will be alignof (int), > there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto) > is 3 * sizeof (int), m_data will have offset always aligned to alignof (T). > If alignof (T) >= alignof (int), then there won't be any tail padding > at the end of m_auto, there could be padding between m_vecpfx and > m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and > so m_data will be again already properly aligned. > > So, I think your patch is fine without alignas(T), the rest is just that > there is more work to do incrementally, even for the case you want to > deal with (the point 1) in particular). Looking at vec<T, A, vl_embed>::operator[] which just does template<typename T, typename A> inline const T & vec<T, A, vl_embed>::operator[] (unsigned ix) const { gcc_checking_assert (ix < m_vecpfx.m_num); return m_vecdata[ix]; } the whole thing looks fragile at best - we basically have struct auto_vec { struct vec<vl_embed> { ... T m_vecdata[1]; } m_auto; T m_data[N-1]; }; and access m_auto.m_vecdata[] as if it extends to m_data. That's not something supported by the middle-end - not by design at least. auto_vec *p; p->m_auto.m_vecdata[i] would never alias p->m_data[j], in practice we might not see this though. Also get_ref_base_and_extent will compute a maxsize/size of sizeof(T) for any m_auto.m_vecdata[i] access, but I think we nowhere actually replace 'i' by zero based on this knowledge, but we'd perform CSE with earlier m_auto.m_vecdata[0] stores, so that might be something one could provoke. Doing a self-test like static __attribute__((noipa)) void test_auto_alias (int i) { auto_vec<int, 8> v; v.quick_grow (2); v[0] = 1; v[1] = 2; int val = v[i]; ASSERT_EQ (val, 2); } shows _27 = &_25->m_vecdata[0]; *_27 = 1; ... _7 = &_12->m_vecdata[i.235_3]; val_13 = *_7; which is safe in middle-end rules though. So what "saves" us here is that we always return a reference and never a value. There's the ::iterate member function which fails to do this, the ::quick_push function does T *slot = &m_vecdata[m_vecpfx.m_num++]; *slot = obj; with static __attribute__((noipa)) void test_auto_alias (int i) { auto_vec<int, 8> v; v.quick_grow (2); v[0] = 1; v[1] = 2; int val; for (int ix = i; v.iterate (ix, &val); ix++) ; ASSERT_EQ (val, 2); } I get that optimzied to a FAIL. I have a "fix" for this. unordered_remove has a similar issue accesing the last element. There are a few functions using the [] access member which is at least sub-optimal due to repeated bounds checking but also safe. I suppose if auto_vec would be a union of vec<vl_embed> and a storage member with the vl_embed active that would work, but then that's likely not something C++11 supports. So I think to support auto_vec we'd need to make the m_vecdata[] member in vec<vl_embed> of templated size (defaulted to 1) and get rid of the m_data member in auto_vec instead. Or have another C++ way of increasing the size of auto_vec without actually adding any member? The vec<vl_embed> data accesses then would need to go through a wrapper obtaining a correctly typed pointer to m_vecdata[] since we'd like to have that as unsigned char[] to avoid the initialization. > > Yes, I'm not proposing to fix non-POD support. I want to make > > as-if-POD stuff like std::pair to work like it was intended. > > > > > Oh, and perhaps we should start marking such spots in GCC with > > > strict_flex_array attribute to make it clear where we rely on the > > > non-strict behavior. > > > > I think we never access the array directly as array, do we? > > Sure, the attribute should go to m_vecdata array, not to m_data. > And to op array in gimple_statement_with_ops, operands array in > operands, ops array in tree_omp_clause, val in tree_int_cst, > str in tree_string, elts in tree_vector, a in tree_vec, elem in rtvec_def, > elem in hwivec_def, u.{fld,hwint} in rtx_def and various others, we > use this stuff everywhere. Also often use GTY length similarly to the > proposed element_count... Here's a patch to fix vec.h to adhere to GCC middle-ends interpretation of array accesses - &array[i] is just address arithmetic and out-of-bounds accesses are OK. But that doesn't prevent other host compilers from miscompiling stage1 I guess, I'm not sure if there's any standard conforming way to compute the address of an element in m_data from the address of m_vecdata? Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK? Thanks, Richard. From b7d47dd1c166d216eba7160f11087312f8b5bbef Mon Sep 17 00:00:00 2001 From: Richard Biener <rguenther@suse.de> Date: Fri, 24 Feb 2023 09:54:09 +0100 Subject: [PATCH] Fix vec.h alias problems To: gcc-patches@gcc.gnu.org With auto_vec we have embedded storage that spans two members, the m_auto.m_vecdata[1] and the m_data[N-1] arrays and accesses to the data are all based on m_auto.m_vecdata[]. That's OK for GCC as long as the accesses are done indirectly through a pointer since the address computation based on m_auto.m_vecdata[] is considered OK by GCC (but not by the language standards). The following fixes the few places that failed to enfoce this indirection and adds one self-test that failed before. As I was here it also fixes ::contains to avoid repeated bounds checking and the same issue in :;lower_bound which also suffers from unnecessary copying around values. * vec.h (vec<T, A, vl_embed>::lower_bound): Adjust to take a const reference to the object, access m_vecdata directly. (vec<T, A, vl_embed>::contains): Likewise. (vec<T, A, vl_embed>::iterate): Perform accesses to m_vecdata indirectly. (vec<T, A, vl_embed>::unordered_remove): Likewise. * vec.cc (test_auto_alias): New. (vec_cc_tests): Call it. --- gcc/vec.cc | 17 +++++++++++++++++ gcc/vec.h | 21 ++++++++++++++------- 2 files changed, 31 insertions(+), 7 deletions(-) diff --git a/gcc/vec.cc b/gcc/vec.cc index 511e6dff50d..afba66cb3d0 100644 --- a/gcc/vec.cc +++ b/gcc/vec.cc @@ -568,6 +568,22 @@ test_auto_delete_vec () ASSERT_EQ (dtor_count, 2); } +/* Verify accesses to m_vecdata are done indirectly. */ + +static void +test_auto_alias () +{ + volatile int i = 1; + auto_vec<int, 8> v; + v.quick_grow (2); + v[0] = 1; + v[1] = 2; + int val; + for (int ix = i; v.iterate (ix, &val); ix++) + ; + ASSERT_EQ (val, 2); +} + /* Run all of the selftests within this file. */ void @@ -587,6 +603,7 @@ vec_cc_tests () test_qsort (); test_reverse (); test_auto_delete_vec (); + test_auto_alias (); } } // namespace selftest diff --git a/gcc/vec.h b/gcc/vec.h index cee96254a31..b3ec977c23a 100644 --- a/gcc/vec.h +++ b/gcc/vec.h @@ -614,7 +614,7 @@ public: T *bsearch (const void *key, int (*compar)(const void *, const void *)); T *bsearch (const void *key, int (*compar)(const void *, const void *, void *), void *); - unsigned lower_bound (T, bool (*)(const T &, const T &)) const; + unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const; bool contains (const T &search) const; static size_t embedded_size (unsigned); void embedded_init (unsigned, unsigned = 0, unsigned = 0); @@ -929,7 +929,8 @@ vec<T, A, vl_embed>::iterate (unsigned ix, T *ptr) const { if (ix < m_vecpfx.m_num) { - *ptr = m_vecdata[ix]; + const T *slot = &m_vecdata[ix]; + *ptr = *slot; return true; } else @@ -1118,7 +1119,9 @@ inline void vec<T, A, vl_embed>::unordered_remove (unsigned ix) { gcc_checking_assert (ix < length ()); - m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num]; + const T *last_slot = &m_vecdata[--m_vecpfx.m_num]; + T *slot = &m_vecdata[ix]; + *slot = *last_slot; } @@ -1249,8 +1252,11 @@ vec<T, A, vl_embed>::contains (const T &search) const { unsigned int len = length (); for (unsigned int i = 0; i < len; i++) - if ((*this)[i] == search) - return true; + { + const T *slot = &m_vecdata[i]; + if (*slot == search) + return true; + } return false; } @@ -1262,7 +1268,8 @@ vec<T, A, vl_embed>::contains (const T &search) const template<typename T, typename A> unsigned -vec<T, A, vl_embed>::lower_bound (T obj, bool (*lessthan)(const T &, const T &)) +vec<T, A, vl_embed>::lower_bound (const T &obj, + bool (*lessthan)(const T &, const T &)) const { unsigned int len = length (); @@ -1273,7 +1280,7 @@ vec<T, A, vl_embed>::lower_bound (T obj, bool (*lessthan)(const T &, const T &)) half = len / 2; middle = first; middle += half; - T middle_elem = (*this)[middle]; + const T &middle_elem = this->m_vecdata[middle]; if (lessthan (middle_elem, obj)) { first = middle; -- 2.35.3 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 9:02 ` Richard Biener @ 2023-02-24 9:34 ` Richard Biener 2023-02-24 9:48 ` Jakub Jelinek 0 siblings, 1 reply; 15+ messages in thread From: Richard Biener @ 2023-02-24 9:34 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jonathan Wakely, gcc-patches On Fri, 24 Feb 2023, Richard Biener wrote: > On Thu, 23 Feb 2023, Jakub Jelinek wrote: > > > On Thu, Feb 23, 2023 at 03:02:01PM +0000, Richard Biener wrote: > > > > > * vec.h (auto_vec<T, N>): Turn m_data storage into > > > > > uninitialized unsigned char. > > > > > > > > Given that we actually never reference the m_data array anywhere, > > > > it is just to reserve space, I think even the alignas(T) there is > > > > useless. The point is that m_auto has as data members: > > > > vec_prefix m_vecpfx; > > > > T m_vecdata[1]; > > > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or > > > > -fsanitize=bound-sstrict incompatible) being treated as > > > > flexible array member flowing into the m_data storage after it. > > > > > > Doesn't the array otherwise eventually overlap with tail padding > > > in m_auto? Or does an array of T never produce tail padding? > > > > The array can certainly overlap with tail padding in m_auto if any. > > But whether m_data is aligned to alignof (T) or not doesn't change anything > > on it. > > m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, m_num; }, > > so I think there is on most arches tail padding if T has smaller alignment > > than int, so typically char/short or structs with the same size/alignments. > > If that happens, alignof (auto_vec_x.m_auto) will be alignof (int), > > there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto) > > is 3 * sizeof (int), m_data will have offset always aligned to alignof (T). > > If alignof (T) >= alignof (int), then there won't be any tail padding > > at the end of m_auto, there could be padding between m_vecpfx and > > m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and > > so m_data will be again already properly aligned. > > > > So, I think your patch is fine without alignas(T), the rest is just that > > there is more work to do incrementally, even for the case you want to > > deal with (the point 1) in particular). > > Looking at vec<T, A, vl_embed>::operator[] which just does > > template<typename T, typename A> > inline const T & > vec<T, A, vl_embed>::operator[] (unsigned ix) const > { > gcc_checking_assert (ix < m_vecpfx.m_num); > return m_vecdata[ix]; > } > > the whole thing looks fragile at best - we basically have > > struct auto_vec > { > struct vec<vl_embed> > { > ... > T m_vecdata[1]; > } m_auto; > T m_data[N-1]; > }; > > and access m_auto.m_vecdata[] as if it extends to m_data. That's > not something supported by the middle-end - not by design at least. > auto_vec *p; p->m_auto.m_vecdata[i] would never alias > p->m_data[j], in practice we might not see this though. Also > get_ref_base_and_extent will compute a maxsize/size of sizeof(T) > for any m_auto.m_vecdata[i] access, but I think we nowhere > actually replace 'i' by zero based on this knowledge, but we'd > perform CSE with earlier m_auto.m_vecdata[0] stores, so that > might be something one could provoke. Doing a self-test like > > static __attribute__((noipa)) void > test_auto_alias (int i) > { > auto_vec<int, 8> v; > v.quick_grow (2); > v[0] = 1; > v[1] = 2; > int val = v[i]; > ASSERT_EQ (val, 2); > } > > shows > > _27 = &_25->m_vecdata[0]; > *_27 = 1; > ... > _7 = &_12->m_vecdata[i.235_3]; > val_13 = *_7; > > which is safe in middle-end rules though. So what "saves" us > here is that we always return a reference and never a value. > There's the ::iterate member function which fails to do this, > the ::quick_push function does > > T *slot = &m_vecdata[m_vecpfx.m_num++]; > *slot = obj; > > with > > static __attribute__((noipa)) void > test_auto_alias (int i) > { > auto_vec<int, 8> v; > v.quick_grow (2); > v[0] = 1; > v[1] = 2; > int val; > for (int ix = i; v.iterate (ix, &val); ix++) > ; > ASSERT_EQ (val, 2); > } > > I get that optimzied to a FAIL. I have a "fix" for this. > unordered_remove has a similar issue accesing the last element. Turns out forwprop "breaks" this still, so the fix doesn't work. That means we have a hole here in the middle-end. We can avoid this by obfuscating things even more, like to const T *first = m_vecdata; const T *slot = first + ix; *ptr = *slot; which at least for variable 'ix' avoids forwprop from triggering. But this also means that the existing [] accessor isn't really safe, we're just lucky that we turn constant accesses to MEM[(int &)&v + off] = val; and that we now have PR108355 which made the get_ref_base_and_extent info used less often in VN. I'm testing the patch now without the new selftest, it should be good to avoid these issues for constant indexes. I can also split the patch up. But in the end I think we have to fix auto_vec in a better way. Richard. > There are a few functions using the [] access member which is > at least sub-optimal due to repeated bounds checking but also safe. > > I suppose if auto_vec would be a union of vec<vl_embed> and > a storage member with the vl_embed active that would work, but then > that's likely not something C++11 supports. > > So I think to support auto_vec we'd need to make the m_vecdata[] > member in vec<vl_embed> of templated size (defaulted to 1) > and get rid of the m_data member in auto_vec instead. Or have > another C++ way of increasing the size of auto_vec without > actually adding any member? > > The vec<vl_embed> data accesses then would need to go through > a wrapper obtaining a correctly typed pointer to m_vecdata[] > since we'd like to have that as unsigned char[] to avoid the > initialization. > > > > Yes, I'm not proposing to fix non-POD support. I want to make > > > as-if-POD stuff like std::pair to work like it was intended. > > > > > > > Oh, and perhaps we should start marking such spots in GCC with > > > > strict_flex_array attribute to make it clear where we rely on the > > > > non-strict behavior. > > > > > > I think we never access the array directly as array, do we? > > > > Sure, the attribute should go to m_vecdata array, not to m_data. > > And to op array in gimple_statement_with_ops, operands array in > > operands, ops array in tree_omp_clause, val in tree_int_cst, > > str in tree_string, elts in tree_vector, a in tree_vec, elem in rtvec_def, > > elem in hwivec_def, u.{fld,hwint} in rtx_def and various others, we > > use this stuff everywhere. Also often use GTY length similarly to the > > proposed element_count... > > Here's a patch to fix vec.h to adhere to GCC middle-ends interpretation > of array accesses - &array[i] is just address arithmetic and out-of-bounds > accesses are OK. But that doesn't prevent other host compilers from > miscompiling stage1 I guess, I'm not sure if there's any standard > conforming way to compute the address of an element in m_data from > the address of m_vecdata? > > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK? > > Thanks, > Richard. > > > From b7d47dd1c166d216eba7160f11087312f8b5bbef Mon Sep 17 00:00:00 2001 > From: Richard Biener <rguenther@suse.de> > Date: Fri, 24 Feb 2023 09:54:09 +0100 > Subject: [PATCH] Fix vec.h alias problems > To: gcc-patches@gcc.gnu.org > > With auto_vec we have embedded storage that spans two members, > the m_auto.m_vecdata[1] and the m_data[N-1] arrays and accesses > to the data are all based on m_auto.m_vecdata[]. That's OK for > GCC as long as the accesses are done indirectly through a pointer > since the address computation based on m_auto.m_vecdata[] is > considered OK by GCC (but not by the language standards). > > The following fixes the few places that failed to enfoce this > indirection and adds one self-test that failed before. > > As I was here it also fixes ::contains to avoid repeated bounds > checking and the same issue in :;lower_bound which also suffers > from unnecessary copying around values. > > * vec.h (vec<T, A, vl_embed>::lower_bound): Adjust to > take a const reference to the object, access m_vecdata > directly. > (vec<T, A, vl_embed>::contains): Likewise. > (vec<T, A, vl_embed>::iterate): Perform accesses to > m_vecdata indirectly. > (vec<T, A, vl_embed>::unordered_remove): Likewise. > * vec.cc (test_auto_alias): New. > (vec_cc_tests): Call it. > --- > gcc/vec.cc | 17 +++++++++++++++++ > gcc/vec.h | 21 ++++++++++++++------- > 2 files changed, 31 insertions(+), 7 deletions(-) > > diff --git a/gcc/vec.cc b/gcc/vec.cc > index 511e6dff50d..afba66cb3d0 100644 > --- a/gcc/vec.cc > +++ b/gcc/vec.cc > @@ -568,6 +568,22 @@ test_auto_delete_vec () > ASSERT_EQ (dtor_count, 2); > } > > +/* Verify accesses to m_vecdata are done indirectly. */ > + > +static void > +test_auto_alias () > +{ > + volatile int i = 1; > + auto_vec<int, 8> v; > + v.quick_grow (2); > + v[0] = 1; > + v[1] = 2; > + int val; > + for (int ix = i; v.iterate (ix, &val); ix++) > + ; > + ASSERT_EQ (val, 2); > +} > + > /* Run all of the selftests within this file. */ > > void > @@ -587,6 +603,7 @@ vec_cc_tests () > test_qsort (); > test_reverse (); > test_auto_delete_vec (); > + test_auto_alias (); > } > > } // namespace selftest > diff --git a/gcc/vec.h b/gcc/vec.h > index cee96254a31..b3ec977c23a 100644 > --- a/gcc/vec.h > +++ b/gcc/vec.h > @@ -614,7 +614,7 @@ public: > T *bsearch (const void *key, int (*compar)(const void *, const void *)); > T *bsearch (const void *key, > int (*compar)(const void *, const void *, void *), void *); > - unsigned lower_bound (T, bool (*)(const T &, const T &)) const; > + unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const; > bool contains (const T &search) const; > static size_t embedded_size (unsigned); > void embedded_init (unsigned, unsigned = 0, unsigned = 0); > @@ -929,7 +929,8 @@ vec<T, A, vl_embed>::iterate (unsigned ix, T *ptr) const > { > if (ix < m_vecpfx.m_num) > { > - *ptr = m_vecdata[ix]; > + const T *slot = &m_vecdata[ix]; > + *ptr = *slot; > return true; > } > else > @@ -1118,7 +1119,9 @@ inline void > vec<T, A, vl_embed>::unordered_remove (unsigned ix) > { > gcc_checking_assert (ix < length ()); > - m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num]; > + const T *last_slot = &m_vecdata[--m_vecpfx.m_num]; > + T *slot = &m_vecdata[ix]; > + *slot = *last_slot; > } > > > @@ -1249,8 +1252,11 @@ vec<T, A, vl_embed>::contains (const T &search) const > { > unsigned int len = length (); > for (unsigned int i = 0; i < len; i++) > - if ((*this)[i] == search) > - return true; > + { > + const T *slot = &m_vecdata[i]; > + if (*slot == search) > + return true; > + } > > return false; > } > @@ -1262,7 +1268,8 @@ vec<T, A, vl_embed>::contains (const T &search) const > > template<typename T, typename A> > unsigned > -vec<T, A, vl_embed>::lower_bound (T obj, bool (*lessthan)(const T &, const T &)) > +vec<T, A, vl_embed>::lower_bound (const T &obj, > + bool (*lessthan)(const T &, const T &)) > const > { > unsigned int len = length (); > @@ -1273,7 +1280,7 @@ vec<T, A, vl_embed>::lower_bound (T obj, bool (*lessthan)(const T &, const T &)) > half = len / 2; > middle = first; > middle += half; > - T middle_elem = (*this)[middle]; > + const T &middle_elem = this->m_vecdata[middle]; > if (lessthan (middle_elem, obj)) > { > first = middle; > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 9:34 ` Richard Biener @ 2023-02-24 9:48 ` Jakub Jelinek 2023-02-24 9:50 ` Jonathan Wakely 0 siblings, 1 reply; 15+ messages in thread From: Jakub Jelinek @ 2023-02-24 9:48 UTC (permalink / raw) To: Richard Biener; +Cc: Jonathan Wakely, gcc-patches On Fri, Feb 24, 2023 at 09:34:46AM +0000, Richard Biener wrote: > > Looking at vec<T, A, vl_embed>::operator[] which just does > > > > template<typename T, typename A> > > inline const T & > > vec<T, A, vl_embed>::operator[] (unsigned ix) const > > { > > gcc_checking_assert (ix < m_vecpfx.m_num); > > return m_vecdata[ix]; > > } > > > > the whole thing looks fragile at best - we basically have > > > > struct auto_vec > > { > > struct vec<vl_embed> > > { > > ... > > T m_vecdata[1]; > > } m_auto; > > T m_data[N-1]; > > }; Assuming a compiler handles the T m_vecdata[1]; as flexible array member like (which we need because standard C++ doesn't have flexible array members nor [0] arrays), I wonder if we instead of the m_auto followed by m_data trick couldn't make auto_vec have alignas(vec<vl_embed>) unsigned char buf m_data[sizeof (vec<vl_embed>) + (N - 1) * sizeof (T)]; and do a placement new of vec<vl_embed> into that m_data during auto_vec construction. Isn't it then similar to how are flexible array members normally used in C, where one uses malloc or alloca to allocate storage for them and the storage can be larger than the structure itself and flexible array member then can use storage after it? Though, of course, we'd need to test it with various compilers, GCC 4.8 till now, various versions of clang, ICC, ... Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 9:48 ` Jakub Jelinek @ 2023-02-24 9:50 ` Jonathan Wakely 2023-02-24 9:55 ` Jakub Jelinek 2023-02-24 9:55 ` Jonathan Wakely 0 siblings, 2 replies; 15+ messages in thread From: Jonathan Wakely @ 2023-02-24 9:50 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote: > > Assuming a compiler handles the T m_vecdata[1]; as flexible array member > like (which we need because standard C++ doesn't have flexible array members > nor [0] arrays), I wonder if we instead of the m_auto followed by m_data > trick couldn't make auto_vec have > alignas(vec<vl_embed>) unsigned char buf m_data[sizeof (vec<vl_embed>) + (N - 1) * sizeof (T)]; > and do a placement new of vec<vl_embed> into that m_data during auto_vec > construction. Isn't it then similar to how are flexible array members > normally used in C, where one uses malloc or alloca to allocate storage > for them and the storage can be larger than the structure itself and > flexible array member then can use storage after it? You would still be accessing past the end of the vec<vl_embed>::m_vecdata array which is UB. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 9:50 ` Jonathan Wakely @ 2023-02-24 9:55 ` Jakub Jelinek 2023-02-24 9:55 ` Jonathan Wakely 1 sibling, 0 replies; 15+ messages in thread From: Jakub Jelinek @ 2023-02-24 9:55 UTC (permalink / raw) To: Jonathan Wakely; +Cc: Richard Biener, gcc-patches On Fri, Feb 24, 2023 at 09:50:33AM +0000, Jonathan Wakely wrote: > On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote: > > > > Assuming a compiler handles the T m_vecdata[1]; as flexible array member > > like (which we need because standard C++ doesn't have flexible array members > > nor [0] arrays), I wonder if we instead of the m_auto followed by m_data > > trick couldn't make auto_vec have > > alignas(vec<vl_embed>) unsigned char buf m_data[sizeof (vec<vl_embed>) + (N - 1) * sizeof (T)]; > > and do a placement new of vec<vl_embed> into that m_data during auto_vec > > construction. Isn't it then similar to how are flexible array members > > normally used in C, where one uses malloc or alloca to allocate storage > > for them and the storage can be larger than the structure itself and > > flexible array member then can use storage after it? > > You would still be accessing past the end of the > vec<vl_embed>::m_vecdata array which is UB. Pedantically sure, but because C++ doesn't have flexible array members, people in the wild use the flexible array member like arrays for that purpose. If there was T m_vecdata[];, would it still be UB (with the flexible array member extensions)? We could use T m_vecdata[]; if the host compiler supports them and T m_vecdata[1]; otherwise in the hope that the compiler handles it similarly. After all, I think lots of other real-world programs do the same. Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 9:50 ` Jonathan Wakely 2023-02-24 9:55 ` Jakub Jelinek @ 2023-02-24 9:55 ` Jonathan Wakely 2023-02-24 10:02 ` Jakub Jelinek 1 sibling, 1 reply; 15+ messages in thread From: Jonathan Wakely @ 2023-02-24 9:55 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches On Fri, 24 Feb 2023 at 09:50, Jonathan Wakely <jwakely@redhat.com> wrote: > > On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote: > > > > Assuming a compiler handles the T m_vecdata[1]; as flexible array member > > like (which we need because standard C++ doesn't have flexible array members > > nor [0] arrays), I wonder if we instead of the m_auto followed by m_data > > trick couldn't make auto_vec have > > alignas(vec<vl_embed>) unsigned char buf m_data[sizeof (vec<vl_embed>) + (N - 1) * sizeof (T)]; > > and do a placement new of vec<vl_embed> into that m_data during auto_vec > > construction. Isn't it then similar to how are flexible array members > > normally used in C, where one uses malloc or alloca to allocate storage > > for them and the storage can be larger than the structure itself and > > flexible array member then can use storage after it? > > You would still be accessing past the end of the > vec<vl_embed>::m_vecdata array which is UB. My thinking is something like: // New tag type struct vl_relative { }; // This must only be used as a member subobject of another type // which provides the trailing storage. template<typename T> struct vec<T, va_heap, vl_relative> { T *address (void) { return (T*)(m_vecpfx+1); } const T *address (void) const { return (T*)(m_vecpfx+1); } alignas(T) alignas(vec_prefix) vec_prefix m_vecpfx; }; template<typename T, size_t N /* = 0 */> class auto_vec : public vec<T, va_heap> { // ... private: vec<T, va_heap, vl_relative> m_head; T m_data[N]; static_assert(...); }; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 9:55 ` Jonathan Wakely @ 2023-02-24 10:02 ` Jakub Jelinek 2023-02-24 10:24 ` Jakub Jelinek 0 siblings, 1 reply; 15+ messages in thread From: Jakub Jelinek @ 2023-02-24 10:02 UTC (permalink / raw) To: Jonathan Wakely; +Cc: Richard Biener, gcc-patches On Fri, Feb 24, 2023 at 09:55:13AM +0000, Jonathan Wakely wrote: > > You would still be accessing past the end of the > > vec<vl_embed>::m_vecdata array which is UB. > > My thinking is something like: > > // New tag type > struct vl_relative { }; > > // This must only be used as a member subobject of another type > // which provides the trailing storage. > template<typename T> > struct vec<T, va_heap, vl_relative> > { > T *address (void) { return (T*)(m_vecpfx+1); } > const T *address (void) const { return (T*)(m_vecpfx+1); } > > alignas(T) alignas(vec_prefix) vec_prefix m_vecpfx; > }; > > template<typename T, size_t N /* = 0 */> > class auto_vec : public vec<T, va_heap> > { > // ... > private: > vec<T, va_heap, vl_relative> m_head; > T m_data[N]; > > static_assert(...); > }; Maybe this would work, vl_relative even could be vl_embed. Because vl_embed I believe is used in two spots, part of auto_vec where it is followed by m_data and on heap or GGC allocated memory where vec<..., vl_embed> is followed by further storage for the vector. Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 10:02 ` Jakub Jelinek @ 2023-02-24 10:24 ` Jakub Jelinek 2023-02-24 10:30 ` Jonathan Wakely 0 siblings, 1 reply; 15+ messages in thread From: Jakub Jelinek @ 2023-02-24 10:24 UTC (permalink / raw) To: Jonathan Wakely, Richard Biener, gcc-patches On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches wrote: > Maybe this would work, vl_relative even could be vl_embed. > Because vl_embed I believe is used in two spots, part of > auto_vec where it is followed by m_data and on heap or GGC > allocated memory where vec<..., vl_embed> is followed by > further storage for the vector. So roughtly something like below? Except I get weird crashes with it in the gen* tools. And we'd need to adjust the gdb python hooks which also use m_vecdata. --- gcc/vec.h.jj 2023-01-02 09:32:32.177143804 +0100 +++ gcc/vec.h 2023-02-24 11:19:37.900157177 +0100 @@ -586,8 +586,8 @@ public: unsigned allocated (void) const { return m_vecpfx.m_alloc; } unsigned length (void) const { return m_vecpfx.m_num; } bool is_empty (void) const { return m_vecpfx.m_num == 0; } - T *address (void) { return m_vecdata; } - const T *address (void) const { return m_vecdata; } + T *address (void) { return (T *) (this + 1); } + const T *address (void) const { return (const T *) (this + 1); } T *begin () { return address (); } const T *begin () const { return address (); } T *end () { return address () + length (); } @@ -629,10 +629,9 @@ public: friend struct va_gc_atomic; friend struct va_heap; - /* FIXME - These fields should be private, but we need to cater to + /* FIXME - This field should be private, but we need to cater to compilers that have stricter notions of PODness for types. */ - vec_prefix m_vecpfx; - T m_vecdata[1]; + alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx; }; @@ -879,7 +878,7 @@ inline const T & vec<T, A, vl_embed>::operator[] (unsigned ix) const { gcc_checking_assert (ix < m_vecpfx.m_num); - return m_vecdata[ix]; + return address ()[ix]; } template<typename T, typename A> @@ -887,7 +886,7 @@ inline T & vec<T, A, vl_embed>::operator[] (unsigned ix) { gcc_checking_assert (ix < m_vecpfx.m_num); - return m_vecdata[ix]; + return address ()[ix]; } @@ -929,7 +928,7 @@ vec<T, A, vl_embed>::iterate (unsigned i { if (ix < m_vecpfx.m_num) { - *ptr = m_vecdata[ix]; + *ptr = address ()[ix]; return true; } else @@ -955,7 +954,7 @@ vec<T, A, vl_embed>::iterate (unsigned i { if (ix < m_vecpfx.m_num) { - *ptr = CONST_CAST (T *, &m_vecdata[ix]); + *ptr = CONST_CAST (T *, &address ()[ix]); return true; } else @@ -978,7 +977,7 @@ vec<T, A, vl_embed>::copy (ALONE_MEM_STA { vec_alloc (new_vec, len PASS_MEM_STAT); new_vec->embedded_init (len, len); - vec_copy_construct (new_vec->address (), m_vecdata, len); + vec_copy_construct (new_vec->address (), address (), len); } return new_vec; } @@ -1018,7 +1017,7 @@ inline T * vec<T, A, vl_embed>::quick_push (const T &obj) { gcc_checking_assert (space (1)); - T *slot = &m_vecdata[m_vecpfx.m_num++]; + T *slot = &address ()[m_vecpfx.m_num++]; *slot = obj; return slot; } @@ -1031,7 +1030,7 @@ inline T & vec<T, A, vl_embed>::pop (void) { gcc_checking_assert (length () > 0); - return m_vecdata[--m_vecpfx.m_num]; + return address ()[--m_vecpfx.m_num]; } @@ -1056,7 +1055,7 @@ vec<T, A, vl_embed>::quick_insert (unsig { gcc_checking_assert (length () < allocated ()); gcc_checking_assert (ix <= length ()); - T *slot = &m_vecdata[ix]; + T *slot = &address ()[ix]; memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T)); *slot = obj; } @@ -1071,7 +1070,7 @@ inline void vec<T, A, vl_embed>::ordered_remove (unsigned ix) { gcc_checking_assert (ix < length ()); - T *slot = &m_vecdata[ix]; + T *slot = &address ()[ix]; memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T)); } @@ -1118,7 +1117,7 @@ inline void vec<T, A, vl_embed>::unordered_remove (unsigned ix) { gcc_checking_assert (ix < length ()); - m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num]; + address ()[ix] = address ()[--m_vecpfx.m_num]; } @@ -1130,7 +1129,7 @@ inline void vec<T, A, vl_embed>::block_remove (unsigned ix, unsigned len) { gcc_checking_assert (ix + len <= length ()); - T *slot = &m_vecdata[ix]; + T *slot = &address ()[ix]; m_vecpfx.m_num -= len; memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T)); } @@ -1309,7 +1308,7 @@ vec<T, A, vl_embed>::embedded_size (unsi vec, vec_embedded>::type vec_stdlayout; static_assert (sizeof (vec_stdlayout) == sizeof (vec), ""); static_assert (alignof (vec_stdlayout) == alignof (vec), ""); - return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T); + return sizeof (vec_stdlayout) + alloc * sizeof (T); } @@ -1476,10 +1475,10 @@ public: { return m_vec ? m_vec->length () : 0; } T *address (void) - { return m_vec ? m_vec->m_vecdata : NULL; } + { return m_vec ? m_vec->address () : NULL; } const T *address (void) const - { return m_vec ? m_vec->m_vecdata : NULL; } + { return m_vec ? m_vec->address () : NULL; } T *begin () { return address (); } const T *begin () const { return address (); } @@ -1584,7 +1583,7 @@ public: private: vec<T, va_heap, vl_embed> m_auto; - T m_data[MAX (N - 1, 1)]; + unsigned char m_data[MAX (N - 1, 1)]; }; /* auto_vec is a sub class of vec whose storage is released when it is Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 10:24 ` Jakub Jelinek @ 2023-02-24 10:30 ` Jonathan Wakely 2023-02-24 10:59 ` Jakub Jelinek 0 siblings, 1 reply; 15+ messages in thread From: Jonathan Wakely @ 2023-02-24 10:30 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches On Fri, 24 Feb 2023 at 10:24, Jakub Jelinek <jakub@redhat.com> wrote: > > On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches wrote: > > Maybe this would work, vl_relative even could be vl_embed. > > Because vl_embed I believe is used in two spots, part of > > auto_vec where it is followed by m_data and on heap or GGC > > allocated memory where vec<..., vl_embed> is followed by > > further storage for the vector. > > So roughtly something like below? Except I get weird crashes with it > in the gen* tools. And we'd need to adjust the gdb python hooks > which also use m_vecdata. > > --- gcc/vec.h.jj 2023-01-02 09:32:32.177143804 +0100 > +++ gcc/vec.h 2023-02-24 11:19:37.900157177 +0100 > @@ -586,8 +586,8 @@ public: > unsigned allocated (void) const { return m_vecpfx.m_alloc; } > unsigned length (void) const { return m_vecpfx.m_num; } > bool is_empty (void) const { return m_vecpfx.m_num == 0; } > - T *address (void) { return m_vecdata; } > - const T *address (void) const { return m_vecdata; } > + T *address (void) { return (T *) (this + 1); } > + const T *address (void) const { return (const T *) (this + 1); } > T *begin () { return address (); } > const T *begin () const { return address (); } > T *end () { return address () + length (); } > @@ -629,10 +629,9 @@ public: > friend struct va_gc_atomic; > friend struct va_heap; > > - /* FIXME - These fields should be private, but we need to cater to > + /* FIXME - This field should be private, but we need to cater to > compilers that have stricter notions of PODness for types. */ > - vec_prefix m_vecpfx; > - T m_vecdata[1]; > + alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx; > }; > > > @@ -879,7 +878,7 @@ inline const T & > vec<T, A, vl_embed>::operator[] (unsigned ix) const > { > gcc_checking_assert (ix < m_vecpfx.m_num); > - return m_vecdata[ix]; > + return address ()[ix]; > } > > template<typename T, typename A> > @@ -887,7 +886,7 @@ inline T & > vec<T, A, vl_embed>::operator[] (unsigned ix) > { > gcc_checking_assert (ix < m_vecpfx.m_num); > - return m_vecdata[ix]; > + return address ()[ix]; > } > > > @@ -929,7 +928,7 @@ vec<T, A, vl_embed>::iterate (unsigned i > { > if (ix < m_vecpfx.m_num) > { > - *ptr = m_vecdata[ix]; > + *ptr = address ()[ix]; > return true; > } > else > @@ -955,7 +954,7 @@ vec<T, A, vl_embed>::iterate (unsigned i > { > if (ix < m_vecpfx.m_num) > { > - *ptr = CONST_CAST (T *, &m_vecdata[ix]); > + *ptr = CONST_CAST (T *, &address ()[ix]); > return true; > } > else > @@ -978,7 +977,7 @@ vec<T, A, vl_embed>::copy (ALONE_MEM_STA > { > vec_alloc (new_vec, len PASS_MEM_STAT); > new_vec->embedded_init (len, len); > - vec_copy_construct (new_vec->address (), m_vecdata, len); > + vec_copy_construct (new_vec->address (), address (), len); > } > return new_vec; > } > @@ -1018,7 +1017,7 @@ inline T * > vec<T, A, vl_embed>::quick_push (const T &obj) > { > gcc_checking_assert (space (1)); > - T *slot = &m_vecdata[m_vecpfx.m_num++]; > + T *slot = &address ()[m_vecpfx.m_num++]; > *slot = obj; > return slot; > } > @@ -1031,7 +1030,7 @@ inline T & > vec<T, A, vl_embed>::pop (void) > { > gcc_checking_assert (length () > 0); > - return m_vecdata[--m_vecpfx.m_num]; > + return address ()[--m_vecpfx.m_num]; > } > > > @@ -1056,7 +1055,7 @@ vec<T, A, vl_embed>::quick_insert (unsig > { > gcc_checking_assert (length () < allocated ()); > gcc_checking_assert (ix <= length ()); > - T *slot = &m_vecdata[ix]; > + T *slot = &address ()[ix]; > memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T)); > *slot = obj; > } > @@ -1071,7 +1070,7 @@ inline void > vec<T, A, vl_embed>::ordered_remove (unsigned ix) > { > gcc_checking_assert (ix < length ()); > - T *slot = &m_vecdata[ix]; > + T *slot = &address ()[ix]; > memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T)); > } > > @@ -1118,7 +1117,7 @@ inline void > vec<T, A, vl_embed>::unordered_remove (unsigned ix) > { > gcc_checking_assert (ix < length ()); > - m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num]; > + address ()[ix] = address ()[--m_vecpfx.m_num]; > } > > > @@ -1130,7 +1129,7 @@ inline void > vec<T, A, vl_embed>::block_remove (unsigned ix, unsigned len) > { > gcc_checking_assert (ix + len <= length ()); > - T *slot = &m_vecdata[ix]; > + T *slot = &address ()[ix]; > m_vecpfx.m_num -= len; > memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T)); > } > @@ -1309,7 +1308,7 @@ vec<T, A, vl_embed>::embedded_size (unsi > vec, vec_embedded>::type vec_stdlayout; > static_assert (sizeof (vec_stdlayout) == sizeof (vec), ""); > static_assert (alignof (vec_stdlayout) == alignof (vec), ""); > - return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T); > + return sizeof (vec_stdlayout) + alloc * sizeof (T); > } > > > @@ -1476,10 +1475,10 @@ public: > { return m_vec ? m_vec->length () : 0; } > > T *address (void) > - { return m_vec ? m_vec->m_vecdata : NULL; } > + { return m_vec ? m_vec->address () : NULL; } > > const T *address (void) const > - { return m_vec ? m_vec->m_vecdata : NULL; } > + { return m_vec ? m_vec->address () : NULL; } > > T *begin () { return address (); } > const T *begin () const { return address (); } > @@ -1584,7 +1583,7 @@ public: > > private: > vec<T, va_heap, vl_embed> m_auto; > - T m_data[MAX (N - 1, 1)]; > + unsigned char m_data[MAX (N - 1, 1)]; This needs to be alignas(T) unsigned char m_data[sizeof(T) * N]; The alignas might be redundant (because the member before it is aligned as T) but we don't want the N-1 now that this contains the entire storage, there isn't an element in the m_auto vec before it. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 10:30 ` Jonathan Wakely @ 2023-02-24 10:59 ` Jakub Jelinek 2023-02-24 11:04 ` Jakub Jelinek 0 siblings, 1 reply; 15+ messages in thread From: Jakub Jelinek @ 2023-02-24 10:59 UTC (permalink / raw) To: Jonathan Wakely; +Cc: Richard Biener, gcc-patches On Fri, Feb 24, 2023 at 10:30:00AM +0000, Jonathan Wakely wrote: > On Fri, 24 Feb 2023 at 10:24, Jakub Jelinek <jakub@redhat.com> wrote: > > > > On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches wrote: > > > Maybe this would work, vl_relative even could be vl_embed. > > > Because vl_embed I believe is used in two spots, part of > > > auto_vec where it is followed by m_data and on heap or GGC > > > allocated memory where vec<..., vl_embed> is followed by > > > further storage for the vector. > > > > So roughtly something like below? Except I get weird crashes with it > > in the gen* tools. And we'd need to adjust the gdb python hooks > > which also use m_vecdata. > > > > --- gcc/vec.h.jj 2023-01-02 09:32:32.177143804 +0100 > > +++ gcc/vec.h 2023-02-24 11:19:37.900157177 +0100 > > @@ -586,8 +586,8 @@ public: > > unsigned allocated (void) const { return m_vecpfx.m_alloc; } > > unsigned length (void) const { return m_vecpfx.m_num; } > > bool is_empty (void) const { return m_vecpfx.m_num == 0; } > > - T *address (void) { return m_vecdata; } > > - const T *address (void) const { return m_vecdata; } > > + T *address (void) { return (T *) (this + 1); } > > + const T *address (void) const { return (const T *) (this + 1); } > > T *begin () { return address (); } > > const T *begin () const { return address (); } > > T *end () { return address () + length (); } > > @@ -629,10 +629,9 @@ public: > > friend struct va_gc_atomic; > > friend struct va_heap; > > > > - /* FIXME - These fields should be private, but we need to cater to > > + /* FIXME - This field should be private, but we need to cater to > > compilers that have stricter notions of PODness for types. */ > > - vec_prefix m_vecpfx; > > - T m_vecdata[1]; > > + alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx; > > }; > > > > > > @@ -879,7 +878,7 @@ inline const T & > > vec<T, A, vl_embed>::operator[] (unsigned ix) const > > { > > gcc_checking_assert (ix < m_vecpfx.m_num); > > - return m_vecdata[ix]; > > + return address ()[ix]; > > } > > > > template<typename T, typename A> > > @@ -887,7 +886,7 @@ inline T & > > vec<T, A, vl_embed>::operator[] (unsigned ix) > > { > > gcc_checking_assert (ix < m_vecpfx.m_num); > > - return m_vecdata[ix]; > > + return address ()[ix]; > > } > > > > > > @@ -929,7 +928,7 @@ vec<T, A, vl_embed>::iterate (unsigned i > > { > > if (ix < m_vecpfx.m_num) > > { > > - *ptr = m_vecdata[ix]; > > + *ptr = address ()[ix]; > > return true; > > } > > else > > @@ -955,7 +954,7 @@ vec<T, A, vl_embed>::iterate (unsigned i > > { > > if (ix < m_vecpfx.m_num) > > { > > - *ptr = CONST_CAST (T *, &m_vecdata[ix]); > > + *ptr = CONST_CAST (T *, &address ()[ix]); > > return true; > > } > > else > > @@ -978,7 +977,7 @@ vec<T, A, vl_embed>::copy (ALONE_MEM_STA > > { > > vec_alloc (new_vec, len PASS_MEM_STAT); > > new_vec->embedded_init (len, len); > > - vec_copy_construct (new_vec->address (), m_vecdata, len); > > + vec_copy_construct (new_vec->address (), address (), len); > > } > > return new_vec; > > } > > @@ -1018,7 +1017,7 @@ inline T * > > vec<T, A, vl_embed>::quick_push (const T &obj) > > { > > gcc_checking_assert (space (1)); > > - T *slot = &m_vecdata[m_vecpfx.m_num++]; > > + T *slot = &address ()[m_vecpfx.m_num++]; > > *slot = obj; > > return slot; > > } > > @@ -1031,7 +1030,7 @@ inline T & > > vec<T, A, vl_embed>::pop (void) > > { > > gcc_checking_assert (length () > 0); > > - return m_vecdata[--m_vecpfx.m_num]; > > + return address ()[--m_vecpfx.m_num]; > > } > > > > > > @@ -1056,7 +1055,7 @@ vec<T, A, vl_embed>::quick_insert (unsig > > { > > gcc_checking_assert (length () < allocated ()); > > gcc_checking_assert (ix <= length ()); > > - T *slot = &m_vecdata[ix]; > > + T *slot = &address ()[ix]; > > memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T)); > > *slot = obj; > > } > > @@ -1071,7 +1070,7 @@ inline void > > vec<T, A, vl_embed>::ordered_remove (unsigned ix) > > { > > gcc_checking_assert (ix < length ()); > > - T *slot = &m_vecdata[ix]; > > + T *slot = &address ()[ix]; > > memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T)); > > } > > > > @@ -1118,7 +1117,7 @@ inline void > > vec<T, A, vl_embed>::unordered_remove (unsigned ix) > > { > > gcc_checking_assert (ix < length ()); > > - m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num]; > > + address ()[ix] = address ()[--m_vecpfx.m_num]; > > } > > > > > > @@ -1130,7 +1129,7 @@ inline void > > vec<T, A, vl_embed>::block_remove (unsigned ix, unsigned len) > > { > > gcc_checking_assert (ix + len <= length ()); > > - T *slot = &m_vecdata[ix]; > > + T *slot = &address ()[ix]; > > m_vecpfx.m_num -= len; > > memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T)); > > } > > @@ -1309,7 +1308,7 @@ vec<T, A, vl_embed>::embedded_size (unsi > > vec, vec_embedded>::type vec_stdlayout; > > static_assert (sizeof (vec_stdlayout) == sizeof (vec), ""); > > static_assert (alignof (vec_stdlayout) == alignof (vec), ""); > > - return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T); > > + return sizeof (vec_stdlayout) + alloc * sizeof (T); > > } > > > > > > @@ -1476,10 +1475,10 @@ public: > > { return m_vec ? m_vec->length () : 0; } > > > > T *address (void) > > - { return m_vec ? m_vec->m_vecdata : NULL; } > > + { return m_vec ? m_vec->address () : NULL; } > > > > const T *address (void) const > > - { return m_vec ? m_vec->m_vecdata : NULL; } > > + { return m_vec ? m_vec->address () : NULL; } > > > > T *begin () { return address (); } > > const T *begin () const { return address (); } > > @@ -1584,7 +1583,7 @@ public: > > > > private: > > vec<T, va_heap, vl_embed> m_auto; > > - T m_data[MAX (N - 1, 1)]; > > + unsigned char m_data[MAX (N - 1, 1)]; > > This needs to be alignas(T) unsigned char m_data[sizeof(T) * N]; unsigned char m_data[MAX (N, 2) * sizeof (T)]; if we want to preserve current behavior I think. I've screwed up, when I was about to change this line, I've realized I want to look at the embedded_size stuff first and then forgot to update this. With the above line it builds, though I bet one will need to compare the generated code etc. and test with GCC 4.8. Though I guess we now have also an option to change auto_vec with N 0/1 to have 1 embedded element instead of 2, so that would also mean changing all of MAX (N, 2) to MAX (N, 1) if we want. --- gcc/vec.h.jj 2023-01-02 09:32:32.177143804 +0100 +++ gcc/vec.h 2023-02-24 11:54:49.859564268 +0100 @@ -586,8 +586,8 @@ public: unsigned allocated (void) const { return m_vecpfx.m_alloc; } unsigned length (void) const { return m_vecpfx.m_num; } bool is_empty (void) const { return m_vecpfx.m_num == 0; } - T *address (void) { return m_vecdata; } - const T *address (void) const { return m_vecdata; } + T *address (void) { return (T *) (this + 1); } + const T *address (void) const { return (const T *) (this + 1); } T *begin () { return address (); } const T *begin () const { return address (); } T *end () { return address () + length (); } @@ -629,10 +629,9 @@ public: friend struct va_gc_atomic; friend struct va_heap; - /* FIXME - These fields should be private, but we need to cater to + /* FIXME - This field should be private, but we need to cater to compilers that have stricter notions of PODness for types. */ - vec_prefix m_vecpfx; - T m_vecdata[1]; + alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx; }; @@ -879,7 +878,7 @@ inline const T & vec<T, A, vl_embed>::operator[] (unsigned ix) const { gcc_checking_assert (ix < m_vecpfx.m_num); - return m_vecdata[ix]; + return address ()[ix]; } template<typename T, typename A> @@ -887,7 +886,7 @@ inline T & vec<T, A, vl_embed>::operator[] (unsigned ix) { gcc_checking_assert (ix < m_vecpfx.m_num); - return m_vecdata[ix]; + return address ()[ix]; } @@ -929,7 +928,7 @@ vec<T, A, vl_embed>::iterate (unsigned i { if (ix < m_vecpfx.m_num) { - *ptr = m_vecdata[ix]; + *ptr = address ()[ix]; return true; } else @@ -955,7 +954,7 @@ vec<T, A, vl_embed>::iterate (unsigned i { if (ix < m_vecpfx.m_num) { - *ptr = CONST_CAST (T *, &m_vecdata[ix]); + *ptr = CONST_CAST (T *, &address ()[ix]); return true; } else @@ -978,7 +977,7 @@ vec<T, A, vl_embed>::copy (ALONE_MEM_STA { vec_alloc (new_vec, len PASS_MEM_STAT); new_vec->embedded_init (len, len); - vec_copy_construct (new_vec->address (), m_vecdata, len); + vec_copy_construct (new_vec->address (), address (), len); } return new_vec; } @@ -1018,7 +1017,7 @@ inline T * vec<T, A, vl_embed>::quick_push (const T &obj) { gcc_checking_assert (space (1)); - T *slot = &m_vecdata[m_vecpfx.m_num++]; + T *slot = &address ()[m_vecpfx.m_num++]; *slot = obj; return slot; } @@ -1031,7 +1030,7 @@ inline T & vec<T, A, vl_embed>::pop (void) { gcc_checking_assert (length () > 0); - return m_vecdata[--m_vecpfx.m_num]; + return address ()[--m_vecpfx.m_num]; } @@ -1056,7 +1055,7 @@ vec<T, A, vl_embed>::quick_insert (unsig { gcc_checking_assert (length () < allocated ()); gcc_checking_assert (ix <= length ()); - T *slot = &m_vecdata[ix]; + T *slot = &address ()[ix]; memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T)); *slot = obj; } @@ -1071,7 +1070,7 @@ inline void vec<T, A, vl_embed>::ordered_remove (unsigned ix) { gcc_checking_assert (ix < length ()); - T *slot = &m_vecdata[ix]; + T *slot = &address ()[ix]; memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T)); } @@ -1118,7 +1117,7 @@ inline void vec<T, A, vl_embed>::unordered_remove (unsigned ix) { gcc_checking_assert (ix < length ()); - m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num]; + address ()[ix] = address ()[--m_vecpfx.m_num]; } @@ -1130,7 +1129,7 @@ inline void vec<T, A, vl_embed>::block_remove (unsigned ix, unsigned len) { gcc_checking_assert (ix + len <= length ()); - T *slot = &m_vecdata[ix]; + T *slot = &address ()[ix]; m_vecpfx.m_num -= len; memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T)); } @@ -1309,7 +1308,7 @@ vec<T, A, vl_embed>::embedded_size (unsi vec, vec_embedded>::type vec_stdlayout; static_assert (sizeof (vec_stdlayout) == sizeof (vec), ""); static_assert (alignof (vec_stdlayout) == alignof (vec), ""); - return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T); + return sizeof (vec_stdlayout) + alloc * sizeof (T); } @@ -1476,10 +1475,10 @@ public: { return m_vec ? m_vec->length () : 0; } T *address (void) - { return m_vec ? m_vec->m_vecdata : NULL; } + { return m_vec ? m_vec->address () : NULL; } const T *address (void) const - { return m_vec ? m_vec->m_vecdata : NULL; } + { return m_vec ? m_vec->address () : NULL; } T *begin () { return address (); } const T *begin () const { return address (); } @@ -1584,7 +1583,7 @@ public: private: vec<T, va_heap, vl_embed> m_auto; - T m_data[MAX (N - 1, 1)]; + unsigned char m_data[MAX (N, 2) * sizeof (T)]; }; /* auto_vec is a sub class of vec whose storage is released when it is Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] Avoid default-initializing auto_vec<T, N> storage 2023-02-24 10:59 ` Jakub Jelinek @ 2023-02-24 11:04 ` Jakub Jelinek 0 siblings, 0 replies; 15+ messages in thread From: Jakub Jelinek @ 2023-02-24 11:04 UTC (permalink / raw) To: Jonathan Wakely, Richard Biener, gcc-patches On Fri, Feb 24, 2023 at 11:59:53AM +0100, Jakub Jelinek via Gcc-patches wrote: > > This needs to be alignas(T) unsigned char m_data[sizeof(T) * N]; > > unsigned char m_data[MAX (N, 2) * sizeof (T)]; > > if we want to preserve current behavior I think. > > I've screwed up, when I was about to change this line, I've realized > I want to look at the embedded_size stuff first and then forgot > to update this. With the above line it builds, though I bet one will need > to compare the generated code etc. and test with GCC 4.8. > > Though I guess we now have also an option to change auto_vec with N 0/1 > to have 1 embedded element instead of 2, so that would also mean changing > all of MAX (N, 2) to MAX (N, 1) if we want. It builds then in non-optimized build, but when I try to build it in stage3, the useless middle-end warnings kick in: In member function ‘T* vec<T, A, vl_embed>::quick_push(const T&) [with T = parameter; A = va_heap]’, inlined from ‘T* vec<T>::quick_push(const T&) [with T = parameter]’ at ../../gcc/vec.h:1957:28, inlined from ‘void populate_pattern_routine(create_pattern_info*, merge_state_info*, state*, const vec<parameter, va_heap, vl_ptr>&)’ at ../../gcc/genrecog.cc:3008:29: ../../gcc/vec.h:1021:3: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=] 1021 | *slot = obj; | ^ ../../gcc/vec.h: In function ‘void populate_pattern_routine(create_pattern_info*, merge_state_info*, state*, const vec<parameter, va_heap, vl_ptr>&)’: ../../gcc/vec.h:1585:29: note: at offset 12 into destination object ‘auto_vec<parameter, 5>::m_auto’ of size 8 1585 | vec<T, va_heap, vl_embed> m_auto; | ^~~~~~ In member function ‘T* vec<T, A, vl_embed>::quick_push(const T&) [with T = parameter; A = va_heap]’, inlined from ‘T* vec<T>::quick_push(const T&) [with T = parameter]’ at ../../gcc/vec.h:1957:28, inlined from ‘decision* init_pattern_use(create_pattern_info*, merge_state_info*, const vec<parameter, va_heap, vl_ptr>&)’ at ../../gcc/genrecog.cc:2886:26: ../../gcc/vec.h:1021:3: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=] 1021 | *slot = obj; | ^ ../../gcc/vec.h: In function ‘decision* init_pattern_use(create_pattern_info*, merge_state_info*, const vec<parameter, va_heap, vl_ptr>&)’: ../../gcc/vec.h:1585:29: note: at offset 12 into destination object ‘auto_vec<parameter, 5>::m_auto’ of size 8 1585 | vec<T, va_heap, vl_embed> m_auto; | ^~~~~~ Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-02-24 11:04 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-02-23 12:54 [PATCH] Avoid default-initializing auto_vec<T, N> storage Richard Biener 2023-02-23 13:56 ` Jakub Jelinek 2023-02-23 15:02 ` Richard Biener 2023-02-23 15:20 ` Jakub Jelinek 2023-02-24 9:02 ` Richard Biener 2023-02-24 9:34 ` Richard Biener 2023-02-24 9:48 ` Jakub Jelinek 2023-02-24 9:50 ` Jonathan Wakely 2023-02-24 9:55 ` Jakub Jelinek 2023-02-24 9:55 ` Jonathan Wakely 2023-02-24 10:02 ` Jakub Jelinek 2023-02-24 10:24 ` Jakub Jelinek 2023-02-24 10:30 ` Jonathan Wakely 2023-02-24 10:59 ` Jakub Jelinek 2023-02-24 11:04 ` Jakub Jelinek
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).