On 05/20/2017 04:01 AM, Carlos O'Donell wrote: > Why would someone choose to use dynarray instead of a scratch buffer? Scratch buffers hard-coded to 1000-something bytes of on-stack storage. This can be a problem if you need several (usually small) arrays. The on-stack allocation is configurable for dynamic arrays. Scratch buffers are geared towards untyped byte sequences, not arrays of a single type. Scratch buffers are generally not expected to preserve their contents when resized. Some existing scratch buffer users probably should switch to dynarray once we have that functionality. scratch_buffer_set_array_size is a bit of a hack because I didn't want to create the full array functionality at that time. I expect that this function can eventually go away (and so should scratch_buffer_grow_preserve), so that scratch buffers are strictly for NSS-style buffers. At that point, we can use a more aggressive resizing policy (because retries are so expensive). > Can dynarray guarantee it is only heap allocated to make it easier to use > in certain stack-limited cases? While scratch-buffers are designed to be stack > allocated with heap fallback? By default, at least two array elements are stack-allocated, and up to 128 bytes of stack space are used. This can be configured through DYNARRAY_INITIAL_SIZE. > The tests while thorough need much more comments explaining the purpose of each > test and the iterations being done by the tests. For example for each sequence > of nested for loops we need an outer comment to describe exactly what is being > tested and why that is important. I added more comments in the attached patch. > The initial dynarray size makes the API feel like a variant of scratch buffer. > Why have an initial inline size that can be allocated on the stack? I'd like to > avoid stack allocations for dynarray's and make them completely heap-based. > Does that make sense from a design perspective? It would make them an easy > choice when you want to avoid stack based allocations. Most arrays are small. A small stack-allocated array will allow us to create the final array with the correct size for such small arrays. We already need three words for the dynarray header. I could reduce the default stack allocation size to something in the same ballpark (say five words) if on-stack size is a concern. > The dynamic array growth factor appears to be a value of 2, and that doesn't > match best practice nor academic literature. I would be happier with a growth > factor closer to 1.5. I used Paul's suggestion of (1.5 * N + 1). I could not find any relevant discussion of growth factors. > Review of low level details: > > tst-dynarray-fail - Takes on average 40s on an "average" machine. > The timeout needs to be bumped up to 50s. I see ~93% of the > time in _dl_addr because the free hook uses that to > lookup information and that's very expensive. I set a timeout of 90 seconds. For me, it completes much, much faster for some reason. > malloc/dynarray_at_failure.c - Must use __snprintf to avoid PLT. Has slight > formatting issue. Fixed. >> +#ifdef DYNARRAY_INITIAL_SIZE >> +# if DYNARRAY_INITIAL_SIZE < 0 >> +# error "DYNARRAY_INITIAL_SIZE must be non-negative" >> +# endif >> +# if DYNARRAY_INITIAL_SIZE > 0 >> +# define DYNARRAY_HAVE_SCRATCH 1 >> +# else >> +# define DYNARRAY_HAVE_SCRATCH 0 >> +# endif >> +#else >> +/* Provide a reasonable default which limits the size of >> + DYNARRAY_STRUCT. */ >> +# define DYNARRAY_INITIAL_SIZE \ >> + (sizeof (DYNARRAY_ELEMENT) > 64 ? 2 : 128 / sizeof (DYNARRAY_ELEMENT)) >> +# define DYNARRAY_HAVE_SCRATCH 1 > > Not OK. > > I dont like defaults for this interface. I think the user should > explicitly request a size or this should fail. It avoids macro API typos > which intend to define DYNARRAY_INITIAL_SIZE but don't, and then a default > applies. > > Why have a default at all though? Convenience. This has to be a preprocessor constant, so that we can make the presence of the scratch array inside the dynarray struct conditional on the initial size value. I do think we need some defaults here because without them, new features we add will eventually need patching all over the place. This would be very hostile to backporting in particular. > Can we make the choice to have dynarray always heap allocated? #define DYNARRAY_INITIAL_SIZE 0 does that. I clarified that in the file comment. > It would make this simpler, you just never have an initial scratch. > > If you need scratch space use growable scratch buffers? See above why scratch buffers can be a poor fit. >> +/* Remove the last element of LIST if it is present. */ >> +__attribute__ ((unused)) >> +static void >> +DYNARRAY_NAME (remove_last) (struct DYNARRAY_STRUCT *list) >> +{ >> + /* We deliberately skip error checking here. */ > > Why do you deliberatly skip error checking? I updated the comment. >> +/* Remove all elements from the list. The elements are freed, but the >> + list itself is not. */ >> +__attribute__ ((unused)) >> +static void >> +DYNARRAY_NAME (clear) (struct DYNARRAY_STRUCT *list) >> +{ >> + /* We deliberately skip error checking here. */ > > Why? Likewise. >> +#ifdef DYNARRAY_FINAL_TYPE >> +/* Transfer the dynamic array to a permanent location at *RESULT. >> + Returns true on success on false on allocation failure. In either >> + case, *LIST is re-initialized and can be reused. A NULL pointer is >> + stored in *RESULT if LIST refers to an empty list. On success, the >> + pointer in *RESULT is heap-allocated and must be deallocated using >> + free. */ >> +__attribute__ ((unused, warn_unused_result)) >> +static bool >> +DYNARRAY_NAME (finalize) (struct DYNARRAY_STRUCT *list, >> + DYNARRAY_FINAL_TYPE *result) >> +{ >> + struct dynarray_finalize_result res; >> + if (__libc_dynarray_finalize (&list->dynarray_abstract, >> + DYNARRAY_SCRATCH (list), >> + sizeof (DYNARRAY_ELEMENT), &res)) >> + { >> + /* On success, the result owns all the data. */ >> + DYNARRAY_NAME (init) (list); >> + *result = (DYNARRAY_FINAL_TYPE) { res.array, res.length }; > > Can we do a named structure member assignment here just for clarity and safety? This would force specific names for this user-supplied structure. I did not want to do that. >> + A minimal example which provides a growing list of integers can be >> + defined like this: >> + >> + struct int_array >> + { >> + int *array; >> + size_t length; >> + }; > > Please enhance this example to explain tha the layout of the final > type has to have ceratain members in a particular order. I added a comment. >> + /* Double the allocated size. */ >> + { >> + new_allocated = 2 * list->allocated; > > Not OK. Please review academic literature on growth factor values. See above. >> + size_t allocation_size = used * element_size; >> + void *heap_array = malloc (allocation_size); >> + if (heap_array != NULL) >> + { >> + /* The new array takes ownership of the strings. */ >> + if (list->array != NULL) >> + memcpy (heap_array, list->array, allocation_size); >> + if (list->array != scratch) >> + free (list->array); >> + *result = (struct dynarray_finalize_result) { heap_array, used }; > > Can we use named structure member assignment? Fixed. >> +static void >> +check_stream (const char *what, const struct xmemstream *stream, >> + const char *expected) >> +{ >> + if (strcmp (stream->buffer, expected) != 0) >> + { >> + support_record_failure (); >> + printf ("error: captured %s data incorrect\n" >> + " expected: %s\n" >> + " actual: %s\n", >> + what, expected, stream->buffer); >> + } >> + if (stream->length != strlen (expected)) >> + { >> + support_record_failure (); >> + printf ("error: captured %s data length incorrect\n" >> + " expected: %zu\n" >> + " actual: %zu\n", >> + what, strlen (expected), stream->length); >> + } >> +} >> + >> +static int >> +do_test (void) >> +{ >> + const int lengths[] = {0, 1, 17, 512, 20000, -1}; >> + >> + for (int i = 0; lengths[i] >= 0; ++i) >> + for (int j = 0; lengths[j] >= 0; ++j) >> + for (int write_mode = 0; write_mode < write_mode_last; ++write_mode) >> + for (int signal = 0; signal < 2; ++signal) >> + for (int status = 0; status < 2; ++status) > > This needs detailed comments. Fixed. The attached patch adds a new add function for arrays, which is convenient to construct arrays with simple elements (such as NUL-terminated strings). For arrays which have sentinels when completed (such as NUL-terminated strings and argv lists), the existing finalize interface with the type definition is overkill, so I added a default implementation which directly returns the heap-allocated pointer. In addition, I added a few nonnull attributes and told the compiler to inline the fast path for emplace (and the new add function). Thanks, Florian