Message-ID: <08779a8a-1dcb-495c-6d1c-5c629141a606@linaro.org>
Date: Thu, 15 Dec 2022 15:40:47 -0300
Subject: Re: [PATCH v5 08/11] libio: Convert __vasprintf_internal to buffers
To: Florian Weimer, libc-alpha@sourceware.org
References: <214908c0def333495c359630d7822e7ccb6b8154.1670858473.git.fweimer@redhat.com>
From: Adhemerval Zanella Netto
Organization: Linaro
In-Reply-To: <214908c0def333495c359630d7822e7ccb6b8154.1670858473.git.fweimer@redhat.com>

On 12/12/22 12:23, Florian Weimer via Libc-alpha wrote:
> The buffer resizing algorithm is slightly different.  The initial
> buffer is on the stack, and small buffers are directly allocated
> on the heap using the exact required size.
> The overhead of the
> additional copy is compensated by the lowered setup cost for buffers
> compared to libio streams.

Patch looks good, some comments below.

> ---
>  include/printf_buffer.h            |   4 +
>  libio/vasprintf.c                  | 141 ++++++++++++++++++++---------
>  stdio-common/printf_buffer_flush.c |   4 +
>  3 files changed, 104 insertions(+), 45 deletions(-)
>
> diff --git a/include/printf_buffer.h b/include/printf_buffer.h
> index 92393399fb..47e252b52e 100644
> --- a/include/printf_buffer.h
> +++ b/include/printf_buffer.h
> @@ -48,6 +48,7 @@ enum __printf_buffer_mode
>    __printf_buffer_mode_snprintf,
>    __printf_buffer_mode_sprintf_chk,
>    __printf_buffer_mode_to_file,
> +  __printf_buffer_mode_asprintf,
>    __printf_buffer_mode_strfmon,
>    __printf_buffer_mode_fp, /* For __printf_fp_l_buffer.  */
>    __printf_buffer_mode_fp_to_wide, /* For __wprintf_fp_l_buffer.  */
> @@ -301,6 +302,9 @@ void __printf_buffer_flush_snprintf (struct __printf_buffer_snprintf *)
>  struct __printf_buffer_to_file;
>  void __printf_buffer_flush_to_file (struct __printf_buffer_to_file *)
>    attribute_hidden;
> +struct __printf_buffer_asprintf;
> +void __printf_buffer_flush_asprintf (struct __printf_buffer_asprintf *)
> +  attribute_hidden;
>  struct __printf_buffer_fp;
>  void __printf_buffer_flush_fp (struct __printf_buffer_fp *)
>    attribute_hidden;
> diff --git a/libio/vasprintf.c b/libio/vasprintf.c
> index 4430a266c6..f4ff7ec98b 100644
> --- a/libio/vasprintf.c
> +++ b/libio/vasprintf.c
> @@ -24,64 +24,115 @@
>     This exception applies to code released by its copyright holders
>     in files containing the exception.  */
>
> -#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
>  #include
> -#include
> +#include
> +#include
> +
> +struct __printf_buffer_asprintf
> +{
> +  /* base.write_base points either to a heap-allocated buffer, or to
> +     the direct array below.  */
> +  struct __printf_buffer base;
> +
> +  /* Initial allocation.  200 should be large enough to copy almost
> +     all asprintf usages with just a single (final, correctly sized)
> +     heap allocation.  */
> +  char direct[200];
> +};

There are a couple or more buffer sizes scattered across this patchset;
maybe it would be better to consolidate them in one place so they can be
easily tuned.  For instance, since this one is used with memcpy below,
maybe it would be better to use a size that the compiler can inline (it
would be arch-dependent, but picking a size that most common
architectures inline would be a net gain).

> +
> +void
> +__printf_buffer_flush_asprintf (struct __printf_buffer_asprintf *buf)
> +{
> +  size_t current_pos = buf->base.write_ptr - buf->base.write_base;
> +  if (current_pos >= INT_MAX)

Shouldn't it be SSIZE_MAX?

> +    {
> +      /* The result is not representable.  No need to continue.  */
> +      __set_errno (EOVERFLOW);
> +      __printf_buffer_mark_failed (&buf->base);
> +      return;
> +    }
> +
> +  size_t current_size = buf->base.write_end - buf->base.write_base;
> +  /* Implement an exponentiatial sizing policy.  Keep the size

s/exponentiatial/exponential

> +     congruent 8 (mod 16), to account for the footer in glibc
> +     malloc.  */
> +  size_t new_size = ((current_size + current_size / 2) & -15) | 8;

Maybe use ALIGN_UP ((current_size + current_size / 2), 16) here?

> +  char *new_buffer;
> +  if (buf->base.write_base == buf->direct)
> +    {
> +      new_buffer = malloc (new_size);
> +      if (new_buffer == NULL)
> +        {
> +          __printf_buffer_mark_failed (&buf->base);
> +          return;
> +        }
> +      memcpy (new_buffer, buf->direct, current_pos);
> +    }
> +  else
> +    {
> +      new_buffer = realloc (buf->base.write_base, new_size);
> +      if (new_buffer == NULL)
> +        {
> +          __printf_buffer_mark_failed (&buf->base);
> +          return;
> +        }
> +    }
> +
> +  /* Set up the new write area.  */
> +  buf->base.write_base = new_buffer;
> +  buf->base.write_ptr = new_buffer + current_pos;
> +  buf->base.write_end = new_buffer + new_size;
> +}
> +
>

Ok.
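
To make the sizing comment above concrete, here is a minimal standalone
sketch of the two growth steps.  The helper names (align_up,
grow_as_posted, grow_aligned) are mine and purely illustrative; they are
not glibc functions, and align_up only approximates what I assume
ALIGN_UP does for a power-of-two alignment.  grow_as_posted mirrors the
expression in the patch; grow_aligned rounds up to 16 first and then
sets bit 3, which is one possible reading of the suggestion.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (assumption: behaves like an align-up macro for
   power-of-two alignments).  Rounds VALUE up to a multiple of ALIGN.  */
static size_t
align_up (size_t value, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Growth step as written in the quoted patch.  The mask -15 clears
   bits 1..3 only, so bit 0 of the grown size is preserved.  */
static size_t
grow_as_posted (size_t current_size)
{
  return ((current_size + current_size / 2) & -15) | 8;
}

/* Alternative: grow by 1.5x, round up to a multiple of 16, then set
   bit 3 so the result is always congruent 8 (mod 16).  */
static size_t
grow_aligned (size_t current_size)
{
  return align_up (current_size + current_size / 2, 16) | 8;
}
```

Starting from the 200-byte direct buffer both variants stay congruent
8 (mod 16): the posted expression gives 296, then 440, while the aligned
one gives 312.  As far as I can tell they only disagree on the residue
when the grown size is odd, which should not happen for sizes that are
already congruent 8 (mod 16).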
>  int
>  __vasprintf_internal (char **result_ptr, const char *format, va_list args,
>                        unsigned int mode_flags)
>  {
> -  /* Initial size of the buffer to be used.  Will be doubled each time an
> -     overflow occurs.  */
> -  const size_t init_string_size = 100;
> -  char *string;
> -  _IO_strfile sf;
> -  int ret;
> -  size_t needed;
> -  size_t allocated;
> -  /* No need to clear the memory here (unlike for open_memstream) since
> -     we know we will never seek on the stream.  */
> -  string = (char *) malloc (init_string_size);
> -  if (string == NULL)
> -    return -1;
> -#ifdef _IO_MTSAFE_IO
> -  sf._sbf._f._lock = NULL;
> -#endif
> -  _IO_no_init (&sf._sbf._f, _IO_USER_LOCK, -1, NULL, NULL);
> -  _IO_JUMPS (&sf._sbf) = &_IO_str_jumps;
> -  _IO_str_init_static_internal (&sf, string, init_string_size, string);
> -  sf._sbf._f._flags &= ~_IO_USER_BUF;
> -  sf._s._allocate_buffer_unused = (_IO_alloc_type) malloc;
> -  sf._s._free_buffer_unused = (_IO_free_type) free;
> -  ret = __vfprintf_internal (&sf._sbf._f, format, args, mode_flags);
> -  if (ret < 0)
> +  struct __printf_buffer_asprintf buf;
> +  __printf_buffer_init (&buf.base, buf.direct, array_length (buf.direct),
> +                        __printf_buffer_mode_asprintf);
> +
> +  __printf_buffer (&buf.base, format, args, mode_flags);
> +  int done = __printf_buffer_done (&buf.base);
> +  if (done < 0)
>      {
> -      free (sf._sbf._f._IO_buf_base);
> -      return ret;
> +      if (buf.base.write_base != buf.direct)
> +        free (buf.base.write_base);
> +      return done;
> +    }
> +
> +  /* Transfer to the final buffer.  */
> +  char *result;
> +  size_t size = buf.base.write_ptr - buf.base.write_base;
> +  if (buf.base.write_base == buf.direct)
> +    {
> +      result = malloc (size + 1);
> +      if (result == NULL)
> +        return -1;
> +      memcpy (result, buf.direct, size);
>      }
> -  /* Only use realloc if the size we need is of the same (binary)
> -     order of magnitude then the memory we allocated.  */
> -  needed = sf._sbf._f._IO_write_ptr - sf._sbf._f._IO_write_base + 1;
> -  allocated = sf._sbf._f._IO_write_end - sf._sbf._f._IO_write_base;
> -  if ((allocated >> 1) <= needed)
> -    *result_ptr = (char *) realloc (sf._sbf._f._IO_buf_base, needed);
>    else
>      {
> -      *result_ptr = (char *) malloc (needed);
> -      if (*result_ptr != NULL)
> +      result = realloc (buf.base.write_base, size + 1);
> +      if (result == NULL)
>          {
> -          memcpy (*result_ptr, sf._sbf._f._IO_buf_base, needed - 1);
> -          free (sf._sbf._f._IO_buf_base);
> +          free (buf.base.write_base);
> +          return -1;
>          }
> -      else
> -        /* We have no choice, use the buffer we already have.  */
> -        *result_ptr = (char *) realloc (sf._sbf._f._IO_buf_base, needed);
>      }
> -  if (*result_ptr == NULL)
> -    *result_ptr = sf._sbf._f._IO_buf_base;
> -  (*result_ptr)[needed - 1] = '\0';
> -  return ret;
> +
> +  /* Add NUL termination.  */
> +  result[size] = '\0';
> +  *result_ptr = result;
> +
> +  return done;
>  }
>
>  int

Ok.

> diff --git a/stdio-common/printf_buffer_flush.c b/stdio-common/printf_buffer_flush.c
> index f44c8b887f..14fe1b2df4 100644
> --- a/stdio-common/printf_buffer_flush.c
> +++ b/stdio-common/printf_buffer_flush.c
> @@ -27,6 +27,7 @@
>  #ifndef SHARED
>  # pragma weak __printf_buffer_flush_snprintf
>  # pragma weak __printf_buffer_flush_to_file
> +# pragma weak __printf_buffer_flush_asprintf
>  # pragma weak __printf_buffer_flush_fp
>  # pragma weak __printf_buffer_flush_fp_to_wide
>  # pragma weak __printf_buffer_flush_fphex_to_wide
> @@ -49,6 +50,9 @@ __printf_buffer_do_flush (struct __printf_buffer *buf)
>      case __printf_buffer_mode_to_file:
>        __printf_buffer_flush_to_file ((struct __printf_buffer_to_file *) buf);
>        return;
> +    case __printf_buffer_mode_asprintf:
> +      __printf_buffer_flush_asprintf ((struct __printf_buffer_asprintf *) buf);
> +      return;
>      case __printf_buffer_mode_strfmon:
>        __set_errno (E2BIG);
>        __printf_buffer_mark_failed (buf);
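
For readers following along outside the glibc tree, the "stack buffer
first, then one exactly sized heap allocation" idea from the commit
message can be approximated with standard C only.  This is a rough
sketch under my own assumptions, not the patch's implementation:
sketch_vasprintf and sketch_asprintf are hypothetical names, and where
the patch grows the buffer through the flush callback, this version
simply formats a second time with vsnprintf when the output does not fit
into the 200-byte array.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the stack-first strategy.  Returns the
   number of characters written (excluding the NUL), or a negative
   value on failure.  */
static int
sketch_vasprintf (char **result_ptr, const char *format, va_list args)
{
  assert (result_ptr != NULL && format != NULL);

  /* Initial buffer on the stack, same size as in the quoted patch.  */
  char direct[200];

  /* Format into the stack buffer using a copy of the argument list,
     so ARGS is still usable if a second pass is needed.  */
  va_list args_copy;
  va_copy (args_copy, args);
  int done = vsnprintf (direct, sizeof direct, format, args_copy);
  va_end (args_copy);
  if (done < 0)
    return done;

  /* Single heap allocation with the exact required size.  */
  size_t size = (size_t) done;
  char *result = malloc (size + 1);
  if (result == NULL)
    return -1;

  if (size < sizeof direct)
    /* Common case: everything fit, copy it out including the NUL.  */
    memcpy (result, direct, size + 1);
  else
    /* Output was truncated: format again into the heap block.  The
       patch avoids this second pass by growing the buffer instead.  */
    vsnprintf (result, size + 1, format, args);

  *result_ptr = result;
  return done;
}

/* Variadic convenience wrapper around sketch_vasprintf.  */
static int
sketch_asprintf (char **result_ptr, const char *format, ...)
{
  va_list args;
  va_start (args, format);
  int done = sketch_vasprintf (result_ptr, format, args);
  va_end (args);
  return done;
}
```

A call such as sketch_asprintf (&s, "%s-%d", "abc", 42) takes the short
path (one malloc, one memcpy), and the caller frees s afterwards, as
with asprintf.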