public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>, libc-alpha@sourceware.org
Subject: Re: [PATCH 08/18] malloc: Add specialized dynarray for C strings
Date: Thu, 17 Aug 2017 12:39:00 -0000	[thread overview]
Message-ID: <7c1f39c3-4eff-60c5-8380-90719736baca@linaro.org> (raw)
In-Reply-To: <a6d5c2d7-9141-1c7c-ad35-a8bc65e03eaf@redhat.com>



On 17/08/2017 07:12, Florian Weimer wrote:
> On 08/11/2017 04:50 PM, Adhemerval Zanella wrote:
>> This patch adds an specialized dynarray to manage C strings using the
>> dynarray internal implementation.  It uses some private fields from
>> dynarray and thus it provided specific files to access and manage
>> the internal string buffer.
> 
> There is a lot of complexity in this code to maintain the invariant that
> the stored array is NUL-terminated.  It also means that the dynarray
> functions must not be used directly.

Not really, however some won't follow the NUL-terminated semantic (I use
the provided free function for instance).  Unfortunately we do not have
access modifiers as for c++ to avoid issues calling these functions, but
since it is an internal API I think we can live with it. 

> 
> std::string in C++ has a c_str() method which adds the null termination
> on demand.  This means that char_array_str could fail due to memory
> allocation, but I think it would simplify the code in general.  I tried
> to write the dynarray functions in such a way that errors are sticky, so
> one error check (char_array_str returns NULL) could cover all the
> previous manipulations.

That is not my understanding checking on libstd++v3 code:

include/bits/basic_string.h:

2222       const _CharT*
2223       c_str() const _GLIBCXX_NOEXCEPT
2224       { return _M_data(); }

 158       pointer
 159       _M_data() const
 160       { return _M_dataplus._M_p; }

AFAIK current std::string does not add the NUL on demand, but rather keep
the internal buffer NUL-terminated.  Some basic checking with gdb does
hold it true:

(gdb) l
3
4       int main ()
5       {
6         std::string str("test");
7
8         printf ("%s\n", str.c_str());
9         return 0;
10      }
(gdb) p str
$10 = {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7fffffffd6d0 "test"}, 
  _M_string_length = 4, {_M_local_buf = "test", '\000' <repeats 11 times>, _M_allocated_capacity = 1953719668}}
(gdb) x/6c str._M_dataplus._M_p
0x7fffffffd6d0: 116 't' 101 'e' 115 's' 116 't' 0 '\000'        0 '\000'

Also, c++ does not try to use the same API for both containers and string
because they do differ exactly regarding NUL-termination and C-string
handling.  And I think this make sense to provide different ABI to 
different data types and hiding all the required internal handling.

Also, it will still require some char_array wrappers to hiding the C-string
manipulation (append/prepend/initialization) and it will add complications
if char_array_str adds an emplace element at the end. I assume it will 
add the element in place instead of returning a copy (since the idea is 
to actually have inplace data for char_array manipulation), so all other 
functions will need to actually check if the char_array buffer has the 
NULL-byte before manipulating. It only adds complexity and more possible
code patch for internal buffer manipulation.

  reply	other threads:[~2017-08-17 12:39 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-11 14:50 [PATCH v2 00/18] posix: glob fixes and refactor Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 12/18] posix: Use dynarray for globname in glob Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 13/18] posix: Remove all alloca usage " Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 05/18] posix: Rewrite to use struct scratch_buffer instead of extend_alloca Adhemerval Zanella
2017-09-01 23:50   ` Paul Eggert
2017-09-02 10:40   ` Paul Eggert
2017-08-11 14:51 ` [PATCH 08/18] malloc: Add specialized dynarray for C strings Adhemerval Zanella
2017-08-17 10:12   ` Florian Weimer
2017-08-17 12:39     ` Adhemerval Zanella [this message]
2017-08-17 14:48     ` Pedro Alves
2017-08-11 14:51 ` [PATCH 04/18] posix: Allow glob to match dangling symlinks [BZ #866] Adhemerval Zanella
2017-08-31 22:11   ` Paul Eggert
2017-08-11 14:51 ` [PATCH 18/18] posix: Fix glob with GLOB_NOCHECK returning modified patterns (BZ#10246) Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 11/18] posix: Remove alloca usage on glob dirname Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 16/18] posix: More check for overflow allocation in glob Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 03/18] posix: Consolidate glob implementation Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 01/18] posix: Sync glob with gnulib [BZ #1062] Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 15/18] posix: Add common function to get home directory Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 14/18] posix: Use char_array for home_dir in glob Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 07/18] posix: User LOGIN_NAME_MAX for all user names " Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 09/18] posix: Use char_array for internal glob dirname Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 17/18] posix: Use enum for __glob_pattern_type result Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 06/18] posix: Remove glob GET_LOGIN_NAME_MAX usage Adhemerval Zanella
2017-09-02 22:50   ` Paul Eggert
2017-08-11 14:51 ` [PATCH 02/18] posix: Adjust glob tests to libsupport Adhemerval Zanella
2017-08-11 14:51 ` [PATCH 10/18] posix: Remove alloca usage for GLOB_BRACE on glob Adhemerval Zanella
2017-08-17 10:19 ` [PATCH v2 00/18] posix: glob fixes and refactor Florian Weimer
2017-08-17 12:07   ` Adhemerval Zanella
2017-08-17 14:11     ` Paul Eggert
2017-08-17 17:32       ` Adhemerval Zanella
2017-08-17 18:07         ` Florian Weimer
2017-08-17 19:51         ` Paul Eggert
2017-08-17 20:05           ` Adhemerval Zanella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7c1f39c3-4eff-60c5-8380-90719736baca@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).