From: Sergio Durigan Junior <sergiodj@redhat.com>
To: Pedro Alves <palves@redhat.com>
Cc: GDB Patches <gdb-patches@sourceware.org>,
Simon Marchi <simon.marchi@polymtl.ca>
Subject: Re: [PATCH v3] C++ify gdb/common/environ.c
Date: Mon, 01 May 2017 02:22:00 -0000 [thread overview]
Message-ID: <87o9vdjoz4.fsf@redhat.com> (raw)
In-Reply-To: <e8360438-0edc-2020-5622-34d4344981e3@redhat.com> (Pedro Alves's message of "Wed, 19 Apr 2017 19:14:17 +0100")
Hey,
Thanks for the review, and really sorry about the delay. I have a few
comments and questions below.
On Wednesday, April 19 2017, Pedro Alves wrote:
> On 04/18/2017 04:03 AM, Sergio Durigan Junior wrote:
>
>> - if (e->allocated < i)
>> +const char *
>> +gdb_environ::get (const std::string &var) const
>> +{
>> + try
>> {
>> - e->allocated = std::max (i, e->allocated + 10);
>> - e->vector = (char **) xrealloc ((char *) e->vector,
>> - (e->allocated + 1) * sizeof (char *));
>> + return (char *) this->m_environ_map.at (var).c_str ();
>> }
>> -
>> - memcpy (e->vector, environ, (i + 1) * sizeof (char *));
>> -
>> - while (--i >= 0)
>> + catch (const std::out_of_range &ex)
>
> Please use unordered_map::find instead. In general,
> use "at" with exceptions when you're "sure" the element
> exists. It'll be simpler, more efficient, and less
> roundabout, since at() essentially does a find() and then
> throws if not found.
I decided to go the easier way and not use unordered_map. I confess I
was having second thoughts myself when I posted the patch, and your
message only made me more confident that I don't want to follow this
route.
>> {
>> - int len = strlen (e->vector[i]);
>> - char *newobj = (char *) xmalloc (len + 1);
>> -
>> - memcpy (newobj, e->vector[i], len + 1);
>> - e->vector[i] = newobj;
>> + return NULL;
>> }
>> }
>>
>
>> void
>> -set_in_environ (struct gdb_environ *e, const char *var, const char *value)
>> +gdb_environ::set (const std::string &var, const std::string &value)
>> {
>> - int i;
>> - int len = strlen (var);
>> - char **vector = e->vector;
>> - char *s;
>> + std::unordered_map<std::string, std::string>::iterator el;
>>
>> - for (i = 0; (s = vector[i]) != NULL; i++)
>> - if (strncmp (s, var, len) == 0 && s[len] == '=')
>> - break;
>> + el = this->m_environ_map.find (var);
>
> In C++, it's a good practice to avoid first initializing,
> and then assigning separately, because the default constructor
> will then do useless work. (And in some cases, there may not even
> be a default constructor.)
>
> I'm not a fan or "auto" everywhere, but with iterators,
> I find it OK:
>
> auto el = this->m_environ_map.find (var);
Without the unordered_map, this is no longer a problem. But of course,
thanks for the review.
>>
>> - if (s == 0)
>> + if (el != this->m_environ_map.end ())
>> {
>> - if (i == e->allocated)
>> + if (el->second == value)
>> + return;
>> + else
>> {
>> - e->allocated += 10;
>> - vector = (char **) xrealloc ((char *) vector,
>> - (e->allocated + 1) * sizeof (char *));
>> - e->vector = vector;
>> + /* If we found this item, it means that we have to update
>> + its value on the map. */
>> + el->second = value;
>> + /* And we also have to update its value on the
>> + environ_vector. For that, we just delete the item here
>> + and recreate it (with the new value) later. */
>> + unset_in_vector (this->m_environ_vector, var);
>> }
>> - vector[i + 1] = 0;
>> }
>> else
>> - xfree (s);
>> -
>> - s = (char *) xmalloc (len + strlen (value) + 2);
>> - strcpy (s, var);
>> - strcat (s, "=");
>> - strcat (s, value);
>> - vector[i] = s;
>> -
>> - /* This used to handle setting the PATH and GNUTARGET variables
>> - specially. The latter has been replaced by "set gnutarget"
>> - (which has worked since GDB 4.11). The former affects searching
>> - the PATH to find SHELL, and searching the PATH to find the
>> - argument of "symbol-file" or "exec-file". Maybe we should have
>> - some kind of "set exec-path" for that. But in any event, having
>> - "set env" affect anything besides the inferior is a bad idea.
>> - What if we want to change the environment we pass to the program
>> - without afecting GDB's behavior? */
>> -
>> - return;
>> + this->m_environ_map.insert (std::make_pair (var, value));
>> +
>> + this->m_environ_vector.insert (this->m_environ_vector.end () - 1,
>> + xstrdup (std::string (var
>> + + '='
>> + + value).c_str ()));
>
> I don't really understand the "m_environ_vector.end () - 1" here.
This is needed because the last element of m_environ_vector needs to be
always NULL. Therefore, this code is basically inserting the new
variable in the second-to-last position of the vector. I made a comment
on top of it to clarify this part.
Also, there are other places where I need to iterate through the
elements of the vector, and I'm also using the "- 1" in these places.
I'll put comments where applicable.
> Also, that's a lot of unnecessary copying/duping in that last
> statement. Using concat instead of xstrdup + operator+ would
> easily avoid it.
You're right, I've replaced it by a concat and it's simpler now.
> And you could easily avoid having to dup the string into both the
> map and the vector by making the map's value be a const char *
> instead of a std::string:
>
> std::unordered_map<std::string, const char *>
>
> and then make a map's value point directly into the value
> part of the string that is owned by the vector (as in,
> the substring that starts at VALUE in "VAR=VALUE").
That was a good idea.
>> }
>>
>> -/* Remove the setting for variable VAR from environment E. */
>> +/* See common/environ.h. */
>>
>> void
>> -unset_in_environ (struct gdb_environ *e, const char *var)
>> +gdb_environ::unset (const std::string &var)
>> {
>> - int len = strlen (var);
>> - char **vector = e->vector;
>> - char *s;
>> + this->m_environ_map.erase (var);
>> + unset_in_vector (this->m_environ_vector, var);
>> +}
>
>
>> -extern struct gdb_environ *make_environ (void);
>> +class gdb_environ
>> +{
>> +public:
>> + /* Regular constructor and destructor. */
>> + gdb_environ ();
>> + ~gdb_environ ();
>>
>> -extern void free_environ (struct gdb_environ *);
>> + /* Reinitialize the environment stored. This is used when the user
>> + wants to delete all environment variables added by him/her. */
>> + void reinit ();
>
> This should mention that the initial state is copied from the host's
> environ. I think the default ctor could use a similar comment.
> I was going to suggest to call this clear() instead, to go
> with the standard containers' terminology, until I realized what
> "init" really does. Or maybe even find a more explicit name,
> reset_from_host_environ or some such.
Sorry, I guess I wasn't aware of the importance of specifying that the
variables come from the host. Somehow I thought this was implied.
I guess reset_from_host_environ is a good name for the method; my other
option would be "reinit_using_host_environ", but that's longer.
> Perhaps an even clearer approach would be to make the default ctor
> not do a deep copy of the host's "environ", but instead add a
> static factor method that returned such a new gdb_environ, like:
>
> static gdb_environ
> gdb_environ::from_host_environ ()
> {
> // build/return a gdb_environ that wraps the host's environ global.
> }
>
> Not sure. Only experimenting would tell.
Sorry, I'm not sure I understand your suggestion. Please correct me if
I'm wrong.
You're basically saying that the default ctor shouldn't actually do
anything; instead, the class should have this new from_host_environ
method which would be the "official" ctor. The users would then call
from_host_environ directly when they wanted an instance of the class,
and the method would be responsible for initializing the object. Is
that correct?
If yes, I think I fail to see the advantage of this method over having a
normal ctor (aside from explicitly naming the new ctor after the fact
that we're using the host environ to build the object).
... after some reading ...
Oh, I think I see what you're suggesting. Instead of building one
gdb_environ every time someone requests it, we build just one and pass
it along. OK, now it makes sense.
> Speaking of copying the host environ:
>
>> _initialize_mi_cmd_env (void)
>> {
>> - struct gdb_environ *environment;
>> + gdb_environ environment;
>> const char *env;
>>
>> /* We want original execution path to reset to, if desired later.
>> @@ -278,13 +278,10 @@ _initialize_mi_cmd_env (void)
>> current_inferior ()->environment. Also, there's no obvious
>> place where this code can be moved such that it surely run
>> before any code possibly mangles original PATH. */
>> - environment = make_environ ();
>> - init_environ (environment);
>> - env = get_in_environ (environment, path_var_name);
>> + env = environment.get (path_var_name);
>>
>> /* Can be null if path is not set. */
>> if (!env)
>> env = "";
>> orig_path = xstrdup (env);
>> - free_environ (environment);
>> }
>
> This usage of gdb_environ looks like pointless
> wrapping / dupping / freeing to me -- I don't see why we
> need to dup the whole host environment just to get at some
> env variable. Using good old getenv(3) directly should do,
> and end up trimming off a bit of work from gdb's startup.
Absolutely. I didn't pay close attention to this bit; I was just
mechanically converting the code.
> I could see perhaps wanting to avoid / optimize the linear
> walks that getenv must do, if we have many getenv calls in
> gdb, but then that suggests keeping a gdb_environ global that
> is initialized early on and is reused. But that would miss
> any setenv call that changes the environment since that
> gdb_environ is created, so I'd prefer not do that unless
> we find a real need.
I'll make the code use getenv instead.
>> +
>> +private:
>> + /* Helper function that initializes our data structures with the
>> + environment variables. */
>> + void init ();
>> +
>> + /* Helper function that clears our data structures. */
>> + void clear ();
>
> s/function/method/g throughout (ChangeLog too).
Ops, thanks. Fixed.
>> +
>> + /* A map representing the environment variables and their values.
>> + It is easier to store them using a map because of the faster
>> + lookups. */
>
> It's not that it's easier to store them. Not having a map
> would be even easier. Just do a linear walk on the vector,
> just like getenv does. You want to instead say that we're
> optimizing for get operations. (I'm not sure this is the
> right trade off for this class, but I'll go along with it.)
Well, no need to worry about this anymore. The new code doesn't use
unordered_map, as mentioned above.
>> + std::unordered_map<std::string, std::string> m_environ_map;
>> +
>> + /* A vector containing the environment variables. This is useful
>> + for when we need to obtain a 'char **' with all the existing
>> + variables. */
>
> This should say that the entries are in VAR=VALUE form, and what
> is the typical user that needs that format.
I'll expand the comment.
>> + std::vector<char *> m_environ_vector;
>> +};
>>
>> #endif /* defined (ENVIRON_H) */
>
>> else
>> {
>> - char **vector = environ_vector (current_inferior ()->environment);
>> + const std::vector<char *> &vector =
>> + current_inferior ()->environment.get_vector ();
>
> = goes on the next line.
Fixed.
>>
>> - while (*vector)
>> + for (char *elem : vector)
>> {
>> - puts_filtered (*vector++);
>> + puts_filtered (elem);
>> puts_filtered ("\n");
>> }
>> }
>
> I'm not sure it's worth it to expose the std::vector implementation
> detail just for this loop. I don't really understand how this
> can work though, given that the vector is NULL terminated.
I'm not exposing the std::vector anymore. Instead, I'm just using the
regular 'char **' and looping until I find a NULL.
> I agree with Simon -- this is begging for some unit tests. :-)
I'm writing them. It's taking some time because it's the first time I'm
using the unittest framework.
Thanks,
--
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF 31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/
next prev parent reply other threads:[~2017-05-01 2:22 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-13 4:05 [PATCH] " Sergio Durigan Junior
2017-04-15 18:51 ` [PATCH v2] " Sergio Durigan Junior
2017-04-15 21:22 ` Simon Marchi
2017-04-18 2:49 ` Sergio Durigan Junior
2017-04-16 5:09 ` Simon Marchi
2017-04-16 17:32 ` Sergio Durigan Junior
2017-04-18 3:03 ` [PATCH v3] " Sergio Durigan Junior
2017-04-19 4:56 ` Simon Marchi
2017-04-19 16:30 ` Pedro Alves
2017-04-19 18:14 ` Pedro Alves
2017-05-01 2:22 ` Sergio Durigan Junior [this message]
2017-05-04 15:30 ` Pedro Alves
2017-06-14 19:22 ` [PATCH v4] " Sergio Durigan Junior
2017-06-16 15:45 ` Pedro Alves
2017-06-16 18:01 ` Sergio Durigan Junior
2017-06-16 18:23 ` Pedro Alves
2017-06-16 21:59 ` Sergio Durigan Junior
2017-06-16 22:23 ` [PATCH v5] " Sergio Durigan Junior
2017-06-17 8:54 ` Simon Marchi
2017-06-19 4:19 ` Sergio Durigan Junior
2017-06-19 13:40 ` Pedro Alves
2017-06-19 16:19 ` Sergio Durigan Junior
2017-06-19 12:13 ` Pedro Alves
2017-06-20 14:02 ` Pedro Alves
2017-06-19 4:36 ` [PATCH v6] " Sergio Durigan Junior
2017-06-19 4:51 ` Sergio Durigan Junior
2017-06-19 7:18 ` Simon Marchi
2017-06-19 14:26 ` Pedro Alves
2017-06-19 15:30 ` Simon Marchi
2017-06-19 15:44 ` Pedro Alves
2017-06-19 15:47 ` Pedro Alves
2017-06-19 16:26 ` Simon Marchi
2017-06-19 16:55 ` Pedro Alves
2017-06-19 17:59 ` Sergio Durigan Junior
2017-06-19 18:09 ` Pedro Alves
2017-06-19 18:23 ` Sergio Durigan Junior
2017-06-19 18:36 ` Pedro Alves
2017-06-19 18:38 ` Pedro Alves
2017-06-19 14:26 ` Pedro Alves
2017-06-19 16:13 ` Sergio Durigan Junior
2017-06-19 16:38 ` Pedro Alves
2017-06-19 16:46 ` Sergio Durigan Junior
2017-06-19 18:27 ` [PATCH v7] " Sergio Durigan Junior
2017-06-20 3:27 ` [PATCH v8] " Sergio Durigan Junior
2017-06-20 12:13 ` Pedro Alves
2017-06-20 12:46 ` Simon Marchi
2017-06-20 13:00 ` Sergio Durigan Junior
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87o9vdjoz4.fsf@redhat.com \
--to=sergiodj@redhat.com \
--cc=gdb-patches@sourceware.org \
--cc=palves@redhat.com \
--cc=simon.marchi@polymtl.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).