public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Simon Marchi <simon.marchi@polymtl.ca>
To: Lancelot SIX <lsix@lancelotsix.com>
Cc: gdb-patches@sourceware.org, Simon Marchi <simon.marchi@efficios.com>
Subject: Re: [PATCH 5/5] gdbsupport: add path_join function
Date: Mon, 18 Apr 2022 14:43:08 -0400	[thread overview]
Message-ID: <e57129cb-b280-bb81-b317-4f3b0797f064@polymtl.ca> (raw)
In-Reply-To: <20220415143827.t2nlcfhmh2pondev@ubuntu.lan>



On 2022-04-15 10:38, Lancelot SIX wrote:
> Hi,
> 
>> +/* Join COMPONENT and all subsequent null-terminated string arguments into a
>> +   single path.
>> +
>> +   The last argument must be nullptr to denote the end of the argument list
>> +
>> +   Empty arguments or arguments consisting only of directory separators are
>> +   ignored.  */
>> +
>> +extern std::string path_join (const char *component, ...)
>> +  ATTRIBUTE_SENTINEL;
> 
> The last part of the interface seems odd to me.  Mostly because it is
> quite different to the os.path.join API from python.
> 
> Most notable, in python `os.path.join (a, b)` will yield b if b is an
> absolute path:
> 
>     >>> os.path.join("/home/foo", "/")
>     '/'
>     >>> os.path.join("/foo", "/bar")
>     '/bar'
> 
> One rational for this behavior would be to try to mimic using cd on each
> argument in order:
> 
>     def strange_join(*args):
>         cwd = os.getcwd()
> 	try:
> 	    for comp in args:
> 	        os.chdir(comp)
> 	    return os.getcwd()
>         finally:
> 	    os.chdir(cwd)
> 
> Of course this implementation would be wrong for multiple reasons:
> - it must work for dirs which do not exist
> - the ".." remain ".."
> - many other...
> 
> This behaviour regarding abs path is also consistent what I have with
> std::filesystem in c++17 (which we be nice if we could use, but
> unfortunately for the moment we have to stick to c++11)
> 
>     auto p = std::filesystem::path {"/foo"} / "/bar";
>     assert (p == "/bar");
> 
> If you compare to one of the testcase you add, the behavior is
> different:
> 
>     s = ::path_join ("/foo", "/bar", nullptr);
>     SELF_CHECK (s == "/foo/bar");
> 
> Is this different semantics made on purpose?

I indeed noticed how os.path.join handled ("/foo", "/bar").

I went with the behavior in my patch mostly because there are cases in
GDB where we really need to append absolute paths to another path.  For
example, looking up files in sysroots.  If the sysroot is set to
"/the/sysroot/" (with or without the trailing slash) and we want to load
the target library /usr/lib/libfoo.so, it would be nice if

    path_join (sysroot_path, lib_path)

gave

    /the/sysroot/usr/lib/libfoo.so

without a double-slash.  Same when we are looking up files in the
debug-file-directory (/usr/lib/debug).

When changing those implementations that currently use concat(...) or
string_printf("%s/%s"), we would need to be careful: if the right hand
side is an absolute path, then we would have to skip the leading
directory separator before passing it to path_join, to get the desired
result.  And then we're back at having some special path handling
details everywhere.

So, IMO, the behavior I implemented makes more sense for us.  I just
don't know what it would mean on Windows.  What would this give?

  path_join ("C:/hello", "D:/hi", nullptr);

> Note that there is another part of the behavior which is different in
> your proposal v.s.  python and std::filesystem.  Appending an empty
> component at the end ensures that the path terminates with a path
> separator:
> 
>     >>> os.path.join("foo", "")
>     "foo/"
> 
> If I use your patch, I see:
> 
>     s = ::path_join ("foo", "", nullptr);
>     SELF_CHECK (s == "foo");

I also noticed that.  In my opinion, adding that last slash is not
really useful.  For consistency, we should always present directories
the same way, either with or without a trailing slash.  Assuming we have
this code:

  s = path_join ("/base/dir", either_a_subdirectory_or_empty, nullptr);

Then if either_a_subdirectory_or_empty is empty, I'd like the result to
be /base/dir, exactly as if we had done:

  if (!either_a_subdirectory_or_empty.empty ())
    s = path_join ("/base/dir", either_a_subdirectory_or_empty, nullptr);
  else
    s = "/base/dir";

So again, the reason for my choice is just that it likely leads to
simpler code at call sites, in our probable use cases.

> Using similar semantics to std::filesystem might allow us to replace our
> implementation with the c++ standard library when we switch to c++17.

I understand the idea, and considered it.  The trailing slash one isn't
very important, but I think the std::filesystem behavior with the
absolute path would be annoying.

> On a different note, I am not sure I am a huge fan of the last nullptr
> argument which is required.
> 
> You can achieve a similar interface except for this nullptr using light
> variadic templates (not too much, should be easy enough to maintain).
> Is this something you considered?
> 
> One implementation could be:
> 
>     /* The real worker can be compiled in a .cc file if you wish.  */
>     extern std::string path_join_1 (std::string a, std::string b);
> 
>     template <typename ...Args)
>     std::string
>     path_join (std::string a, std::string b, Args... comps)
>     {
>       return path_join (path_join (a, b), comps...);
>     }
> 
>     template <>
>     std::string
>     path_join (std::string a, std::string b)
>     {
>       return path_join_1 (a, b);
>     }
> 
> I do not pretend this is optimal (it is not).  In this POC I pass all
> arguments as std::string and by value, this is probably not what you
> want to do in a production grade implementation, but this just gives the
> overall idea.
> 
> What do you think?

If you can make it work using gdb::string_view or const char *, I think
it would be ok.  I'll give it a try.  But otherwise, I am personally
fine with the sentinel nullptr, given that the compiler gives you a
warning if you forget it.

Simon

  parent reply	other threads:[~2022-04-18 18:44 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-14 20:01 [PATCH 1/5] gdb: call gdb_tilde_expand instead of gdb_tilde_expand_up in source_script_with_search Simon Marchi
2022-04-14 20:01 ` [PATCH 2/5] gdbsupport: make gdb_abspath return an std::string Simon Marchi
2022-04-18 19:41   ` Tom Tromey
2022-04-18 20:09     ` Pedro Alves
2022-04-18 20:11     ` Simon Marchi
2022-04-14 20:01 ` [PATCH 3/5] gdbsupport: make gdb_realpath_keepfile " Simon Marchi
2022-04-18 19:44   ` Tom Tromey
2022-04-14 20:01 ` [PATCH 4/5] gdb: use gdb_tilde_expand instead of gdb_tilde_expand_up in source_script_with_search Simon Marchi
2022-04-18 19:44   ` Tom Tromey
2022-04-18 20:12     ` Simon Marchi
2022-04-14 20:01 ` [PATCH 5/5] gdbsupport: add path_join function Simon Marchi
2022-04-15  5:59   ` Eli Zaretskii
2022-04-18 18:11     ` Simon Marchi
2022-04-18 18:30       ` Eli Zaretskii
2022-04-18 19:24       ` Pedro Alves
2022-04-15 14:38   ` Lancelot SIX
2022-04-15 16:55     ` Lancelot SIX
2022-04-18 18:43     ` Simon Marchi [this message]
2022-04-18 19:09       ` Pedro Alves
2022-04-18 19:12         ` Simon Marchi
2022-04-18 20:55           ` Simon Marchi
2022-04-18 21:07             ` Pedro Alves
2022-04-19  0:19               ` Simon Marchi
2022-04-18 19:22       ` Pedro Alves
2022-04-18 20:01       ` Tom Tromey
2022-04-18 23:11       ` Lancelot SIX
2022-04-20  0:22         ` Simon Marchi
2022-04-18 19:36 ` [PATCH 1/5] gdb: call gdb_tilde_expand instead of gdb_tilde_expand_up in source_script_with_search Tom Tromey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e57129cb-b280-bb81-b317-4f3b0797f064@polymtl.ca \
    --to=simon.marchi@polymtl.ca \
    --cc=gdb-patches@sourceware.org \
    --cc=lsix@lancelotsix.com \
    --cc=simon.marchi@efficios.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).