public inbox for cygwin@cygwin.com
* Formatting command line arguments when starting a Cygwin process from a native process
@ 2016-05-05 15:24 David Allsopp
  2016-05-05 16:47 ` Erik Soderquist
  0 siblings, 1 reply; 16+ messages in thread
From: David Allsopp @ 2016-05-05 15:24 UTC (permalink / raw)
  To: cygwin

I am trying to work out the precise details for character escaping when
starting a Cygwin process from a native (i.e. non-Cygwin) Windows process.

I have an array of command line arguments which I want passed verbatim to
the process, as though it were invoked using execv, with no globbing to take
place. I therefore disable globbing by including the noglob option in the
CYGWIN environment variable. My reading of winsup/cygwin/dcrt0.cc suggests
that I should convert argv to a single string to pass to the Windows
CreateProcess API call by protecting any whitespace characters (\t, \r, \n
and space itself) with double quotes. Then the escaped individual argv items
can be concatenated together with a space between each one.

For example:

  argv[0] = "foo"
  argv[1] = "bar baz"

then the resulting command line string should be:

  lpCommandLine = "foo bar\" \"baz"

and if I've interpreted build_argv and quoted correctly in dcrt0.cc, then as
long as allow_glob is 0 (which it is, via the noglob option in the CYGWIN
environment variable) then the Cygwin DLL will correctly reconstruct argv
based on that string returned by the Windows GetCommandLineW call made in
dll_crt0_1.
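
As a rough illustration of that rule, here is a minimal C sketch for a
single argv element. quote_ws is a hypothetical helper name, and the rule
itself is only my reading of dcrt0.cc with noglob set, so treat it as an
assumption rather than documented Cygwin behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy arg into out, wrapping each whitespace character (\t, \r, \n and
   space) in double quotes.  Returns 0 on success, -1 if out is too small.
   Hypothetical helper: it cannot encode a literal double quote. */
static int quote_ws(const char *arg, char *out, size_t cap)
{
    size_t n = 0;

    assert(arg != NULL && out != NULL);
    if (cap == 0)
        return -1;
    for (; *arg; arg++) {
        int ws = (*arg == ' ' || *arg == '\t' || *arg == '\r' || *arg == '\n');
        size_t need = ws ? 3 : 1;

        if (n + need + 1 > cap)      /* keep room for the trailing NUL */
            return -1;
        if (ws)
            out[n++] = '"';
        out[n++] = *arg;
        if (ws)
            out[n++] = '"';
    }
    out[n] = '\0';
    return 0;
}
```

Applying it to each element and joining the results with single spaces
gives the lpCommandLine shown in the example above.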

However, it appears that the single quote character may only be used to
quote strings if globbing is enabled (dcrt0.cc line 321) so how should one
encode the following argv?

  argv[0] = "foo"
  argv[1] = "bar \"baz\""

There doesn't seem to be anything along the lines of the trickery in the
Windows API's CommandLineToArgvW function if globbing is turned off?

Thanks for any pointers to the correct solution!


David


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-05 15:24 Formatting command line arguments when starting a Cygwin process from a native process David Allsopp
@ 2016-05-05 16:47 ` Erik Soderquist
  2016-05-06  8:03   ` David Allsopp
  0 siblings, 1 reply; 16+ messages in thread
From: Erik Soderquist @ 2016-05-05 16:47 UTC (permalink / raw)
  To: cygwin

On Thu, May 5, 2016 at 11:24 AM, David Allsopp wrote:
>
> I am trying to work out the precise details for character escaping when
> starting a Cygwin process from a native (i.e. non-Cygwin) Windows process.
<snip>
> For example:
>
>   argv[0] = "foo"
>   argv[1] = "bar baz"
>
> then the resulting command line string should be:
>
>   lpCommandLine = "foo bar\" \"baz"

If I recall correctly, Windows cmd.exe uses the caret (^) as its
general escape character, so

C:\cygwin64\bin>.\echo.exe -e ^"hello\nworld^"
hello
world

works.

However, I've found Windows's interpretation to be inconsistent, so
often have to play with it to find what the "right combination" is for
a particular instance.

I find echoing the parameters to a temporary text file and then using
the file as input to be more reliable and easier to troubleshoot, and
it breaks apart whether it is Windows cli inconsistencies or receiving
program issues very nicely with the text file content as an
intermediary

-- Erik


* RE: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-05 16:47 ` Erik Soderquist
@ 2016-05-06  8:03   ` David Allsopp
  2016-05-06 13:17     ` Erik Soderquist
  2016-05-06 14:35     ` Andrey Repin
  0 siblings, 2 replies; 16+ messages in thread
From: David Allsopp @ 2016-05-06  8:03 UTC (permalink / raw)
  To: cygwin

[With apologies if threading is broken; I erroneously thought as the list was not subscriber-only that replies would use reply-all and so wasn't subscribed]

On Thu, May 5, 2016 at 06:47 PM, Erik Soderquist wrote:
> On Thu, May 5, 2016 at 11:24 AM, David Allsopp wrote:
> >
> > I am trying to work out the precise details for character escaping
> > when starting a Cygwin process from a native (i.e. non-Cygwin) Windows
> > process.
> <snip>
> > For example:
> >
> >   argv[0] = "foo"
> >   argv[1] = "bar baz"
> >
> > then the resulting command line string should be:
> >
> >   lpCommandLine = "foo bar\" \"baz"
> 
> If I recall correctly, Windows cmd.exe uses the caret (^) as its general
> escape character, so
> 
> C:\cygwin64\bin>.\echo.exe -e ^"hello\nworld^"	
> hello
> world
> 
> works.

Indeed - but I'm not using cmd, or any shell for that matter (that's actually the point) - I am in a native Win32 process invoking a Cygwin process directly using the Windows API's CreateProcess call. As it happens, the program I have already has the arguments for the Cygwin process in an array, but Windows internally requires a single command line string (which is not in any way related to Cmd).

> However, I've found Windows's interpretation to be inconsistent, so often
> have to play with it to find what the "right combination" is for a
> particular instance.
> 
> I find echoing the parameters to a temporary text file and then using the
> file as input to be more reliable and easier to troubleshoot, and it
> breaks apart whether it is Windows cli inconsistencies or receiving
> program issues very nicely with the text file content as an intermediary

This is an OK tack, but I don't wish to do this by experimentation and get caught out later by a case I didn't think of. What I'm trying to determine, via its source code, is *exactly* how the Cygwin DLL processes the command line, so that I can present it with my argv array converted to a single command line and be certain that the Cygwin DLL will recover the same argv.

My reading of the relevant sources suggests that with globbing disabled, backslash escape sequences are *never* interpreted (since the quote function returns early - dcrt0.cc, line 171). If there is no way of encoding the double quote character, then perhaps I have to run with globbing enabled but ensure that the globify function will never actually expand anything - but as that's a lot of work, I was wondering if I was missing something with the simpler "noglob" case.


David



* Re: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-06  8:03   ` David Allsopp
@ 2016-05-06 13:17     ` Erik Soderquist
  2016-05-06 14:35     ` Andrey Repin
  1 sibling, 0 replies; 16+ messages in thread
From: Erik Soderquist @ 2016-05-06 13:17 UTC (permalink / raw)
  To: cygwin

On Fri, May 6, 2016 at 4:03 AM, David Allsopp wrote:
>
> [With apologies if threading is broken; I erroneously thought as
> the list was not subscriber-only that replies would use reply-all
> and so wasn't subscribed]

Didn't break for me, though that might be google's threading in gmail
rather than standard threading.

> > C:\cygwin64\bin>.\echo.exe -e ^"hello\nworld^"
> > hello
> > world
> >
> > works.
>
> Indeed - but I'm not using cmd, or any shell for that matter
> (that's actually the point) - I am in a native Win32 process
> invoking a Cygwin process directly using the Windows API's
> CreateProcess call.  As it happens, the program I have already
> has the arguments for the Cygwin process in an array, but Windows
> internally requires a single command line string (which is not in
> any way related to Cmd).

Then you are way over my head...


> > However, I've found Windows's interpretation to be inconsistent, so often
> > have to play with it to find what the "right combination" is for a
> > particular instance.
> >
> > I find echoing the parameters to a temporary text file and then using the
> > file as input to be more reliable and easier to troubleshoot, and it
> > breaks apart whether it is Windows cli inconsistencies or receiving
> > program issues very nicely with the text file content as an intermediary
>
> This is an OK tack, but I don't wish to do this by experimentation
> and get caught out later by a case I didn't think of, so what I'm
> trying to determine is *exactly* how the Cygwin DLL processes the
> command line via its source code so that I can present it with my
> argv array converted to a single command line and be certain that
> the Cygwin DLL will recover the same argv.
>
> My reading of the relevant sources suggests that with globbing
> disabled, backslash escape sequences are *never* interpreted (since
> the quote function returns early - dcrt0.cc, line 171). If there is
> no way of encoding the double quote character, then perhaps I have
> to run with globbing enabled but ensure that the globify function
> will never actually expand anything - but as that's a lot of work,
> I was wondering if I was missing something with the simpler
> "noglob" case.

Again, way over my head, I'm currently a shell scripter...

-- Erik


* Re: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-06  8:03   ` David Allsopp
  2016-05-06 13:17     ` Erik Soderquist
@ 2016-05-06 14:35     ` Andrey Repin
  2016-05-07  7:45       ` David Allsopp
  1 sibling, 1 reply; 16+ messages in thread
From: Andrey Repin @ 2016-05-06 14:35 UTC (permalink / raw)
  To: David Allsopp, cygwin

Greetings, David Allsopp!

> [With apologies if threading is broken; I erroneously thought as the list
> was not subscriber-only that replies would use reply-all and so wasn't subscribed]

As long as your mail client is fine, you're fine.

> I'm not using cmd, or any shell for that matter (that's
> actually the point) - I am in a native Win32 process invoking a Cygwin
> process directly using the Windows API's CreateProcess call. As it happens,
> the program I have already has the arguments for the Cygwin process in an
> array, but Windows internally requires a single command line string (which
> is not in any way related to Cmd).

Then all you need is a rudimentary quoting.
The rest will be handled by getopt when the command line is parsed.

>> However, I've found Windows's interpretation to be inconsistent, so often
>> have to play with it to find what the "right combination" is for a
>> particular instance.
>> 
>> I find echoing the parameters to a temporary text file and then using the
>> file as input to be more reliable and easier to troubleshoot, and it
>> breaks apart whether it is Windows cli inconsistencies or receiving
>> program issues very nicely with the text file content as an intermediary

> This is an OK tack, but I don't wish to do this by experimentation and get
> caught out later by a case I didn't think of, so what I'm trying to
> determine is *exactly* how the Cygwin DLL processes the command line via its
> source code so that I can present it with my argv array converted to a
> single command line and be certain that the Cygwin DLL will recover the same argv.

> My reading of the relevant sources suggests that with globbing disabled,
> backslash escape sequences are *never* interpreted (since the quote function
> returns early - dcrt0.cc, line 171). If there is no way of encoding the
> double quote character, then perhaps I have to run with globbing enabled but
> ensure that the globify function will never actually expand anything - but
> as that's a lot of work, I was wondering if I was missing something with the simpler "noglob" case.

The point being, when you bypass the shell and go to direct process execution,
you don't need much shell magic at all.
Shell conventions are designed to ease interaction between system and operator.
But when you have a system talking to a system, you can be very literal.


-- 
With best regards,
Andrey Repin
Friday, May 6, 2016 17:18:00

Sorry for my terrible english...



* RE: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-06 14:35     ` Andrey Repin
@ 2016-05-07  7:45       ` David Allsopp
  2016-05-09  9:43         ` Peter Rosin
  2016-05-09 14:57         ` Aaron Digulla
  0 siblings, 2 replies; 16+ messages in thread
From: David Allsopp @ 2016-05-07  7:45 UTC (permalink / raw)
  To: cygwin

Andrey Repin wrote:
> Greetings, David Allsopp!

And greetings to you, too!

<snip>

> > I'm not using cmd, or any shell for that matter (that's actually the
> > point) - I am in a native Win32 process invoking a Cygwin process 
> > directly using the Windows API's CreateProcess call. As it happens, 
> > the program I have already has the arguments for the Cygwin process 
> > in an array, but Windows internally requires a single command line
> > string (which is not in any way related to Cmd).
> 
> Then all you need is a rudimentary quoting.

Yes, but the question still remains what that rudimentary quoting is - i.e.
I can see how to quote spaces which appear in elements of argv, but I cannot
see how to quote double quotes! 

> The rest will be handled by getopt when the command line is parsed.

That's outside my required level - I'm interested in Cygwin's emulation
handling the difference between an operating system which actually passes
argc and argv when creating processes (Posix exec/spawn) and Windows (which
only passes a single string command line). The Microsoft C Runtime and
Windows have a "clear" (at least by MS standards) specification of how that
single string gets converted to argv, I'm trying to determine Cygwin's -
getopt definitely isn't part of that.

> >> However, I've found Windows's interpretation to be inconsistent, so 
> >> often have to play with it to find what the "right combination" is 
> >> for a particular instance.
> >>
> >> I find echoing the parameters to a temporary text file and then 
> >> using the file as input to be more reliable and easier to 
> >> troubleshoot, and it breaks apart whether it is Windows cli 
> >> inconsistencies or receiving program issues very nicely with the 
> >> text file content as an intermediary
> 
> > This is an OK tack, but I don't wish to do this by experimentation 
> > and get caught out later by a case I didn't think of, so what I'm 
> > trying to determine is *exactly* how the Cygwin DLL processes the 
> > command line via its source code so that I can present it with my 
> > argv array converted to a single command line and be certain that
> > the Cygwin DLL will recover the same argv.
> 
> > My reading of the relevant sources suggests that with globbing 
> > disabled, backslash escape sequences are *never* interpreted (since 
> > the quote function returns early - dcrt0.cc, line 171). If there is 
> > no way of encoding the double quote character, then perhaps I have 
> > to run with globbing enabled but ensure that the globify function 
> > will never actually expand anything - but as that's a lot of work, I
> > was wondering if I was missing something with the simpler "noglob" case.
> 
> The point being, when you pass the shell and enter direct process 
> execution, you don't need much of shell magic at all.
> Shell conventions designed to ease interaction between system and 
> operator.
> But you have a system talking to the system, you can be very literal.

Indeed, which is why I'm trying to avoid the shell! But I can't be entirely
literal, because Posix and Windows are not compatible, so I need to
determine precisely how Cygwin's emulation works... and so far, it doesn't
seem to be a terribly clearly defined animal!

So, resorting to C files to try to demonstrate it further. spawn.cc seems to
suggest that there should be some kind of escaping available, but I'm
struggling to follow the code. Consider these two:

callee.c
  #include <stdio.h>
  int main (int argc, char* argv[])
  {
    int i;

    printf("argc = %d\n", argc);
    for (i = 0; i < argc; i++) {
      printf("argv[%d] = %s\n", i, *argv++);
    }
    return 0;
  }

caller.c
  #include <windows.h>
  #include <stdio.h>

  int main (void)
  {
    /* The Unicode version of CreateProcess can modify lpCommandLine, so
       use a writable buffer rather than a string literal. */
    char commandLine[] = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz *";
    STARTUPINFO startupInfo = {sizeof(STARTUPINFO), NULL, NULL, NULL, 0, 0,
                               0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, NULL};
    PROCESS_INFORMATION process = {NULL, NULL, 0, 0};

    if (!CreateProcess("callee.exe", commandLine, NULL, NULL, FALSE, 0,
                       NULL, NULL, &startupInfo, &process)) {
      printf("Error spawning process!\n");
      return 1;
    } else {
      WaitForSingleObject(process.hProcess, INFINITE);
      CloseHandle(process.hThread);
      CloseHandle(process.hProcess);
      return 0;
    }
  }

If you compile as follows:

  $ gcc -o callee callee.c
  $ i686-w64-mingw32-gcc -o caller caller.c
  $ export CYGWIN=noglob      # Or the * will be expanded
  $ ./caller

then the output is as required:
  argc = 6
  argv[0] = callee
  argv[1] = @te
  st
  argv[2] = fo@o
  argv[3] = bar baz
  argv[4] = baz
  argv[5] = *

But if I want to embed an actual " character in any of those arguments, I
cannot see any way to escape it which actually works at the moment. For
example, if you change commandLine in caller.c to be "callee.exe test\\\"
argument" then the erroneous output is:

  argc = 2
  argv[0] = callee
  argv[1] = test\ argument

where the required output is

  argc = 3
  argv[0] = callee
  argv[1] = test"
  argv[2] = argument

Any further clues appreciated. Is it actually even a bug?!
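
One way to make sense of the erroneous result is a toy model of the noglob
behaviour as I understand it: whitespace separates words, a double quote
toggles quoting and is dropped, and a backslash is an ordinary character.
The sketch below is that model only. model_split, MAX_ARGS and MAX_LEN are
hypothetical names, it is not the code in dcrt0.cc, and it ignores any
special handling of a leading @.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16
#define MAX_LEN  128

static int is_ws(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Split cmd into words under the assumed noglob rules.  Returns the
   number of words stored in out (at most MAX_ARGS); over-long words are
   truncated to MAX_LEN - 1 characters. */
static int model_split(const char *cmd, char out[MAX_ARGS][MAX_LEN])
{
    int argc = 0;

    assert(cmd != NULL);
    while (*cmd && argc < MAX_ARGS) {
        int in_quote = 0;
        size_t n = 0;

        while (is_ws(*cmd))             /* skip separators */
            cmd++;
        if (!*cmd)
            break;
        for (; *cmd && (in_quote || !is_ws(*cmd)); cmd++) {
            if (*cmd == '"') {          /* toggle quoting, drop the quote */
                in_quote = !in_quote;
                continue;
            }
            if (n + 1 < MAX_LEN)        /* backslash is copied like any char */
                out[argc][n++] = *cmd;
        }
        out[argc++][n] = '\0';
    }
    return argc;
}
```

Under this model the string callee.exe test\" argument splits into two
words, the second being test\ argument, which matches the output above:
the backslash is kept and the quote simply opens a quoted run that is
never closed.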


David



* Re: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-07  7:45       ` David Allsopp
@ 2016-05-09  9:43         ` Peter Rosin
  2016-05-09 10:06           ` Marco Atzeri
  2016-05-09 15:49           ` David Allsopp
  2016-05-09 14:57         ` Aaron Digulla
  1 sibling, 2 replies; 16+ messages in thread
From: Peter Rosin @ 2016-05-09  9:43 UTC (permalink / raw)
  To: cygwin, David Allsopp

Hi!

On 2016-05-07 09:45, David Allsopp wrote:
> Andrey Repin wrote:
>> Greetings, David Allsopp!
> 
> And greetings to you, too!
> 
> <snip>
> 
>>> I'm not using cmd, or any shell for that matter (that's actually the
>>> point) - I am in a native Win32 process invoking a Cygwin process 
>>> directly using the Windows API's CreateProcess call. As it happens, 
>>> the program I have already has the arguments for the Cygwin process 
>>> in an array, but Windows internally requires a single command line
>>> string (which is not in any way related to Cmd).
>>
>> Then all you need is a rudimentary quoting.
> 
> Yes, but the question still remains what that rudimentary quoting is - i.e.
> I can see how to quote spaces which appear in elements of argv, but I cannot
> see how to quote double quotes! 
> 
>> The rest will be handled by getopt when the command line is parsed.
> 
> That's outside my required level - I'm interested in Cygwin's emulation
> handling the difference between an operating system which actually passes
> argc and argv when creating processes (Posix exec/spawn) and Windows (which
> only passes a single string command line). The Microsoft C Runtime and
> Windows have a "clear" (at least by MS standards) specification of how that
> single string gets converted to argv, I'm trying to determine Cygwin's -
> getopt definitely isn't part of that.
> 
>>>> However, I've found Windows's interpretation to be inconsistent, so 
>>>> often have to play with it to find what the "right combination" is 
>>>> for a particular instance.
>>>>
>>>> I find echoing the parameters to a temporary text file and then 
>>>> using the file as input to be more reliable and easier to 
>>>> troubleshoot, and it breaks apart whether it is Windows cli 
>>>> inconsistencies or receiving program issues very nicely with the 
>>>> text file content as an intermediary
>>
>>> This is an OK tack, but I don't wish to do this by experimentation 
>>> and get caught out later by a case I didn't think of, so what I'm 
>>> trying to determine is *exactly* how the Cygwin DLL processes the 
>>> command line via its source code so that I can present it with my 
>>> argv array converted to a single command line and be certain that
>>> the Cygwin DLL will recover the same argv.
>>
>>> My reading of the relevant sources suggests that with globbing 
>>> disabled, backslash escape sequences are *never* interpreted (since 
>>> the quote function returns early - dcrt0.cc, line 171). If there is 
>>> no way of encoding the double quote character, then perhaps I have 
>>> to run with globbing enabled but ensure that the globify function 
>>> will never actually expand anything - but as that's a lot of work, I
>>> was wondering if I was missing something with the simpler "noglob" case.
>>
>> The point being, when you pass the shell and enter direct process 
>> execution, you don't need much of shell magic at all.
>> Shell conventions designed to ease interaction between system and 
>> operator.
>> But you have a system talking to the system, you can be very literal.
> 
> Indeed, which is why I'm trying to avoid the shell! But I can't be entirely
> literal, because Posix and Windows are not compatible, so I need to
> determine precisely how Cygwin's emulation works... and so far, it doesn't
> seem to be a terribly clearly defined animal!
> 
> So, resorting to C files to try to demonstrate it further. spawn.cc seems to
> suggest that there should be some kind of escaping available, but I'm
> struggling to follow the code. Consider these two:
> 
> callee.c
>   #include <stdio.h>
>   int main (int argc, char* argv[])
>   {
>     int i;
> 
>     printf("argc = %d\n", argc);
>     for (i = 0; i < argc; i++) {
>       printf("argv[%d] = %s\n", i, *argv++);
>     }
>     return 0;
>   }
> 
> caller.c
>   #include <windows.h>
>   #include <stdio.h>
> 
>   int main (void)
>   {
>     LPTSTR commandLine;
>     STARTUPINFO startupInfo = {sizeof(STARTUPINFO), NULL, NULL, NULL, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, NULL};
>     PROCESS_INFORMATION process = {NULL, NULL, 0, 0};
> 
>     commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz *";
>     if (!CreateProcess("callee.exe", commandLine, NULL, NULL, FALSE, 0,
> NULL, NULL, &startupInfo, &process)) {
>       printf("Error spawning process!\n");
>       return 1;
>     } else {
>       WaitForSingleObject(process.hProcess, INFINITE);
>       CloseHandle(process.hThread);
>       CloseHandle(process.hProcess);
>       return 0;
>     }
>   }
> 
> If you compile as follows:
> 
>   $ gcc -o callee callee.c
>   $ i686-w64-mingw32-gcc -o caller caller.c
>   $ export CYGWIN=noglob      # Or the * will be expanded
>   $ ./caller
> 
> and the output is as required:
>   argc = 6
>   argv[0] = callee
>   argv[1] = @te
>   st
>   argv[2] = fo@o
>   argv[3] = bar baz
>   argv[4] = baz
>   argv[5] = *
> 
> But if I want to embed an actual " character in any of those arguments, I
> cannot see any way to escape it which actually works at the moment. For
> example, if you change commandLine in caller.c to be "callee.exe test\\\"
> argument" then the erroneous output is:
> 
>   argc = 2
>   argv[0] = callee
>   argv[1] = test\ argument
> 
> where the required output is
> 
>   argc = 3
>   argv[0] = callee
>   argv[1] = test"
>   argv[2] = argument
> 
> Any further clues appreciated. Is it actually even a bug?!

I think cygwin emulates posix shell style command line parsing when
invoked from a Win32 process (like you do). So, try single quotes:

commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz '*' '\"\\'\"'";

I get this (w/o noglob):

argc = 7
argv[0] = callee
argv[1] = @te
st
argv[2] = fo@o
argv[3] = bar baz
argv[4] = baz
argv[5] = *
argv[6] = "'"

Cheers,
Peter


* Re: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-09  9:43         ` Peter Rosin
@ 2016-05-09 10:06           ` Marco Atzeri
  2016-05-09 15:49             ` David Allsopp
  2016-05-09 15:49           ` David Allsopp
  1 sibling, 1 reply; 16+ messages in thread
From: Marco Atzeri @ 2016-05-09 10:06 UTC (permalink / raw)
  To: cygwin

On 09/05/2016 11:43, Peter Rosin wrote:
> Hi!

>>
>>>> I'm not using cmd, or any shell for that matter (that's actually the
>>>> point) - I am in a native Win32 process invoking a Cygwin process
>>>> directly using the Windows API's CreateProcess call. As it happens,
>>>> the program I have already has the arguments for the Cygwin process
>>>> in an array, but Windows internally requires a single command line
>>>> string (which is not in any way related to Cmd).

Ultimate overview of MS escape howto :

https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

Regards
Marco




* RE: Formatting command line arguments when starting a Cygwin  process from a native process
  2016-05-07  7:45       ` David Allsopp
  2016-05-09  9:43         ` Peter Rosin
@ 2016-05-09 14:57         ` Aaron Digulla
  2016-05-09 15:19           ` David Allsopp
  1 sibling, 1 reply; 16+ messages in thread
From: Aaron Digulla @ 2016-05-09 14:57 UTC (permalink / raw)
  To: David Allsopp; +Cc: cygwin


Am Samstag, 07. Mai 2016 09:45 CEST, "David Allsopp" <dra27@cantab.net> schrieb:


> > Then all you need is a rudimentary quoting.
>
> Yes, but the question still remains what that rudimentary quoting is - i.e.
> I can see how to quote spaces which appear in elements of argv, but I cannot
> see how to quote double quotes!

This should help: https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

My line of thought is that Cygwin can't get anything which Windows can't send it. So the first step to solve this mess is to make sure the arguments which you send to CreateProcess() are correct.

The next step would be to write a small C utility which dumps its arguments, so you can properly debug all kinds of characters.

PS: I always point people to string list/array type methods to create processes which fix all the problems with spaces and odd characters (quotes, umlauts, etc). It seems that Windows doesn't have such a method to create processes. Which kind of makes sense; Windows is very, very mouse centered.

Regards,

--
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
"It's not the universe that's limited, it's our imagination.
Follow me and I'll show you something beyond the limits."
http://blog.pdark.de/



* RE: Formatting command line arguments when starting a Cygwin  process from a native process
  2016-05-09 14:57         ` Aaron Digulla
@ 2016-05-09 15:19           ` David Allsopp
  2016-05-10 13:30             ` Aaron Digulla
  0 siblings, 1 reply; 16+ messages in thread
From: David Allsopp @ 2016-05-09 15:19 UTC (permalink / raw)
  To: Aaron Digulla, cygwin

Aaron Digulla wrote:
> 
> Am Samstag, 07. Mai 2016 09:45 CEST, "David Allsopp" <dra27@cantab.net>
> schrieb:
> 
> 
> > > Then all you need is a rudimentary quoting.
> >
> > Yes, but the question still remains what that rudimentary quoting is -
> > i.e. I can see how to quote spaces which appear in elements of argv, but
> > I cannot see how to quote double quotes!
> 
> This should help:
> https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

This provides documentation for how Microsoft implementations do it, not how Cygwin does it. The Cygwin DLL is responsible for determining how a Cygwin process gets argc and argv from GetCommandLineW.
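
For contrast, the Microsoft convention (the one understood by the MS C
runtime and CommandLineToArgvW) can be sketched in C roughly as below.
ms_quote_arg is a hypothetical name, and the sketch covers only the
argument-quoting rule, not cmd.exe metacharacters. It is shown to make the
difference concrete; it is not what the Cygwin DLL expects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Quote one argument using the Microsoft C runtime convention.
   Returns 0 on success, -1 if out cannot hold the worst case. */
static int ms_quote_arg(const char *arg, char *out, size_t cap)
{
    size_t n = 0, i;

    assert(arg != NULL && out != NULL);
    /* Worst case: every character doubled, two quotes and a NUL. */
    if (cap < 2 * strlen(arg) + 3)
        return -1;
    if (*arg && !strpbrk(arg, " \t\n\v\"")) {
        strcpy(out, arg);               /* nothing needs quoting */
        return 0;
    }
    out[n++] = '"';
    for (;;) {
        size_t bs = 0;

        while (*arg == '\\') {          /* count a run of backslashes */
            arg++;
            bs++;
        }
        if (*arg == '\0') {
            /* Double them so the closing quote is not escaped. */
            for (i = 0; i < 2 * bs; i++)
                out[n++] = '\\';
            break;
        } else if (*arg == '"') {
            /* Double them and add one more to escape the quote. */
            for (i = 0; i < 2 * bs + 1; i++)
                out[n++] = '\\';
            out[n++] = '"';
        } else {
            /* Backslashes not followed by a quote are literal. */
            for (i = 0; i < bs; i++)
                out[n++] = '\\';
            out[n++] = *arg;
        }
        arg++;
    }
    out[n++] = '"';
    out[n] = '\0';
    return 0;
}
```

Here bar "baz" becomes "bar \"baz\"", which relies on the backslash
escaping the quote; that is exactly the step which does not appear to work
for a Cygwin callee when noglob is set.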

> My line of thought is that Cygwin can't get anything which Windows can't
> send it. So the first step to solve this mess is to make sure the
> arguments which you send to CreateProcess() are correct.
> 
> The next step would be to write a small C utility which dumps its
> arguments, so you can properly debug all kinds of characters.

See later email, but IMHO the conversion is something Cygwin should have precisely documented, not determined by brittle experimentation.

> PS: I always point people to string list/array type methods to create
> processes which fix all the problems with spaces and odd characters
> (quotes, umlauts, etc). It seems that Windows doesn't have such a method
> to create processes. Which kind of makes sense; Windows is very, very
> mouse centered.

I fail to see the connection with mice! What Windows (NT) does have is a legacy where the decision on how to convert a command line to a list/array of arguments is determined per-process (and so not the responsibility of command line shells) vs Unix which puts the burden of converting a single command line to the array on the shell instead. Nominally, the Windows way is more flexible, though I don't think that flexibility is actually useful (especially if you look at the comments in the command line -> argv conversion in Microsoft's C Runtime library!). 


David



* RE: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-09  9:43         ` Peter Rosin
  2016-05-09 10:06           ` Marco Atzeri
@ 2016-05-09 15:49           ` David Allsopp
  1 sibling, 0 replies; 16+ messages in thread
From: David Allsopp @ 2016-05-09 15:49 UTC (permalink / raw)
  To: 'Peter Rosin', cygwin

Hi!

Peter Rosin wrote:
> I think cygwin emulates posix shell style command line parsing when
> invoked from a Win32 process (like you do). So, try single quotes:
> 
> commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz '*' '\"\\'\"'";
> 
> I get this (w/o noglob):
> 
> argc = 7
> argv[0] = callee
> argv[1] = @te
> st
> argv[2] = fo@o
> argv[3] = bar baz
> argv[4] = baz
> argv[5] = *
> argv[6] = "'"

Yes, that's approximately what I arrived at too. My concern with all the
extra quoting is hitting the Windows limit on command line length (I'd like
to avoid needing @response files as much as possible). So, after various
experiments, the slightly odd scheme I've come up with is this. If none of
the arguments contains a double-quote character, set noglob and use the
quoting mechanism previously described (so whitespace within an argument,
or an @ at the beginning of an argument, needs double-quoting). If a
double-quote character appears in any of the arguments, don't set noglob
and wrap every argument in double-quotes (to avoid globbing); any
double-quote characters within can then be escaped with "'"'" (i.e.
terminate the current quoted string, single-quote a double-quote and then
resume a quoted string!). Messy, but that seems to be about the only way...
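
For illustration, the second branch of that scheme (globbing left enabled,
every argument double-quoted) might be sketched in C as follows. The names
quote_arg and build_command_line are hypothetical, and the sketch handles
only double-quote characters: backslashes, which the Cygwin DLL may treat
specially inside quotes when globbing is enabled, are passed through
unchanged, and nothing here has been checked against a real Cygwin callee.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of arg wrapped in double
   quotes, with each embedded '"' replaced by the five characters "'"'"
   (close quote, single-quoted double quote, reopen quote).
   Returns NULL on allocation failure. */
char *quote_arg(const char *arg)
{
    size_t len = strlen(arg), quotes = 0;
    for (size_t i = 0; i < len; i++)
        if (arg[i] == '"')
            quotes++;

    /* two wrapping quotes + 4 extra chars per embedded quote + NUL */
    char *out = malloc(len + 4 * quotes + 3);
    if (!out)
        return NULL;

    char *p = out;
    *p++ = '"';
    for (size_t i = 0; i < len; i++) {
        if (arg[i] == '"') {
            memcpy(p, "\"'\"'\"", 5);
            p += 5;
        } else {
            *p++ = arg[i];
        }
    }
    *p++ = '"';
    *p = '\0';
    return out;
}

/* Hypothetical helper: quote every argument and join them with single
   spaces into one malloc'd string suitable for lpCommandLine. */
char *build_command_line(int argc, const char *const *argv)
{
    size_t used = 0;
    char *line = malloc(1);
    if (!line)
        return NULL;
    line[0] = '\0';

    for (int i = 0; i < argc; i++) {
        char *q = quote_arg(argv[i]);
        if (!q) {
            free(line);
            return NULL;
        }
        size_t qlen = strlen(q);
        char *grown = realloc(line, used + qlen + 2); /* space + NUL */
        if (!grown) {
            free(q);
            free(line);
            return NULL;
        }
        line = grown;
        if (i > 0)
            line[used++] = ' ';
        memcpy(line + used, q, qlen + 1);
        used += qlen;
        free(q);
    }
    return line;
}
```

With argv = { foo, bar "baz" } this produces the command line
"foo" "bar "'"'"baz"'"'"", at a cost of four extra characters per embedded
double-quote, which is where the command line length concern comes from.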


David 



* RE: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-09 10:06           ` Marco Atzeri
@ 2016-05-09 15:49             ` David Allsopp
  2016-05-09 16:02               ` Marco Atzeri
  0 siblings, 1 reply; 16+ messages in thread
From: David Allsopp @ 2016-05-09 15:49 UTC (permalink / raw)
  To: 'Marco Atzeri', cygwin

Marco Atzeri wrote:
> 
> Ultimate overview of MS escape howto :
> 
> https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

This is a great article (which I'd not come across before), but this relates
to Microsoft's mechanisms for quoting which aren't applicable here - it's
definitely the Cygwin DLL which does it!


David 




* Re: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-09 15:49             ` David Allsopp
@ 2016-05-09 16:02               ` Marco Atzeri
  2016-05-09 16:14                 ` David Allsopp
  0 siblings, 1 reply; 16+ messages in thread
From: Marco Atzeri @ 2016-05-09 16:02 UTC (permalink / raw)
  To: cygwin

On 09/05/2016 17:49, David Allsopp wrote:
> Marco Atzeri wrote:
>>
>> Ultimate overview of MS escape howto :
>>
>> https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
>
> This is a great article (which I'd not come across before), but this relates
> to Microsoft's mechanisms for quoting which aren't applicable here - it's
> definitely the Cygwin DLL which does it!
>
>
> David
>

Hi David,
I am puzzled, I had the impression you asked:
"I am trying to work out the precise details for character escaping when
starting a Cygwin process from a native (i.e. non-Cygwin) Windows process."

So the exec should be on "Windows process" side,
why inside the Cygwin DLL ?

Regards
Marco



* RE: Formatting command line arguments when starting a Cygwin process from a native process
  2016-05-09 16:02               ` Marco Atzeri
@ 2016-05-09 16:14                 ` David Allsopp
  0 siblings, 0 replies; 16+ messages in thread
From: David Allsopp @ 2016-05-09 16:14 UTC (permalink / raw)
  To: 'Marco Atzeri', cygwin

Marco Atzeri wrote:
> On 09/05/2016 17:49, David Allsopp wrote:
> > Marco Atzeri wrote:
> >>
> >> Ultimate overview of MS escape howto :
> >>
> >> https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
> >
> > This is a great article (which I'd not come across before), but this
> > relates to Microsoft's mechanisms for quoting which aren't applicable
> > here - it's definitely the Cygwin DLL which does it!
> >
> >
> > David
> >
> 
> Hi David,

Hi!

> I am puzzled, I had the impression you asked:
> "I am trying to work out the precise details for character escaping when
> starting a Cygwin process from a native (i.e. non-Cygwin) Windows
> process."
> 
> So the exec should be on "Windows process" side, why inside the Cygwin DLL
> ?

In Windows, there is no "exec" - the equivalent functions in Microsoft's C
runtime are themselves emulations boiling down to a CreateProcess call
(unlike on Unix, where the exec family are actual system calls). That
CreateProcess call takes exactly a single string for the entire command line
which the Cygwin DLL then has to convert to argc/argv in order to call the
main function in the Cygwin program started (Microsoft's C runtime has to do
the same thing which is part of what that blog explains). Any sane native
Windows program will use (or be compliant with) CommandLineToArgvW, but no
program has to do it that way on Windows (indeed, Cygwin takes "advantage" of
this to do its own thing).

I have total control of my native call to CreateProcess, so it's definitely
about working out exactly what Cygwin does with the command line. AFAICT,
when a Cygwin program execs another Cygwin program, they actually
communicate argc and argv via memory, rather than through the process
invocation (although I'm really not sure that I've interpreted that
correctly) which is why this doesn't come up in pure Cygwin-land.

I think I'm there, it's just not as clean (or documented) as might be nice!


David



* RE: Formatting command line arguments when starting a Cygwin  process from a native process
  2016-05-09 15:19           ` David Allsopp
@ 2016-05-10 13:30             ` Aaron Digulla
  2016-05-10 17:02               ` David Allsopp
  0 siblings, 1 reply; 16+ messages in thread
From: Aaron Digulla @ 2016-05-10 13:30 UTC (permalink / raw)
  To: cygwin


On Monday, 09 May 2016 17:19 CEST, David Allsopp <david@allsopps.net> wrote:

> Aaron Digulla wrote:
> >
> > On Saturday, 07 May 2016 09:45 CEST, "David Allsopp" <dra27@cantab.net>
> > wrote:
> >
> >
> > > > Then all you need is a rudimentary quoting.
> > >
> > > Yes, but the question still remains what that rudimentary quoting is -
> > > i.e. I can see how to quote spaces which appear in elements of argv,
> > > but I cannot see how to quote double quotes!
> >
> > This should help:
> > https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
>
> This provides documentation for how Microsoft implementations do it, not how Cygwin does it. The Cygwin DLL is responsible for determining how a Cygwin process gets argc and argv from GetCommandLineW.

That's correct but I read your question as "how do I start executables linked against Cygwin from another Windows process"

To do that, you need to convert the argument list/array into the stupid Windows format because that's what the Cygwin process will expect.

> > My line of thought is that Cygwin can't get anything which Windows can't
> > send it. So the first step to solve this mess is to make sure the
> > arguments which you send to CreateProcess() are correct.
> >
> > The next step would be to write a small C utility which dumps its
> > arguments, so you can properly debug all kinds of characters.
>
> See later email, but IMHO the conversion is something Cygwin should have precisely documented, not determined by brittle experimentation.

Ah... no. You're mixing two or three things. Let me enumerate:

1. You have to give your OS (Windows or Unix) the information which process you want to start and which arguments to pass. Unix has two ways (string array and single string), Windows has only single string.
2. The OS will put the information into a structure of some kind and pass that to the new process.
3. If you have a shell (CMD.exe, bash, etc), they will take the structure and parse it according to some rules. They will then convert the result again into an OS call.
4. The C runtime of your executable will know where to get the OS structure and how to turn the structure into char ** argv.

Where is Cygwin in all this? It's part of step #3. Cygwin emulates exec() and similar Unix OS functions which an emulated shell like BASH will use. Which means Cygwin code in the DLL is irrelevant if you don't have a Unix shell somewhere in your process chain.

If you just want to execute a Cygwin process (= Windows process which includes the Cygwin.dll), you need to know #1, #2 and #4.


> > PS: I always point people to string list/array type methods to create
> > processes which fix all the problems with spaces and odd characters
> > (quotes, umlauts, etc). It seems that Windows doesn't have such a method
> > to create processes. Which kind of makes sense; Windows is very, very
> > mouse centered.
>
> I fail to see the connection with mice! What Windows (NT) does have is a legacy where the decision on how to convert a command line to a list/array of arguments is determined per-process (and so not the responsibility of command line shells) vs Unix which puts the burden of converting a single command line to the array on the shell instead. Nominally, the Windows way is more flexible, though I don't think that flexibility is actually useful (especially if you look at the comments in the command line -> argv conversion in Microsoft's C Runtime library!).

Using single strings to run commands causes all kinds of problems. It's a brittle API, which you should avoid. It kind of feels simpler, but it's the "too simple" kind which Einstein mentioned ("You should make things as simple as possible. But not more simple.").

Think of it that way: Your executable gets the arguments as a string array. On the parent process side, Unix allows you to create a process with an array of strings. That's a natural API which doesn't need any kind of conversion (maybe you need to copy the strings but that's it).

If you convert from and to a single string all the time, you need a way to quote and escape. So this is more error prone than the plain array solution.
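
As a small illustration of the array-style API on the Unix side, the sketch below starts a child with fork and execv and hands it the argument array as-is, with no quoting step in between. run_with_argv is a hypothetical helper name, not something from this thread.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with the given
   NULL-terminated argv and return its exit status, or -1 if the child
   could not be started or did not exit normally. */
int run_with_argv(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: argv is handed to the new program element by element;
           spaces and quotes inside an element need no escaping. */
        execv(path, argv);
        _exit(127); /* only reached if execv itself failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing { "printf", "%s\n", "bar \"baz\"", NULL } with path /usr/bin/printf would print bar "baz" verbatim, assuming printf lives at that path on the system in question.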

And lastly: If everyone always used arrays, we wouldn't have this long, tedious discussion.

Regards,

--
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
"It's not the universe that's limited, it's our imagination.
Follow me and I'll show you something beyond the limits."
http://blog.pdark.de/



* RE: Formatting command line arguments when starting a Cygwin  process from a native process
  2016-05-10 13:30             ` Aaron Digulla
@ 2016-05-10 17:02               ` David Allsopp
  0 siblings, 0 replies; 16+ messages in thread
From: David Allsopp @ 2016-05-10 17:02 UTC (permalink / raw)
  To: 'Aaron Digulla', cygwin

Aaron Digulla wrote:
> David Allsopp wrote:
> > Aaron Digulla wrote:
> > >
> > > On Saturday, 07 May 2016 09:45 CEST, "David Allsopp"
> > > wrote:
> > >
> > >
> > > > > Then all you need is a rudimentary quoting.
> > > >
> > > > > Yes, but the question still remains what that rudimentary quoting
> > > > > is - i.e. I can see how to quote spaces which appear in elements
> > > > > of argv, but I cannot see how to quote double quotes!
> > >
> > > This should help:
> > > https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
> >
> > This provides documentation for how Microsoft implementations do it, not
> > how Cygwin does it. The Cygwin DLL is responsible for determining how a
> > Cygwin process gets argc and argv from GetCommandLineW.
> 
> That's correct but I read your question as "how do I start executables
> linked against Cygwin from another Windows process"

That's the correct reading of my question!

> To do that, you need to convert the argument list/array into the stupid
> Windows format because that's what the Cygwin process will expect.

This is not correct, as both reading the code and testing show. Put the following into the caller.c program I posted previously in this thread; it is intended to start callee with argv[1] being the literal string a\\b (i.e. 4 characters):

commandLine = "callee \"a\\\\b\"";

and then if callee.c is compiled with i686-w64-mingw32-gcc (Microsoft-world - escaping rules according to MSDN):

$ ./caller
argc = 2
argv[0] = callee
argv[1] = a\\b

with argv[1] correctly given in Microsoft's escaping. But if you compile callee.c with gcc (Cygwin-world), you get:

$ ./caller
argc = 2
argv[0] = callee
argv[1] = a\b

Here the Cygwin DLL applies its own interpretation of the escaping (treating \\ within a quoted parameter as an escaped single backslash), quite clearly not MSDN's (which treats \\ as two backslash characters because it's not followed by a double-quote). As an aside, if you have CYGWIN=noglob, you will actually get the same output as the native Windows case, with two backslashes (more evidence, if you still need it, that my question has everything to do with the Cygwin DLL and nothing to do with MSDN and Microsoft's escaping rules).
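
To make the Microsoft side of that comparison concrete, here is a minimal sketch of the backslash and double-quote rules as documented for Microsoft's C runtime. The function name ms_parse_arg is hypothetical; the sketch ignores the separate rule for the program name and the doubled double-quote special case, and it makes no attempt to model what the Cygwin DLL does.

```c
#include <stddef.h>

/* Hypothetical helper: parse one argument from src using the documented
   Microsoft C runtime rules and store it in dst, which must have room for
   strlen(src) + 1 bytes. Returns a pointer just past the parsed argument. */
const char *ms_parse_arg(const char *src, char *dst)
{
    int in_quotes = 0;

    /* Skip leading whitespace between arguments. */
    while (*src == ' ' || *src == '\t')
        src++;

    while (*src && (in_quotes || (*src != ' ' && *src != '\t'))) {
        size_t backslashes = 0;
        while (*src == '\\') {
            backslashes++;
            src++;
        }

        if (*src == '"') {
            /* 2n backslashes + quote: n backslashes, quote toggles mode.
               2n+1 backslashes + quote: n backslashes, literal quote. */
            for (size_t i = 0; i < backslashes / 2; i++)
                *dst++ = '\\';
            if (backslashes % 2)
                *dst++ = '"';
            else
                in_quotes = !in_quotes;
            src++;
        } else {
            /* Backslashes not followed by a quote are all literal. */
            for (size_t i = 0; i < backslashes; i++)
                *dst++ = '\\';
            if (*src && (in_quotes || (*src != ' ' && *src != '\t')))
                *dst++ = *src++;
        }
    }

    *dst = '\0';
    return src;
}
```

Fed the command line callee "a\\b" from above, two successive calls yield callee and then a\\b, which matches the output of the mingw-compiled callee.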

There's also the small matter of Cygwin's @file trick for reading command line arguments from files (i.e. an extra escaping rule not mentioned in MSDN because it's not part of Windows). This time, have commandLine = "@test" and run echo foo>test: with a Microsoft-compiled callee you'll get argv[1] = @test, and with a Cygwin-compiled one you'll get argv[1] = foo

> > > My line of thought is that Cygwin can't get anything which Windows
> > > can't send it. So the first step to solve this mess is to make sure
> > > the arguments which you send to CreateProcess() are correct.
> > >
> > > The next step would be to write a small C utility which dumps its
> > > arguments, so you can properly debug all kinds of characters.
> >
> > See later email, but IMHO the conversion is something Cygwin should have
> > precisely documented, not determined by brittle experimentation.
> 
> Ah... no. You're mixing two or three things. Let me enumerate:
> 
> 1. You have to give your OS (Windows or Unix) the information which
> process you want to start and which arguments to pass. Unix has two ways
> (string array and single string), Windows has only single string.

Which system call in Unix allows you to start a process by giving a single string instead of an array of arguments (or a series of parameters)?

> 2. The OS will put the information into a structure of some kind and pass
> that to the new process.
> 3. If you have a shell (CMD.exe, bash, etc), they will take the structure
> and parse it according to some rules. They will then convert the result
> again into an OS call.
> 4. The C runtime of your executable will know where to get the OS
> structure and how to turn the structure into char ** argv.
> Where is Cygwin in all this? It's part of step #3. Cygwin emulates exec()
> and similar Unix OS functions which an emulated shell like BASH will use.
> Which means Cygwin code in the DLL is irrelevant if you don't have a Unix
> shell somewhere in your process chain.

No, Cygwin is loosely part of step #3 and definitely the whole of #4. If you invoke a Cygwin process from a native process, the Cygwin DLL does the work of #4, not the C runtime. Step #3 is entirely irrelevant to my problem because there is no shell involved.

> If you just want to execute a Cygwin process (= Windows process which
> includes the Cygwin.dll), you need to know #1, #2 and #4.

Indeed - which basically was my original question. Precisely how Cygwin deals with the conversion in Step 4.


David



end of thread, other threads:[~2016-05-10 17:02 UTC | newest]

Thread overview: 16+ messages
2016-05-05 15:24 Formatting command line arguments when starting a Cygwin process from a native process David Allsopp
2016-05-05 16:47 ` Erik Soderquist
2016-05-06  8:03   ` David Allsopp
2016-05-06 13:17     ` Erik Soderquist
2016-05-06 14:35     ` Andrey Repin
2016-05-07  7:45       ` David Allsopp
2016-05-09  9:43         ` Peter Rosin
2016-05-09 10:06           ` Marco Atzeri
2016-05-09 15:49             ` David Allsopp
2016-05-09 16:02               ` Marco Atzeri
2016-05-09 16:14                 ` David Allsopp
2016-05-09 15:49           ` David Allsopp
2016-05-09 14:57         ` Aaron Digulla
2016-05-09 15:19           ` David Allsopp
2016-05-10 13:30             ` Aaron Digulla
2016-05-10 17:02               ` David Allsopp
