public inbox for cygwin@cygwin.com
* Re: Pipes truncating data in cygwin from main and cygwin-3_4-branch
From: Takashi Yano @ 2023-08-15  0:30 UTC
  To: cygwin; +Cc: キャロウ マーク

On Mon, 14 Aug 2023 20:51:39 +0900
キャロウ マーク wrote:
> I have a problem that looks like pipes are truncating data when I cat a file to my program’s stdin. A simple `cat foo | cat > bar` works fine; bar ends up identical to foo. My case is more complicated than that. My application does this:
> std::stringstream buffer;
> buffer << std::cin.rdbuf();
> std::istream* isp = &buffer;
> Initial reads after this work fine. Once the app has read everything up to the payload data in the file, it does
> 
> off_t dataStart = (off_t)(isp->tellg());
> isp->seekg(0, ios_base::end);
> off_t dataEnd = (off_t)(isp->tellg());
> dataSizeInFile = dataEnd - dataStart;
> The tellg result shows a size significantly smaller than the actual file data: 43k less in a 170k file. The data is seemingly being truncated somewhere.
> 
> Later the app does
> 
>    isp->seekg(0);
>    std::streambuf* _streambuf = (isp->rdbuf());
> and starts reading from _streambuf. All data read from _streambuf is gibberish.
> 
> The application code makes no distinction between a pipe and stdin redirection from a file. It just uses std::cin. stdin redirection still works.
> 
> I created a minimal reproducer. More on that in a moment.
> 
> I first encountered this in Git for Windows 2.41.0. I had no problem in previous versions. I reported this to the Git for Windows project. See https://github.com/git-for-windows/git/issues/4464. You can find the minimal reproducer over there. It consists of 2 parts, a script and a small C++ program. The script finds the size of the target file, then cats it to the test program, passing the file size as a command line option. The test program does what I have described above and compares the file size determined from the seek to the end with the provided size.
> 
> A G4W project member reports that the problem reproduces on vanilla Cygwin in the branches mentioned in the subject and that G4W and MSYS2 are on the cygwin-3_4-branch release train. He recommends reporting the bug to you. You can find his(?) full comment here <https://github.com/git-for-windows/git/issues/4464#issuecomment-1671137446>.

Your test case also fails in the command prompt.

Try
type testfile | test-pipe sizeoftestfile
in the command prompt. It will fail.

The new pipe implementation in cygwin 3.4.x and later gives non-cygwin
apps pipes that behave more like the pipes in the command prompt.

Since your test case is compiled with cl.exe, it is a non-cygwin
app.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Pipes truncating data in cygwin from main and cygwin-3_4-branch
From: Takashi Yano @ 2023-08-15  6:42 UTC
  To: cygwin; +Cc: キャロウ マーク

On Tue, 15 Aug 2023 09:53:16 +0900
キャロウ マーク wrote:
> > On Aug 15, 2023, at 9:30, Takashi Yano <takashi.yano@nifty.ne.jp> wrote:
> > 
> > Your test case also fails in the command prompt.
> > 
> > Try
> > type testfile | test-pipe sizeoftestfile
> > in the command prompt. It will fail.
> 
> Interesting.
> 
> > 
> > The new pipe implementation in cygwin 3.4.x and later gives non-cygwin
> > apps pipes that behave more like the pipes in the command prompt.
> 
> What are the differences between these pipes? What changed?

There are many changes, but the one that triggers this behaviour
is setting the FILE_SYNCHRONOUS_IO_NONALERT create option.

With this option, seekg() in the Microsoft library succeeds
on pipes, although I suppose it should not.

On Linux, the man page states that fseek() on pipes will fail.
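
For illustration, here is a minimal sketch of that POSIX behaviour. It
assumes a POSIX system such as Linux. fd_is_seekable() and
errno_from_seek_on_pipe() are hypothetical helper names that are not
part of the reproducer. lseek() on a pipe is expected to fail there
with ESPIPE.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>   // POSIX: pipe(), lseek(), close(), SEEK_CUR

// Hypothetical helper: true if the descriptor supports seeking.
// On POSIX systems lseek() on a pipe, FIFO or socket returns -1.
bool fd_is_seekable(int fd) {
    assert(fd >= 0);
    return lseek(fd, 0, SEEK_CUR) != static_cast<off_t>(-1);
}

// Hypothetical helper: creates an anonymous pipe, tries to seek on its
// read end and returns the errno value that lseek() reported
// (0 if the seek unexpectedly succeeded, -1 if no pipe could be made).
int errno_from_seek_on_pipe() {
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    errno = 0;
    const off_t pos = lseek(fds[0], 0, SEEK_CUR);
    const int saved = (pos == static_cast<off_t>(-1)) ? errno : 0;
    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

A reader that sees fd_is_seekable(0) return false knows stdin cannot
be rewound and has to be buffered first.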

> > Since your test case is compiled with cl.exe, it is a non-cygwin
> > app.
> 
> When the failure first appeared the bash shell included with Git for Windows was being used to run the pipe (`bash -c "cat foo | bar"`). The shell was started by ctest which had been run from PowerShell. Is this cygwin or non-cygwin?

It depends on "bar". If bar is compiled with cl.exe, or with gcc/g++ in
the MSYS2 mingw environment, it is a non-cygwin app.

> In https://github.com/git-for-windows/git/issues/4464#issuecomment-1671137446 the author provided a minimal adaptation of the reproducer to compile it with g++ and reproduced the failure on Cygwin. It looks like the failure happens for both cygwin and non-cygwin. It is probably related to the new pipe implementation you mentioned.

Is there any failure case where the pipe reader is a cygwin (or MSYS2)
binary?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Pipes truncating data in cygwin from main and cygwin-3_4-branch
From: Takashi Yano @ 2023-08-15  9:45 UTC
  To: cygwin; +Cc: キャロウ マーク

On Tue, 15 Aug 2023 18:09:50 +0900
キャロウ マーク wrote:
> > On Aug 15, 2023, at 15:42, Takashi Yano <takashi.yano@nifty.ne.jp> wrote:
> > 
> >> 
> >>> 
> >>> The new pipe implementation in cygwin 3.4.x and later gives non-cygwin
> >>> apps pipes that behave more like the pipes in the command prompt.
> >> 
> >> What are the differences between these pipes? What changed?
> > 
> > There are many changes, but the one that triggers this behaviour
> > is setting the FILE_SYNCHRONOUS_IO_NONALERT create option.
> > 
> > With this option, seekg() in the Microsoft library succeeds
> > on pipes, although I suppose it should not.
> 
> On what create is this option specified?

The pipe itself. We use NtCreateNamedPipeFile() to create the pipe.
https://learn.microsoft.com/en-us/windows/win32/devnotes/nt-create-named-pipe-file

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Pipes truncating data in cygwin from main and cygwin-3_4-branch
From: キャロウ マーク @ 2023-08-15 10:01 UTC
  To: Takashi Yano; +Cc: cygwin

> On Aug 15, 2023, at 18:09, キャロウ マーク <github@callow.im> wrote:
> 
> ...
> 
> I will put some tracing in `test-pipe.c++` to see whether it is currently buffering or not.

Indeed, the seekg is returning success, so my code uses the path where seeking is expected to work. If I force buffering, the test passes.
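
For illustration, the forced-buffering path can be sketched roughly as
follows. This is a minimal sketch and not the code from test-pipe.c++.
slurp() and payload_size() are hypothetical helper names. The idea is
to copy the whole input into memory without ever seeking on it, and to
seek only on the in-memory copy.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: reads everything from `in` until EOF.
// It never calls seekg()/tellg() on `in`, so it does not depend on
// whether the underlying handle claims to be seekable.
std::string slurp(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: number of payload bytes that remain after
// `header_bytes` bytes of header have been consumed.
std::size_t payload_size(const std::string& data, std::size_t header_bytes) {
    assert(header_bytes <= data.size());
    return data.size() - header_bytes;
}
```

Typical use would be std::istringstream buffer(slurp(std::cin)); after
that, seekg() and tellg() on buffer operate on memory only. The sketch
does not switch stdin to binary mode, which a Windows build reading
binary payloads would probably also need.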

I tried the same tests in PowerShell instead of Git for Windows Bash. There too seekg() reports success. If I force buffering, tellg() after isp->seekg(0, ios_base::end) reports 171,926 bytes versus a file size of 170,512, which is strange. Without buffering, tellg() reports 8192 bytes. This is basically the same behaviour as with G4W bash, except that there tellg() reports 170,512 bytes and 126,976 bytes respectively.

> 
> We’ve only tested non-Cygwin consumers. When the author of the above GitHub issue comment was testing on Cygwin he compiled the consumer with mingw-w64 g++.
> 

If Cygwin consumers also see this changed seekg behaviour then they will have problems too.
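
One possible way for a consumer to avoid relying on seekg() success is
to check what kind of handle stdin is, and to buffer unless it is a
regular file. The following is a minimal sketch under that assumption.
fd_is_regular_file() is a hypothetical helper name, and only the POSIX
branch has been considered in detail here.

```cpp
#include <cassert>
#ifdef _WIN32
#include <io.h>        // _get_osfhandle()
#include <windows.h>   // GetFileType(), FILE_TYPE_DISK
#else
#include <sys/stat.h>  // fstat(), S_ISREG
#include <unistd.h>    // pipe(), close()
#endif

// Hypothetical helper: true only when `fd` refers to a regular on-disk
// file. Pipes, consoles and sockets all yield false, whatever a seek
// on them happens to return.
bool fd_is_regular_file(int fd) {
    assert(fd >= 0);
#ifdef _WIN32
    HANDLE h = reinterpret_cast<HANDLE>(_get_osfhandle(fd));
    return h != INVALID_HANDLE_VALUE && GetFileType(h) == FILE_TYPE_DISK;
#else
    struct stat st;
    return fstat(fd, &st) == 0 && S_ISREG(st.st_mode);
#endif
}
```

With this, a program would seek directly on std::cin only when
fd_is_regular_file(0) is true, and take the buffering path otherwise.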

Regards

    -Mark

