* Wrong file position after writing 65537 bytes to block device
From: Ivan Kozik @ 2017-12-16  6:13 UTC
To: cygwin

Hello,

I have discovered that if you write 65536 + 1 bytes to a block device
in cygwin, the file position can become 65536 + 512.

With /dev/sdc as a throwaway USB block device:

(cygwin_write.c is pasted below)
# gcc -O2 -Wall -o cygwin_write cygwin_write.c
# ./cygwin_write /dev/sdc
66048

I am running 64-bit cygwin 2.9.0 on an updated Windows 8.1.  I saw the
same results with an 8TB drive and a 512MB USB stick.

Best regards,

Ivan

cygwin_write.c:

#include <stdio.h>

int main(int argc, char *argv[]) {
    FILE *f = fopen(argv[1], "w");
    char x[65536 + 1];
    fwrite(x, 1, 65536 + 1, f);
    printf("%ld", ftell(f));
    return 0;
}

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

* Re: Wrong file position after writing 65537 bytes to block device
From: Corinna Vinschen @ 2017-12-18 16:32 UTC
To: cygwin

On Dec 16 02:07, Ivan Kozik wrote:
> Hello,
>
> I have discovered that if you write 65536 + 1 bytes to a block device
> in cygwin, the file position can become 65536 + 512.
>
> With /dev/sdc as a throwaway USB block device:
>
> (cygwin_write.c is pasted below)
> # gcc -O2 -Wall -o cygwin_write cygwin_write.c
> # ./cygwin_write /dev/sdc
> 66048
>
> I am running 64-bit cygwin 2.9.0 on an updated Windows 8.1.  I saw the
> same results with an 8TB drive and a 512MB USB stick.

In general, the writes on disk devices are sector-oriented.  However,
in this case ftell should have returned 65536.  The problem here is
that the newlib implementation of ftell/ftello performs an fflush
when called on a write stream since about 2008 to adjust for appending
streams.  Given your example (thanks for the testcase!) this seems
pretty wrong.  Looking further it turns out that neither glibc nor BSD
actually calls fflush in this case.  There's only a special case for
appending streams, but this calls lseek, not fflush.

Looks like a patch is required.  Stay tuned.

Thanks,
Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

* Re: Wrong file position after writing 65537 bytes to block device
From: Steven Penny @ 2017-12-19  5:03 UTC
To: cygwin

On Mon, 18 Dec 2017 14:10:35, Corinna Vinschen wrote:
> In general, the writes on disk devices are sector-oriented.  However,
> in this case ftell should have returned 65536.  The problem here is
> that the newlib implementation of ftell/ftello performs an fflush
> when called on a write stream since about 2008 to adjust for appending
> streams.  Given your example (thanks for the testcase!) this seems
> pretty wrong.  Looking further it turns out that neither glibc nor BSD
> actually calls fflush in this case.  There's only a special case for
> appending streams, but this calls lseek, not fflush.
>
> Looks like a patch is required.  Stay tuned.

is it though? he says "write 65536 + 1 bytes", but as far as i can tell, you
can't do that. quoting myself:

> Seeking, reading and writing must all be done in multiples of sector size, in
> my case 512 bytes

http://web.archive.org/web/stackoverflow.com/questions/37228874/how-to-fwrite-to-removable-volume

so it would make sense that the position becomes "65536 + 512"

* Re: Wrong file position after writing 65537 bytes to block device
From: Corinna Vinschen @ 2017-12-19 10:20 UTC
To: cygwin

On Dec 18 16:27, Steven Penny wrote:
> On Mon, 18 Dec 2017 14:10:35, Corinna Vinschen wrote:
> > In general, the writes on disk devices are sector-oriented.  However,
> > in this case ftell should have returned 65536.  The problem here is
> > that the newlib implementation of ftell/ftello performs an fflush
> > when called on a write stream since about 2008 to adjust for appending
> > streams.  Given your example (thanks for the testcase!) this seems
> > pretty wrong.  Looking further it turns out that neither glibc nor BSD
> > actually calls fflush in this case.  There's only a special case for
> > appending streams, but this calls lseek, not fflush.
> >
> > Looks like a patch is required.  Stay tuned.
>
> is it though? he says "write 65536 + 1 bytes", but as far as i can tell, you
> cant do that. quoting myself:
>
> > Seeking, reading and writing must all be done in multiples of sector size, in
> > my case 512 bytes
>
> http://web.archive.org/web/stackoverflow.com/questions/37228874/how-to-fwrite-to-removable-volume
>
> so it would make sense that the position becomes "65536 + 512"

Neither glibc nor FreeBSD show this behaviour.  Keep in mind that stdio
is designed for buffered I/O.  What should happen, basically, is that a
multiple of the stdio buffer size is written and the remainder is kept in
the stdio buffer:

  fwrite(65537)
    -> write(65536)
    -> store 1 byte in FILE._buf

ftell calls lseek, which returns 65536.  It adds the number of bytes
still in the buffer, so it should return 65537.  Further fwrites
seamlessly append to the bytes already written, as expected.  ftell
calling fflush and thus setting the current file position to the next
sector boundary breaks this expectation.

I pushed a patch yesterday and uploaded new developer snapshots to
https://cygwin.com/snapshots/

Please test.

Thanks,
Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
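
The arithmetic Corinna describes can be followed in a small toy model. This is only a sketch in Python (the language of the snippet posted later in the thread), not newlib's code: the class name `ToyStream`, its fields, and the 65536-byte buffer size are all assumptions made for illustration.

```python
class ToyStream:
    """Toy model of a buffered write stream; NOT newlib's implementation."""

    def __init__(self, bufsize=65536):
        self.bufsize = bufsize  # assumed stdio buffer size
        self.fd_pos = 0         # what lseek(fd, 0, SEEK_CUR) would report
        self.buffered = 0       # bytes still held in the stream buffer

    def fwrite(self, nbytes):
        total = self.buffered + nbytes
        # Whole buffers are handed to write(); the remainder stays buffered.
        flushed = total - total % self.bufsize
        self.fd_pos += flushed
        self.buffered = total - flushed

    def ftell(self):
        # Descriptor position plus the bytes not yet written out.
        return self.fd_pos + self.buffered


s = ToyStream()
s.fwrite(65536 + 1)
print(s.fd_pos, s.buffered, s.ftell())  # 65536 1 65537
```

In this model ftell never flushes, so the descriptor position stays at 65536 and the reported position is the byte-exact 65537.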

* Re: Wrong file position after writing 65537 bytes to block device
From: Ivan Kozik @ 2017-12-19 16:36 UTC
To: cygwin

On Tue, Dec 19, 2017 at 9:14 AM, Corinna Vinschen
<corinna-cygwin@cygwin.com> wrote:
> Neither glibc nor FreeBSD show this behaviour.  Keep in mind that stdio
> is designed for buffered I/O.  What should happen, basically, is that a
> multiple of the stdio buffer size is written and the remainder is kept in
> the stdio buffer:
>
>   fwrite(65537)
>     -> write(65536)
>     -> store 1 byte in FILE._buf
>
> ftell calls lseek, which returns 65536.  It adds the number of bytes
> still in the buffer, so it should return 65537.  Further fwrites
> seamlessly append to the bytes already written, as expected.  ftell
> calling fflush and thus setting the current file position to the next
> sector boundary breaks this expectation.
>
> I pushed a patch yesterday and uploaded new developer snapshots to
> https://cygwin.com/snapshots/
>
> Please test.

Thanks, I can confirm that the 2017-12-18 snapshot fixed the test
program I posted.

What about the harder case where the program calls fflush, though?

#include <stdio.h>

int main(int argc, char *argv[]) {
    FILE *f = fopen(argv[1], "w");
    char x[65536 + 1];
    fwrite(x, 1, 65536 + 1, f);
    fflush(f);
    printf("%ld", ftell(f));
    return 0;
}

cygwin reports 66048, while Linux reports 65537.  In cygwin, if such a
write is done in a loop, for example, you can get garbled output on
disk.

fflush can be visibly unnecessary when done from C, but python3 (where
I originally observed the problem) appears to do implicit flushing.

If this is annoying to fix and I am the only one who notices, please
don't worry about it; I can just write in proper block sizes to block
devices.

Best regards,

Ivan
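
One way the garbling in a loop could come about is sketched below. It assumes, purely for illustration, that the position is rounded up to the next 512-byte sector after every flushed write; the helper names `round_up` and `write_offsets` are hypothetical and the sketch is not taken from Cygwin.

```python
SECTOR = 512  # sector size assumed for illustration


def round_up(n, unit=SECTOR):
    """Round n up to the next multiple of unit."""
    return -(-n // unit) * unit


def write_offsets(chunk_sizes, sector_rounded):
    """Return the offset at which each chunk would start."""
    pos, offsets = 0, []
    for size in chunk_sizes:
        offsets.append(pos)
        pos += size
        if sector_rounded:
            # Modelled misbehaviour: position jumps to the next sector.
            pos = round_up(pos)
    return offsets


print(write_offsets([65537, 65537], sector_rounded=False))  # [0, 65537]
print(write_offsets([65537, 65537], sector_rounded=True))   # [0, 66048]
```

Under that assumption the second chunk starts 511 bytes too late, leaving a gap between the two chunks on disk instead of a contiguous stream.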

* Re: Wrong file position after writing 65537 bytes to block device
From: Eric Blake @ 2017-12-19 17:43 UTC
To: cygwin

On 12/19/2017 09:46 AM, Ivan Kozik wrote:

> Thanks, I can confirm that the 2017-12-18 snapshot fixed the test
> program I posted.
>
> What about the harder case where the program calls fflush, though?
>
> #include <stdio.h>
>
> int main(int argc, char *argv[]) {
>     FILE *f = fopen(argv[1], "w");
>     char x[65536 + 1];
>     fwrite(x, 1, 65536 + 1, f);
>     fflush(f);
>     printf("%ld", ftell(f));

Can block devices report an unaligned offset to lseek()?  If not, then
when writing an unaligned value to a block device, don't we have to do a
read-modify-write of the larger aligned cluster, and then put lseek()
back to the unaligned boundary, and have extra magic in ftell() to track
that we are at an unaligned position within the block device?  But that
sounds like a lot of nasty overhead; it would be better to make sure
that block devices can report unaligned lseek() locations (caveat: I
haven't tested what Linux does in that regard).

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

* Re: Wrong file position after writing 65537 bytes to block device
From: Ivan Kozik @ 2017-12-19 18:18 UTC
To: cygwin

On Tue, Dec 19, 2017 at 4:13 PM, Eric Blake <eblake@redhat.com> wrote:
> Can block devices report an unaligned offset to lseek()?  If not, then when
> writing an unaligned value to a block device, don't we have to do a
> read-modify-write of the larger aligned cluster, and then put lseek() back
> to the unaligned boundary, and have extra magic in ftell() to track that we
> are at an unaligned position within the block device?  But that sounds like
> a lot of nasty overhead; and that it would be better to make sure that block
> devices can report unaligned lseek() locations (caveat: I haven't tested
> what Linux does in that regards).

From what I observe on Linux, it supports writing at any offset to the
block device because it does a read-modify-write behind the scenes,
with accompanying nasty overhead (e.g. writes going at 64MB/s instead
of an "expected" 180MB/s).

I think you can observe this behavior on Linux by piping this
program's stdout to a block device (note: must be python3, not
python2):

#!/usr/bin/python3

import sys

block = b" " * 4096

while True:
    sys.stdout.buffer.write(block)
    sys.stdout.buffer.write(b" ")

and watching the block device activity with `dstat -d -D sdN` - you
should see a lot of reads.

Ivan

* Re: Wrong file position after writing 65537 bytes to block device
From: Corinna Vinschen @ 2017-12-19 18:36 UTC
To: cygwin

On Dec 19 16:35, Ivan Kozik wrote:
> On Tue, Dec 19, 2017 at 4:13 PM, Eric Blake <eblake@redhat.com> wrote:
> > Can block devices report an unaligned offset to lseek()?  If not, then when
> > writing an unaligned value to a block device, don't we have to do a
> > read-modify-write of the larger aligned cluster, and then put lseek() back
> > to the unaligned boundary, and have extra magic in ftell() to track that we
> > are at an unaligned position within the block device?  But that sounds like
> > a lot of nasty overhead; and that it would be better to make sure that block
> > devices can report unaligned lseek() locations (caveat: I haven't tested
> > what Linux does in that regards).
>
> From what I observe on Linux, it supports writing at any offset to the
> block device because it does a read-modify-write behind the scenes,
> with accompanying nasty overhead (e.g. writes going at 64MB/s instead
> of an "expected" 180MB/s).

That's what Cygwin was trying to emulate as well.  Debugging pointed out
that it only works for reading, not for writing, because the latter
neglected to fix up buffer pointers.  Those are used in lseek to report
the Linux-like byte-exact file position.

I pushed a patch and uploaded new developer snapshots to
https://cygwin.com/snapshts/

Please give them a test.

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
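
The read-modify-write emulation discussed above can be pictured with a toy device that only moves whole sectors but keeps a byte-exact position on top. This is a sketch under stated assumptions, not Cygwin's or Linux's implementation: `ToyBlockDevice`, its fields, and the 512-byte sector size are invented for illustration.

```python
SECTOR = 512  # sector size assumed for illustration


class ToyBlockDevice:
    """Sector-only storage with a byte-exact position layered on top."""

    def __init__(self, nsectors):
        self.disk = bytearray(nsectors * SECTOR)
        self.pos = 0            # byte-exact position, as lseek would report
        self.sectors_read = 0   # extra reads caused by unaligned writes

    def write(self, data):
        end_byte = self.pos + len(data)
        if end_byte > len(self.disk):
            raise ValueError("write past end of toy device")
        start = self.pos - self.pos % SECTOR      # round down to a sector
        end = -(-end_byte // SECTOR) * SECTOR     # round up to a sector
        window = bytearray(self.disk[start:end])  # read the whole sectors
        self.sectors_read += (end - start) // SECTOR
        offset = self.pos - start
        window[offset:offset + len(data)] = data  # modify in memory
        self.disk[start:end] = window             # write whole sectors back
        self.pos = end_byte                       # NOT rounded to a sector
        return len(data)


dev = ToyBlockDevice(nsectors=256)
dev.write(b"x" * 65537)
dev.write(b"y")
print(dev.pos, bytes(dev.disk[65535:65539]))  # 65538 b'xxy\x00'
```

The point of the sketch is the last assignment in `write`: the position advances by the number of bytes written, so a following write lands directly after the previous one. For simplicity the toy reads every sector it touches, including ones that are fully overwritten; a real implementation would not need to read those.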

* Re: Wrong file position after writing 65537 bytes to block device
From: Ivan Kozik @ 2017-12-20  5:47 UTC
To: cygwin

On Tue, Dec 19, 2017 at 6:19 PM, Corinna Vinschen
<corinna-cygwin@cygwin.com> wrote:
> On Dec 19 16:35, Ivan Kozik wrote:
>> From what I observe on Linux, it supports writing at any offset to the
>> block device because it does a read-modify-write behind the scenes,
>> with accompanying nasty overhead (e.g. writes going at 64MB/s instead
>> of an "expected" 180MB/s).
>
> That's what Cygwin was trying to emulate as well.  Debugging pointed out
> that it only works for reading, not for writing, because the latter
> neglected to fix up buffer pointers.  Those are used in lseek to report
> the Linux-like byte-exact file position.
>
> I pushed a patch and uploaded new developer snapshots to
> https://cygwin.com/snapshts/
>
> Please give them a test.

Hi Corinna,

It is writing correctly now, thank you for the fix!

Ivan

* Re: Wrong file position after writing 65537 bytes to block device
From: Kaz Kylheku @ 2017-12-20 23:59 UTC
To: Cygwin

On 18.12.2017 16:27, Steven Penny wrote:
> On Mon, 18 Dec 2017 14:10:35, Corinna Vinschen wrote:
>> In general, the writes on disk devices are sector-oriented.  However,
>> in this case ftell should have returned 65536.  The problem here is
>> that the newlib implementation of ftell/ftello performs an fflush
>> when called on a write stream since about 2008 to adjust for appending
>> streams.  Given your example (thanks for the testcase!) this seems
>> pretty wrong.  Looking further it turns out that neither glibc nor BSD
>> actually calls fflush in this case.  There's only a special case for
>> appending streams, but this calls lseek, not fflush.
>>
>> Looks like a patch is required.  Stay tuned.
>
> is it though? he says "write 65536 + 1 bytes", but as far as i can
> tell, you
> cant do that. quoting myself:
>
>> Seeking, reading and writing must all be done in multiples of sector
>> size, in
>> my case 512 bytes
>
> http://web.archive.org/web/stackoverflow.com/questions/37228874/how-to-fwrite-to-removable-volume
>
> so it would make sense that the position becomes "65536 + 512"

You can do that on a "block" device. It's "raw" devices that have
transfer unit restrictions.

A block device creates an abstraction over a disk, dividing it into
blocks. Those blocks are not related to the underlying sector size;
they could be larger (e.g. 4096 byte block size, versus 512 byte
sectors) or even smaller (e.g. 4096 byte block size, versus 65536
byte flash erase block size).

Unix block devices let you read, write and seek using byte offsets
and sizes. The bytes which are affected by a write operation map to
one or more ranges in one or more blocks. All of the blocks have to
be read into memory (if they aren't already). The bytes are updated,
and then the blocks are marked dirty and written out (eventually).
More changes can take place before that happens.

So for instance if we have a block device (4096 bytes) over a flash
device with 64 kB erase blocks, we can write just one byte somewhere
in a block. When this change is flushed, the entire erase block has
to be erased and rewritten.

Because of the abstract nature of block devices, it's largely
pointless to use the "dd" utility; you can use "cp" to copy them.
"dd" is required when you need to control the exact size of the read
and write calls. Thus "cat /dev/zero > /dev/blockdevice" has the same
effect as "dd if=/dev/zero of=/dev/blockdevice".
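
The mapping from a byte range to the blocks it touches, and the resulting rewrite cost in Kaz's flash example, can be worked through with a few lines of arithmetic. The helper `touched_units` and the offset 70000 are hypothetical, chosen only to illustrate the 4096-byte block and 64 kB erase-block sizes mentioned above.

```python
def touched_units(offset, length, unit):
    """Indices of the fixed-size units overlapped by [offset, offset+length)."""
    if length <= 0:
        return range(0)
    first = offset // unit
    last = (offset + length - 1) // unit
    return range(first, last + 1)


# A one-byte write at an arbitrary (assumed) offset of 70000.
blocks = touched_units(70000, 1, 4096)      # 4096-byte device blocks
erase = touched_units(70000, 1, 65536)      # 64 kB flash erase blocks

print(list(blocks), len(blocks) * 4096)     # [17] 4096
print(list(erase), len(erase) * 65536)      # [1] 65536
```

So in this example a single changed byte dirties one 4096-byte block, and flushing it means rewriting one full 65536-byte erase block.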