public inbox for cygwin@cygwin.com
* Re: bash-3.1-7 BUG
@ 2006-09-13  4:38 Eric Blake
  2006-09-13  5:25 ` bash-3.1-7 BUG Christopher Faylor
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2006-09-13  4:38 UTC (permalink / raw)
  To: cygwin

> At any rate, thanks for narrowing down your application
> to a smaller test case; I'll see what I can find with it.

Here's something interesting in the strace:
   30  518741 [main] bash 2084 readv: readv (255, 0x22C060, 1) blocking, sigcatchers 1
   30  518771 [main] bash 2084 readv: no need to call ready_for_read
   34  518805 [main] bash 2084 fhandler_base::read: read 0 bytes ()
   29  518834 [main] bash 2084 fhandler_base::read: returning 135, text mode
...
   33 1682871 [main] bash 2084 fhandler_base::lseek: lseek (/home/eblake/text/zzz.sh, -103, 1)
   32 1682903 [main] bash 2084 fhandler_base::lseek: setting file pointer to 4294967295 (high), 4294967193 (low)
   34 1682937 [main] bash 2084 lseek64: 39 = lseek (255, -103, 1)

Seems like a text mode lseek bug (no surprise there, since cgf
predicted that there are still some corner cases).  The file is 142
bytes, but has seven \r\n pairs, so the readv correctly returned 135
characters read.  But the lseek back to the start of the third line
temporarily set the file pointer to -3, then settled on an offset of 39
(39 + 103 gives 142 bytes, as if in binary mode); the offset really
should have been 32 characters in (byte 34 is the start of the third
line, so two \r would be skipped at that point).  I'm still not sure if
it is bash's fault or cygwin's.  But the file definitely was read in
text mode when it resided in a text mount.
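
The offset arithmetic above can be sketched in C.  This is only an
illustration: text_chars_before is a hypothetical helper, not a Cygwin or
bash function.  It converts a byte offset into the character count that a
text-mode read would have reported, by subtracting every \r\n pair that
lies wholly before that offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Cygwin or bash): number of characters
   a text-mode read would have returned for the first byte_off bytes of
   buf, i.e. byte_off minus the \r\n pairs that end before byte_off.  */
static size_t
text_chars_before (const char *buf, size_t len, size_t byte_off)
{
  size_t pairs = 0;

  assert (byte_off <= len);
  for (size_t i = 0; i + 1 < byte_off; i++)
    if (buf[i] == '\r' && buf[i + 1] == '\n')
      pairs++;
  return byte_off - pairs;
}
```

For a buffer whose first two lines are 17 bytes each including their \r\n,
the third line starts at byte 34 and the helper reports 32, which matches
the figures in the report.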

-- 
Eric Blake

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

* Re: bash-3.1-7 BUG
  2006-09-13  4:38 bash-3.1-7 BUG Eric Blake
@ 2006-09-13  5:25 ` Christopher Faylor
  2006-09-13 14:33   ` bash-3.1-7 BUG Eric Blake
  0 siblings, 1 reply; 25+ messages in thread
From: Christopher Faylor @ 2006-09-13  5:25 UTC (permalink / raw)
  To: cygwin

On Wed, Sep 13, 2006 at 04:38:33AM +0000, Eric Blake wrote:
>> At any rate, thanks for narrowing down your application
>> to a smaller test case; I'll see what I can find with it.
>
>Here's something interesting in the strace:
>   30  518741 [main] bash 2084 readv: readv (255, 0x22C060, 1) blocking, sigcatchers 1
>   30  518771 [main] bash 2084 readv: no need to call ready_for_read
>   34  518805 [main] bash 2084 fhandler_base::read: read 0 bytes ()
>   29  518834 [main] bash 2084 fhandler_base::read: returning 135, text mode
>...
>   33 1682871 [main] bash 2084 fhandler_base::lseek: lseek (/home/eblake/text/zzz.sh, -103, 1)
>   32 1682903 [main] bash 2084 fhandler_base::lseek: setting file pointer to 4294967295 (high), 4294967193 (low)
>   34 1682937 [main] bash 2084 lseek64: 39 = lseek (255, -103, 1)
>
>Seems like a text mode lseek bug (no surprise there, since cgf
>predicted that there are still some corner cases).  The file is 142
>bytes, but has seven \r\n pairs, so the readv correctly returned 135
>characters read.  But the lseek back to the start of the third line
>temporarily set the file pointer to -3,

(actually -103, 4294967193U == -103)
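
A small C sketch of that reinterpretation, for readers who want to check
the arithmetic.  The name low_word_as_signed is made up for illustration;
it treats a 32-bit word as a two's-complement value without relying on
implementation-defined conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit word as a two's-complement signed
   value.  Words above INT32_MAX stand for negatives, so subtract 2^32.  */
static int64_t
low_word_as_signed (uint32_t low)
{
  if (low > (uint32_t) INT32_MAX)
    return (int64_t) low - ((int64_t) 1 << 32);
  return (int64_t) low;
}

/* Self-check against the value printed in the strace.  */
static void
check_trace_value (void)
{
  assert (low_word_as_signed (4294967193u) == -103);
}
```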

>then settled on an offset of 39 (39 + 103 gives 142 bytes, as if in
>binary mode); the offset really should have been 32 characters in (byte
>34 is the start of the third line, so two \r would be skipped at that
>point).  I'm still not sure if it is bash's fault or cygwin's.  But the
>file definitely was read in text mode when it resided in a text mount.

Is bash assuming that it can read N characters and then subtract M
characters from the current position to get back to the beginning of a
line?  If so, hmm.  I guess this explains why it was reading a byte at a
time before.  It must be counting characters rather than calling lseek
to figure out where it is.

Fixing this is rarely as simple as just calling lseek, though, of
course.

cgf

* Re: bash-3.1-7 BUG
  2006-09-13  5:25 ` bash-3.1-7 BUG Christopher Faylor
@ 2006-09-13 14:33   ` Eric Blake
  2006-09-13 20:07     ` bash-3.1-7 BUG Shankar Unni
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2006-09-13 14:33 UTC (permalink / raw)
  To: cygwin

Christopher Faylor <cgf-no-personal-reply-please <at> cygwin.com> writes:

> Is bash assuming that it can read N characters and then subtract M
> characters from the current position to get back to the beginning of a
> line?  If so, hmm.  I guess this explains why it was reading a byte at a
> time before.  It must be counting characters rather than calling lseek
> to figure out where it is.

Yes, indeed, and it seems like reasonable semantics to expect as well
(never mind that it means that text mode on a seekable file involves a lot more
processing, to consistently present the user with character count instead of
byte offset).  When a file is seekable, bash reads a buffer at a time for
speed, but then must reseek to the offset where it last processed input before 
invoking any subprocesses, since POSIX requires that seekable files be left in 
a consistent state when swapping between multiple handles to the same 
underlying file description (even if the multiple handles exist in separate 
processes).  When using stdio (such as fread and fseek), this works due to code 
in newlib (see __SCLE in stdio.h).  But bash uses low-level Unix I/O, and does 
not benefit from newlib's approach.  In a binary mount, seeking backwards by 
the character offset from where bash has processed to the end of the buffer it 
has read just works.  It is only in a text mount where having lseek report the 
binary offset within the file, rather than the character offset, is causing 
problems.  So I will probably end up reinstating a form of the previous #ifdef 
__CYGWIN__ check for is_seekable in bash 3.1-8 to check whether a file is in
text mode, in which case it is non-seekable; that is certainly a faster 
solution than waiting for cygwin to make a change for lseek on a text file to 
consistently use a character offset.  But I intend that on binary files, \r\n 
line endings will treat the \r as part of the line, so at least binary mounts 
won't suffer from the speed impact of treating a file as unseekable the way 
bash 3.1-6 does.
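
As a rough sketch of the buffer-then-reseek pattern described above (this
is not bash's actual code; sync_fd_to_consumed and demo_resync are
hypothetical names), the descriptor is moved back by the number of bytes
that were read but not yet consumed.  On a binary-mode file the byte count
and the character count agree, so the relative seek lands where expected.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: after reading nread bytes and consuming only nused
   of them, seek back so the offset sits just past the last consumed byte.
   Returns the new offset, or (off_t) -1 on error (e.g. on a pipe).  */
static off_t
sync_fd_to_consumed (int fd, size_t nread, size_t nused)
{
  assert (nused <= nread);
  return lseek (fd, -(off_t) (nread - nused), SEEK_CUR);
}

/* Demo on a scratch file: buffer 10 bytes, consume 4, then resync.
   Returns the offset after resyncing, or -1 if any step failed.  */
static off_t
demo_resync (void)
{
  char buf[10];
  off_t pos = -1;
  FILE *tmp = tmpfile ();

  if (tmp == NULL)
    return -1;
  int fd = fileno (tmp);
  if (write (fd, "0123456789", 10) == 10
      && lseek (fd, 0, SEEK_SET) == 0
      && read (fd, buf, sizeof buf) == 10)
    pos = sync_fd_to_consumed (fd, 10, 4);
  fclose (tmp);
  return pos;
}
```

The same relative seek goes wrong on a text-mode descriptor when nread is
a character count but lseek works in bytes, which is the mismatch under
discussion.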

-- 
Eric Blake



* Re: bash-3.1-7 BUG
  2006-09-13 14:33   ` bash-3.1-7 BUG Eric Blake
@ 2006-09-13 20:07     ` Shankar Unni
  2006-09-13 20:37       ` bash-3.1-7 BUG mwoehlke
  0 siblings, 1 reply; 25+ messages in thread
From: Shankar Unni @ 2006-09-13 20:07 UTC (permalink / raw)
  To: cygwin

Eric Blake wrote:

> But I intend that on binary files, \r\n 
> line endings will treat the \r as part of the line, so at least binary mounts 
> won't suffer from the speed impact of treating a file as unseekable the way 
> bash 3.1-6 does.

Would it be possible to do this dynamically (instead of keying off of 
mounts, etc.): if the first line of the file read by bash has a \r\n, 
use text-mode (1-char-at-a-time) semantics, else use binary semantics 
(lseek)?



* Re: bash-3.1-7 BUG
  2006-09-13 20:07     ` bash-3.1-7 BUG Shankar Unni
@ 2006-09-13 20:37       ` mwoehlke
  2006-09-13 21:48         ` bash-3.1-7 BUG Eric Blake
  0 siblings, 1 reply; 25+ messages in thread
From: mwoehlke @ 2006-09-13 20:37 UTC (permalink / raw)
  To: cygwin

Shankar Unni wrote:
> Eric Blake wrote:
>> But I intend that on binary files, \r\n line endings will treat the \r 
>> as part of the line, so at least binary mounts won't suffer from the 
>> speed impact of treating a file as unseekable the way bash 3.1-6 does.
> 
> Would it be possible to do this dynamically (instead of keying off of 
> mounts, etc.): if the first line of the file read by bash has a \r\n, 
> use text-mode (1-char-at-a-time) semantics, else use binary semantics 
> (lseek)?

I hate to say this, but... if bash goes this route, could it be a shopt? 
I would rather know that my scripts are broken (DOS-format).

-- 
Matthew
62% of all statistics are made up on the spot.


* Re: bash-3.1-7 BUG
  2006-09-13 20:37       ` bash-3.1-7 BUG mwoehlke
@ 2006-09-13 21:48         ` Eric Blake
  2006-09-13 22:08           ` bash-3.1-7 BUG mwoehlke
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2006-09-13 21:48 UTC (permalink / raw)
  To: cygwin

mwoehlke <mwoehlke <at> tibco.com> writes:

> > Would it be possible to do this dynamically (instead of keying off of 
> > mounts, etc.): if the first line of the file read by bash has a \r\n, 
> > use text-mode (1-char-at-a-time) semantics, else use binary semantics 
> > (lseek)?
> 
> I hate to say this, but... if bash goes this route, could it be a shopt? 
> I would rather know that my scripts are broken (DOS-format).
> 

Thanks for the ideas; here's what I'll try.  Bash does indeed already scan the 
first line (I'm not sure if it is line or first 80 characters or what it is 
exactly, but I do know it scans) to see if it detects any NUL bytes, at which 
point it complains the file is binary and not a script.  So I can probably hack 
that scan to also look for \r.  So first I will open the file according to the 
mount point rules.  If the file is text mode, perform the scan in binary mode, 
and if any \r is seen, revert to text mode and no lseeks.  If the scan in 
binary mode succeeds, then leave the file in binary mode, assuming that the 
file is unix format even though it is on a text mount, and that lseeks will 
work.  If the file starts life binary mode (ie. was on a binary mount), skip 
the check for \r in the scan (under the assumption that on a binary mount, \r 
is intentional and not a line ending to be collapsed), and use lseeks.  No 
guarantees on whether this will pan out, or be bigger than I thought, but 
hopefully you will see a bash 3.1-8 with these semantics soon.
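
A minimal sketch of what such an extended scan might look like, assuming
the caller has already read the leading bytes of the script in binary
mode.  The enum and classify_prefix are hypothetical names, not bash's
existing check, and how many bytes bash really inspects is left to the
caller here.

```c
#include <assert.h>
#include <stddef.h>

enum prefix_kind { PREFIX_UNIX, PREFIX_HAS_CR, PREFIX_BINARY };

/* Hypothetical scan of the first len bytes of a script: a NUL byte means
   "binary, not a script"; otherwise any \r means DOS line endings, so the
   caller would fall back to text mode without lseeks.  */
static enum prefix_kind
classify_prefix (const unsigned char *buf, size_t len)
{
  enum prefix_kind kind = PREFIX_UNIX;

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    {
      if (buf[i] == '\0')
        return PREFIX_BINARY;
      if (buf[i] == '\r')
        kind = PREFIX_HAS_CR;
    }
  return kind;
}
```

On a binary mount the \r test would simply be skipped, per the plan
above, so only the NUL check would apply there.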

-- 
Eric Blake



* Re: bash-3.1-7 BUG
  2006-09-13 21:48         ` bash-3.1-7 BUG Eric Blake
@ 2006-09-13 22:08           ` mwoehlke
  2006-09-13 23:46             ` bash-3.1-7 BUG Volker Quetschke
  0 siblings, 1 reply; 25+ messages in thread
From: mwoehlke @ 2006-09-13 22:08 UTC (permalink / raw)
  To: cygwin

Eric Blake wrote:
> mwoehlke <mwoehlke <at> tibco.com> writes:
> 
>>> Would it be possible to do this dynamically (instead of keying off of 
>>> mounts, etc.): if the first line of the file read by bash has a \r\n, 
>>> use text-mode (1-char-at-a-time) semantics, else use binary semantics 
>>> (lseek)?
>> I hate to say this, but... if bash goes this route, could it be a shopt? 
>> I would rather know that my scripts are broken (DOS-format).
>>
> 
> Thanks for the ideas; here's what I'll try.  Bash does indeed already scan the 
> first line (I'm not sure if it is line or first 80 characters or what it is 
> exactly, but I do know it scans) to see if it detects any NUL bytes, at which 
> point it complains the file is binary and not a script.  So I can probably hack 
> that scan to also look for \r.  So first I will open the file according to the 
> mount point rules.  If the file is text mode, perform the scan in binary mode, 
> and if any \r is seen, revert to text mode and no lseeks.  If the scan in 
> binary mode succeeds, then leave the file in binary mode, assuming that the 
> file is unix format even though it is on a text mount, and that lseeks will 
> work.  If the file starts life binary mode (ie. was on a binary mount), skip 
> the check for \r in the scan (under the assumption that on a binary mount, \r 
> is intentional and not a line ending to be collapsed), and use lseeks.  No 
> guarantees on whether this will pan out, or be bigger than I thought, but 
> hopefully you will see a bash 3.1-8 with these semantics soon.

Sounds good! That will satisfy my request to not silently work on files 
that should be broken. :-)

Alternatively (and I kind-of hate to say this :-)), now that I think of 
it, you might want to talk to Rodney over at the Interix forums. I 
recall hearing that the Interix version of bash actually handles files 
with a mix of DOS and UNIX line endings (which may not be the best thing 
to do, but might be worth investigating). I would imagine that version 
is always reading in binary (I don't think Interix - like UNIX, but 
unlike Cygwin - ever had a 'text mode' concept). There might even be an 
official patch for this, that just needs to be flipped on for Cygwin (or 
maybe the two of you can petition to make it an official patch).

-- 
Matthew
41% of all statistics are made up on the spot.


* Re: bash-3.1-7 BUG
  2006-09-13 22:08           ` bash-3.1-7 BUG mwoehlke
@ 2006-09-13 23:46             ` Volker Quetschke
  2006-09-13 23:58               ` bash-3.1-7 BUG David Rothenberger
  2006-09-14  0:19               ` bash-3.1-7 BUG Christopher Faylor
  0 siblings, 2 replies; 25+ messages in thread
From: Volker Quetschke @ 2006-09-13 23:46 UTC (permalink / raw)
  To: cygwin

Hi,

mwoehlke wrote:
> Eric Blake wrote:
>> mwoehlke <mwoehlke <at> tibco.com> writes:
(snip)
>> ...  If the scan in binary mode
>> succeeds, then leave the file in binary mode, assuming that the file
>> is unix format even though it is on a text mount, and that lseeks will
>> work.  If the file starts life binary mode (ie. was on a binary
>> mount), skip the check for \r in the scan (under the assumption that
>> on a binary mount, \r is intentional and not a line ending to be
>> collapsed), and use lseeks.  No guarantees on whether this will pan
>> out, or be bigger than I thought, but hopefully you will see a bash
>> 3.1-8 with these semantics soon.
> 
> Sounds good! That will satisfy my request to not silently work on files
> that should be broken. :-)

I'm seeing the next "make doesn't work anymore with DOS ... feature" coming
up here, only that it is bash this time. Apparently a lot of people use
tools from cygwin to build Windows stuff in a *NIX build environment.

Not many people that just "use" the tools read this ml and therefore
test if their favorite application still builds. It is definitely in the
eye of the beholder if one calls shell scripts that worked so far as broken
just because they have \r\n line endings.

I'll try to build my favorite testcase, OpenOffice.org, to see if there are some
shell scripts with "broken" line endings hidden in this 500MB source code monster.

On a separate note, both gcc and Microsoft's cl.exe don't care about
the line endings, and neither does tcsh, so why should bash start to punish the
offenders?

Volker

-- 
PGP/GPG key  (ID: 0x9F8A785D)  available  from  wwwkeys.de.pgp.net
key-fingerprint 550D F17E B082 A3E9 F913  9E53 3D35 C9BA 9F8A 785D


* Re: bash-3.1-7 BUG
  2006-09-13 23:46             ` bash-3.1-7 BUG Volker Quetschke
@ 2006-09-13 23:58               ` David Rothenberger
  2006-09-14  0:30                 ` bash-3.1-7 BUG Larry Hall (Cygwin)
  2006-09-14  0:19               ` bash-3.1-7 BUG Christopher Faylor
  1 sibling, 1 reply; 25+ messages in thread
From: David Rothenberger @ 2006-09-13 23:58 UTC (permalink / raw)
  To: cygwin

On 9/13/2006 4:46 PM, Volker Quetschke wrote:
> mwoehlke wrote:
>> Eric Blake wrote:
>>> mwoehlke <mwoehlke <at> tibco.com> writes:
> (snip)
>>> ... If the file starts life binary mode (ie. was on a binary
>>> mount), skip the check for \r in the scan (under the assumption that
>>> on a binary mount, \r is intentional and not a line ending to be
>>> collapsed), and use lseeks. 
>> Sounds good! That will satisfy my request to not silently work on files
>> that should be broken. :-)
> 
> I'm seeing the next "make doesn't work anymore with DOS ... feature" coming
> up here, only that it is bash this time. (snip)
> 
> ... It is definitely in the eye of the beholder if one calls shell
> scripts that worked so far as broken just because they have \r\n line
> endings.

I strongly agree with this. The users I support would be much happier if 
bash could continue to work correctly with \r\n in scripts on binary 
mounts.

It sounds like bash will have to scan the first line regardless of the 
mount type (to check for a binary file), so perhaps the decision to 
treat \r as intentional or not could be an option?

-- 
David Rothenberger                spammer? -> spam@daveroth.dyndns.org
GPG/PGP: 0x92D68FD8, DB7C 5146 1AB0 483A 9D27 DFBA FBB9 E328 92D6 8FD8

   If Roosevelt were alive, he'd turn over in his grave. -Samuel
   Goldwyn


* Re: bash-3.1-7 BUG
  2006-09-13 23:46             ` bash-3.1-7 BUG Volker Quetschke
  2006-09-13 23:58               ` bash-3.1-7 BUG David Rothenberger
@ 2006-09-14  0:19               ` Christopher Faylor
  2006-09-14  1:09                 ` bash-3.1-7 BUG Volker Quetschke
  2006-09-21  3:37                 ` bash-3.1-7 BUG Christopher Layne
  1 sibling, 2 replies; 25+ messages in thread
From: Christopher Faylor @ 2006-09-14  0:19 UTC (permalink / raw)
  To: cygwin

On Wed, Sep 13, 2006 at 04:46:28PM -0700, Volker Quetschke wrote:
>mwoehlke wrote:
>>Eric Blake wrote:
>>>mwoehlke <mwoehlke <at> tibco.com> writes:
>(snip)
>>>...  If the scan in binary mode succeeds, then leave the file in binary
>>>mode, assuming that the file is unix format even though it is on a text
>>>mount, and that lseeks will work.  If the file starts life binary mode
>>>(ie.  was on a binary mount), skip the check for \r in the scan (under
>>>the assumption that on a binary mount, \r is intentional and not a line
>>>ending to be collapsed), and use lseeks.  No guarantees on whether this
>>>will pan out, or be bigger than I thought, but hopefully you will see a
>>>bash 3.1-8 with these semantics soon.
>>
>>Sounds good! That will satisfy my request to not silently work on files
>>that should be broken.  :-)
>
>I'm seeing the next "make doesn't work anymore with DOS ...  feature"
>coming up here, only that it is bash this time.  Apparently a lot of
>people use tools from cygwin to build Windows stuff in a *NIX build
>environment.

Do I have to make the observation again that whether this is the case or
not, it is not a primary goal of the Cygwin project to support these
people?

As long as you have Corinna or myself in charge we are going to stick
with the whole "Linux on Windows" thing.  If bash doesn't like \r\n line
endings on Linux, if we purposely recommend against using text mode
files, and if we can see noticeable performance improvements in bash
from not honoring \r\n line endings, then bash should definitely be
using the "binmode" code.

If Eric cares enough to make \r\n shell scripts work in bash, then more
power to him.  If he doesn't then I have no problem with releasing a
bash that hiccups on files which use \r\n and informing people that they
should fix their scripts.

cgf

* Re: bash-3.1-7 BUG
  2006-09-13 23:58               ` bash-3.1-7 BUG David Rothenberger
@ 2006-09-14  0:30                 ` Larry Hall (Cygwin)
  2006-09-18  2:48                   ` bash-3.1-7 BUG Carlo Florendo
  0 siblings, 1 reply; 25+ messages in thread
From: Larry Hall (Cygwin) @ 2006-09-14  0:30 UTC (permalink / raw)
  To: cygwin

David Rothenberger wrote:
> On 9/13/2006 4:46 PM, Volker Quetschke wrote:
>> mwoehlke wrote:
>>> Eric Blake wrote:
>>>> mwoehlke <mwoehlke <at> tibco.com> writes:
>> (snip)
>>>> ... If the file starts life binary mode (ie. was on a binary
>>>> mount), skip the check for \r in the scan (under the assumption that
>>>> on a binary mount, \r is intentional and not a line ending to be
>>>> collapsed), and use lseeks. 
>>> Sounds good! That will satisfy my request to not silently work on files
>>> that should be broken. :-)
>>
>> I'm seeing the next "make doesn't work anymore with DOS ... feature" 
>> coming
>> up here, only that it is bash this time. (snip)
>>
>> ... It is definitely in the eye of the beholder if one calls shell
>> scripts that worked so far as broken just because they have \r\n line
>> endings.
> 
> I strongly agree with this. The users I support would be much happier if 
> bash could continue to work correctly with \r\n in scripts on binary 
> mounts.
> 
> It sounds like bash will have to scan the first line regardless of the 
> mount type (to check for a binary file), so perhaps the decision to 
> treat \r as intentional or not could be an option?
> 

I think this is getting a little off-track.  Another option just means another
area for people who don't understand what's going on to trip and fall and then
come and bug the list as a result.  IMO, there's already such an option, even
without changing bash.  It's called 'd2u'.  Let's not over-think and over-do
things here.  There's pain involved in that too.  I'm not convinced that it's
worth the pain to all for the benefit of a few.  Of course, I firmly believe
that what can and will be done here is Eric's call.  He's the one that will be
maintaining it.
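
For reference, the kind of conversion a tool like d2u performs can be
sketched in a few lines of C.  This is an illustration only, not d2u's
source; crlf_to_lf is a hypothetical name, and it works in place on a
buffer that has already been read in binary mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-place conversion: drop each \r that is immediately
   followed by \n, keep every other byte (including a lone \r).
   Returns the new length of the data in buf.  */
static size_t
crlf_to_lf (char *buf, size_t len)
{
  size_t out = 0;

  assert (buf != NULL || len == 0);
  for (size_t in = 0; in < len; in++)
    {
      if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
        continue;
      buf[out++] = buf[in];
    }
  return out;
}
```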

-- 
Larry Hall                              http://www.rfk.com
RFK Partners, Inc.                      (508) 893-9779 - RFK Office
216 Dalton Rd.                          (508) 893-9889 - FAX
Holliston, MA 01746

* Re: bash-3.1-7 BUG
  2006-09-14  0:19               ` bash-3.1-7 BUG Christopher Faylor
@ 2006-09-14  1:09                 ` Volker Quetschke
  2006-09-14  2:07                   ` bash-3.1-7 BUG Christopher Faylor
  2006-09-21  3:37                 ` bash-3.1-7 BUG Christopher Layne
  1 sibling, 1 reply; 25+ messages in thread
From: Volker Quetschke @ 2006-09-14  1:09 UTC (permalink / raw)
  To: cygwin

Christopher Faylor wrote:
> On Wed, Sep 13, 2006 at 04:46:28PM -0700, Volker Quetschke wrote:
(snip)
> 
> Do I have to make the observation again that whether this is the case or
> not, it is not a primary goal of the Cygwin project to support these
> people?

Yes. Did it ever cross your mind that the whole "Linux on Windows" thing
is pretty useless if it cannot be used in the "real world"? I mean, if people
want to have a plain vanilla Linux thingy they can just install it. Grab
a Redhat, Suse or Debian DVD and everything works fine.

As there is no new information in this thread I'll shut up until a real
problem comes up. This whole discussion is academic anyway.

   Volker

> As long as you have Corinna or myself in charge we are going to stick
> with the whole "Linux on Windows" thing.  If bash doesn't like \r\n line
> endings on Linux, if we purposely recommend against using text mode
> files, and if we can see noticeable performance improvements in bash
> from not honoring \r\n line endings, then bash should definitely be
> using the "binmode" code.
> 
> If Eric cares enough to make \r\n shell scripts work in bash, then more
> power to him.  If he doesn't then I have no problem with releasing a
> bash that hiccups on files which use \r\n and informing people that they
> should fix their scripts.
> 
> cgf
> 

-- 
PGP/GPG key  (ID: 0x9F8A785D)  available  from  wwwkeys.de.pgp.net
key-fingerprint 550D F17E B082 A3E9 F913  9E53 3D35 C9BA 9F8A 785D


* Re: bash-3.1-7 BUG
  2006-09-14  1:09                 ` bash-3.1-7 BUG Volker Quetschke
@ 2006-09-14  2:07                   ` Christopher Faylor
  2006-09-14 11:13                     ` bash-3.1-7 BUG Eric Blake
                                       ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Christopher Faylor @ 2006-09-14  2:07 UTC (permalink / raw)
  To: cygwin

On Wed, Sep 13, 2006 at 06:09:03PM -0700, Volker Quetschke wrote:
>Christopher Faylor wrote:
>> On Wed, Sep 13, 2006 at 04:46:28PM -0700, Volker Quetschke wrote:
>(snip)
>> 
>>Do I have to make the observation again that whether this is the case
>>or not, it is not a primary goal of the Cygwin project to support these
>>people?
>
>Yes.  Did it ever cross your mind that the whole "Linux on Windows"
>thing is pretty useless if it cannot be used in the "real world"?

Death of Cygwin predicted?  Everybody panic and/or sip?

>I mean, if people want to have a plain vanilla Linux thingy they can
>just install it.  Grab a Redhat, Suse or Debian DVD and everything
>works fine.

I suspect that the "Well, they can just install Linux (floppies, CDs,
DVDs) if they feel like it" observation has been made several times a
year for the last ten years.  It's obviously not a very powerful
argument since Cygwin is still here and you can't really assert that the
only reason it is here is because make understood MS-DOS paths or bash
dealt properly with \r\n line endings.

I doubt that Eric will want to deal with the fallout of having bash not
understand \r\n line endings but, if he does, it would be his decision
and, again, I would support it 100%.  I am very eager to see things like
configure scripts work faster and if we have to drop a few scared or
lazy people along the way to accomplish that goal, that's fine with me.
I have no problem at all with being a part of a smaller community which
doesn't need to use notepad to edit their bash scripts.

cgf

* Re: bash-3.1-7 BUG
  2006-09-14  2:07                   ` bash-3.1-7 BUG Christopher Faylor
@ 2006-09-14 11:13                     ` Eric Blake
  2006-09-14 16:58                       ` bash-3.1-7 BUG Volker Quetschke
  2006-09-14 15:21                     ` bash-3.1-7 bug mwoehlke
  2006-09-21  3:48                     ` bash-3.1-7 BUG Christopher Layne
  2 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2006-09-14 11:13 UTC (permalink / raw)
  To: cygwin

According to Christopher Faylor on 9/13/2006 8:07 PM:
> 
> I doubt that Eric will want to deal with the fallout of having bash not
> understand \r\n line endings but, if he does, it would be his decision
> and, again, I would support it 100%.  I am very eager to see things like
> configure scripts work faster and if we have to drop a few scared or
> lazy people along the way to accomplish that goal, that's fine with me.
> I have no problem at all with being a part of a smaller community which
> doesn't need to use notepad to edit their bash scripts.

Here's the difference between 3.1-7 and 3.1-8:

diff -u bash-3.1/input.c bash-3.1/input.c
--- bash-3.1/input.c    2006-09-08 16:58:58.703125000 -0600
+++ bash-3.1/input.c    2006-09-14 04:13:11.359375000 -0600
@@ -166,6 +166,10 @@
   bp->b_used = bp->b_inputp = bp->b_flag = 0;
   if (bufsize == 1)
     bp->b_flag |= B_UNBUFF;
+#ifdef __CYGWIN__
+  if ((fcntl (fd, F_GETFL) & O_TEXT) != 0)
+    bp->b_flag |= B_TEXT;
+#endif
   return (bp);
 }

@@ -442,6 +446,25 @@
 {
   ssize_t nr;

+#ifdef __CYGWIN__
+  /* lseek'ing on text files is problematic; lseek reports the true
+     file offset, but read collapses \r\n and returns a character
+     count.  We cannot reliably seek backwards if nr is smaller than
+     the seek offset encountered during the read, and must instead
+     treat the stream as unbuffered.  */
+  if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
+    {
+      off_t offset = lseek (bp->b_fd, 0, SEEK_CUR);
+      nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);
+      if (nr > 0 && nr < lseek (bp->b_fd, 0, SEEK_CUR) - offset)
+       {
+         lseek (bp->b_fd, offset, SEEK_SET);
+         bp->b_flag |= B_UNBUFF;
+         nr = zread (bp->b_fd, bp->b_buffer, bp->b_size = 1);
+       }
+    }
+  else
+#endif
   nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);
   if (nr <= 0)
     {
@@ -454,15 +477,6 @@
       return (EOF);
     }

-#if defined (__CYGWIN__)
-  /* If on cygwin, translate \r\n to \n. */
-  if (nr >= 2 && bp->b_buffer[nr - 2] == '\r' && bp->b_buffer[nr - 1] == '\n')
-    {
-      bp->b_buffer[nr - 2] = '\n';
-      nr--;
-    }
-#endif
-
   bp->b_used = nr;
   bp->b_inputp = 0;
   return (bp->b_buffer[bp->b_inputp++] & 0xFF);
only in patch2:
unchanged:
--- bash-3.1-orig/input.h       2002-01-30 07:11:47.000000000 -0700
+++ bash-3.1/input.h    2006-09-14 03:29:05.484375000 -0600
@@ -47,6 +47,7 @@
 #define B_ERROR                0x02
 #define B_UNBUFF       0x04
 #define B_WASBASHINPUT 0x08
+#define B_TEXT         0x10 /* Text stream, when O_BINARY is nonzero */

 /* A buffered stream.  Like a FILE *, but with our own buffering and
    synchronization.  Look in input.c for the implementation. */


My thoughts on the matter are that if you use binary mounts (and I highly
recommend them), then every character in your file is important.  Since
bash on Linux does not ignore \r, and POSIX does not allow bash to ignore
\r by default (although you can set IFS to include \r as a whitespace
character to ignore), then neither should bash on a binary cygwin file.
If you use text mounts, then this patch is smart enough to buffer data up
until the point that an \r\n pair is converted by the text mode file into
a single character, at which point the lseek optimization breaks down and
the text mode file is subsequently processed a byte at a time.  If you
need DOS line endings, use a text mount.  If you need speed, use UNIX line
endings on a binary mount, although even UNIX line endings on a text mount
will be faster than DOS line endings.  Case closed, since I'm the
maintainer, and I really don't want to bother with anything larger than
the above patch (and also plan on submitting the above patch upstream,
where it is less likely to be accepted if it is larger).
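
To make the fallback concrete, here is a minimal sketch in plain C.  It is
not the bash code: struct text_file, text_read and fill_buffer are
hypothetical names, and an in-memory byte array stands in for a text-mode
file descriptor.  It only illustrates the idea of comparing the character
count returned by a read against the byte offset consumed, and dropping to
one-character reads once the two disagree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for a file opened in text mode. */
struct text_file
{
  const char *bytes;   /* raw on-disk bytes */
  size_t len;          /* number of raw bytes */
  size_t pos;          /* true byte offset, as lseek would report it */
};

/* Mimic a text-mode read: deliver up to WANT characters, dropping the
   \r of every \r\n pair while the byte offset advances over both.  */
size_t
text_read (struct text_file *f, char *dst, size_t want)
{
  size_t out = 0;
  while (out < want && f->pos < f->len)
    {
      char c = f->bytes[f->pos++];
      if (c == '\r' && f->pos < f->len && f->bytes[f->pos] == '\n')
        continue;      /* skip \r; the \n is stored on the next pass */
      dst[out++] = c;
    }
  return out;
}

/* Same shape as the patched code path: try a full buffer first; if
   fewer characters came back than bytes were consumed, rewind, mark
   the stream unbuffered, and read a single character instead.  */
size_t
fill_buffer (struct text_file *f, char *buf, size_t size, int *unbuffered)
{
  size_t before, nr;

  assert (size > 0);
  if (*unbuffered)
    return text_read (f, buf, 1);

  before = f->pos;                   /* lseek (fd, 0, SEEK_CUR) */
  nr = text_read (f, buf, size);
  if (nr > 0 && nr < f->pos - before)
    {
      f->pos = before;               /* lseek (fd, offset, SEEK_SET) */
      *unbuffered = 1;
      nr = text_read (f, buf, 1);
    }
  return nr;
}
```

With the five raw bytes "a\r\nb\n", the first fill_buffer call sees four
characters for five bytes, rewinds, and returns only 'a'; every later call
returns one character.  With "a\nb\n" the counts agree and the whole buffer
is kept.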

-- 
Life is short - so eat dessert first!

Eric Blake             ebb9@byu.net

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 bug
  2006-09-14  2:07                   ` bash-3.1-7 BUG Christopher Faylor
  2006-09-14 11:13                     ` bash-3.1-7 BUG Eric Blake
@ 2006-09-14 15:21                     ` mwoehlke
  2006-09-21  3:48                     ` bash-3.1-7 BUG Christopher Layne
  2 siblings, 0 replies; 25+ messages in thread
From: mwoehlke @ 2006-09-14 15:21 UTC (permalink / raw)
  To: cygwin

Christopher Faylor wrote:
> On Wed, Sep 13, 2006 at 06:09:03PM -0700, Volker Quetschke wrote:
>> Christopher Faylor wrote:
>>> On Wed, Sep 13, 2006 at 04:46:28PM -0700, Volker Quetschke wrote:
>> (snip)
>>> Do I have to make the observation again that whether this is the case
>>> or not, it is not a primary goal of the Cygwin project to support these
>>> people?
>> Yes.  Did it ever cross your mind that the whole "Linux on Windows"
>> thing is pretty useless if it cannot be used in the "real world".
> 
> Death of Cygwin predicted?  Everybody panic and/or sip?

...As much as I agree with Volker's assertion (one user using Cygwin as 
a unix-like front-end for cl.exe, right here - although you'll recall 
that I'm also one of the ones that understands what Cygwin wants to be 
and never got tripped up by make (though in all honesty I haven't 
upgraded yet ;-) but only because I really don't like messing with a 
working build machine, and if 3.81 broke it would certainly be *my* 
fault))...

>> I mean, if people want to have a plain vanilla Linux thingy they can
>> just install it.  Grab a Redhat, Suse or Debian DVD and everything
>> works fine.
> 
> I suspect that the "Well, they can just install Linux (floppies, CDs,
> DVDs) if they feel like it" observation has been made several times a
> year for the last ten years.  It's obviously not a very powerful
> argument since Cygwin is still here and you can't really assert that the
> only reason it is here is because make understood MS-DOS paths or bash
> dealt properly with \r\n line endings.
> 
> I doubt that Eric will want to deal with the fallout of having bash not
> understand \r\n line endings but, if he does, it would be his decision
> and, again, I would support it 100%.  I am very eager to see things like
> configure scripts work faster and if we have to drop a few scared or
> lazy people along the way to accomplish that goal, that's fine with me.
> I have no problem at all with being a part of a smaller community which
> doesn't need to use notepad to edit their bash scripts.

...I have to agree that this is a different case. Changing makefiles 
that used DOS paths is one thing; you can make them work (like I do, by 
doing things 'right' in the first place), but if you've built a system 
on makefiles relying on DOS paths, fixing them can be painful and error 
prone. Whereas 'dos2unix <script>' is easy and reliable. So anyone that 
can't bring themselves to type that little line would definitely qualify 
as 'lazy' in my book. :-) Especially when Eric's other suggestion (add 
'\r' to IFS) is also available.

And just in case anyone *can't* live with Eric's latest patch, I still 
recommend finding out what Rodney is doing in the Interix version of bash.

-- 
Matthew
We apologize for the inconvenience.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
  2006-09-14 11:13                     ` bash-3.1-7 BUG Eric Blake
@ 2006-09-14 16:58                       ` Volker Quetschke
  2006-09-14 17:15                         ` bash-3.1-7 BUG Dave Korn
                                           ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Volker Quetschke @ 2006-09-14 16:58 UTC (permalink / raw)
  To: cygwin

Hi!

Eric Blake wrote:
> According to Christopher Faylor on 9/13/2006 8:07 PM:
>>> I doubt that Eric will want to deal with the fallout of having bash not
>>> understand \r\n line endings but, if he does, it would be his decision
>>> and, again, I would support it 100%.  I am very eager to see things like
>>> configure scripts work faster and if we have to drop a few scared or
>>> lazy people along the way to accomplish that goal, that's fine with me.
>>> I have no problem at all with being a part of a smaller community which
>>> doesn't need to use notepad to edit their bash scripts.
Hey, don't shoot at me, I'm only voicing my opinion and am perfectly fine
with your decision. Maybe I'm lacking some coffee, but this ...

> Here's the difference between 3.1-7 and 3.1-8:
> 
(snip)
> +#ifdef __CYGWIN__
> +  /* lseek'ing on text files is problematic; lseek reports the true
> +     file offset, but read collapses \r\n and returns a character
> +     count.  We cannot reliably seek backwards if nr is smaller than
> +     the seek offset encountered during the read, and must instead
> +     treat the stream as unbuffered.  */
> +  if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
------------------------^^^^^^^^^^^^^^^^^      ^^^^^^
part of the patch looks suspicious to me. You probably just want to test
if the LHS expression is true.

  Volker

> +    {
> +      off_t offset = lseek (bp->b_fd, 0, SEEK_CUR);
> +      nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);
> +      if (nr > 0 && nr < lseek (bp->b_fd, 0, SEEK_CUR) - offset)
> +       {
> +         lseek (bp->b_fd, offset, SEEK_SET);
> +         bp->b_flag |= B_UNBUFF;
> +         nr = zread (bp->b_fd, bp->b_buffer, bp->b_size = 1);
> +       }
> +    }
> +  else
> +#endif
>    nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);
>    if (nr <= 0)
>      {
> @@ -454,15 +477,6 @@
>        return (EOF);
>      }
> 
> -#if defined (__CYGWIN__)
> -  /* If on cygwin, translate \r\n to \n. */
> -  if (nr >= 2 && bp->b_buffer[nr - 2] == '\r' && bp->b_buffer[nr - 1] ==
> '\n')
> -    {
> -      bp->b_buffer[nr - 2] = '\n';
> -      nr--;
> -    }
> -#endif
> -
>    bp->b_used = nr;
>    bp->b_inputp = 0;
>    return (bp->b_buffer[bp->b_inputp++] & 0xFF);
> only in patch2:
> unchanged:
> --- bash-3.1-orig/input.h       2002-01-30 07:11:47.000000000 -0700
> +++ bash-3.1/input.h    2006-09-14 03:29:05.484375000 -0600
> @@ -47,6 +47,7 @@
>  #define B_ERROR                0x02
>  #define B_UNBUFF       0x04
>  #define B_WASBASHINPUT 0x08
> +#define B_TEXT         0x10 /* Text stream, when O_BINARY is nonzero */
> 
>  /* A buffered stream.  Like a FILE *, but with our own buffering and
>     synchronization.  Look in input.c for the implementation. */
> 
> 
> My thoughts on the matter are that if you use binary mounts (and I highly
> recommend them), then every character in your file is important.  Since
> bash on Linux does not ignore \r, and POSIX does not allow bash to ignore
> \r by default (although you can set IFS to include \r as a whitespace
> character to ignore), then neither should bash on a binary cygwin file.
> If you use text mounts, then this patch is smart enough to buffer data up
> until the point that an \r\n pair is converted by the text mode file into
> a single character, at which point the lseek optimization breaks down and
> the text mode file is subsequently processed a byte at a time.  If you
> need DOS line endings, use a text mount.  If you need speed, use UNIX line
> endings on a binary mount, although even UNIX line endings on a text mount
> will be faster than DOS line endings.  Case closed, since I'm the
> maintainer, and I really don't want to bother with anything larger than
> the above patch (and also plan on submitting the above patch upstream,
> where it is less likely to be accepted if it is larger).
> 
> --
> Life is short - so eat dessert first!
> 
> Eric Blake             ebb9@byu.net


-- 
= http://wiki.services.openoffice.org/wiki/Debug_Build_Problems  =
PGP/GPG key  (ID: 0x9F8A785D)  available  from  wwwkeys.de.pgp.net
key-fingerprint 550D F17E B082 A3E9 F913  9E53 3D35 C9BA 9F8A 785D



^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: bash-3.1-7 BUG
  2006-09-14 16:58                       ` bash-3.1-7 BUG Volker Quetschke
@ 2006-09-14 17:15                         ` Dave Korn
  2006-09-14 17:22                           ` bash-3.1-7 BUG Volker Quetschke
  2006-09-14 17:26                         ` bash-3.1-7 BUG Eric Blake
  2006-09-21  3:50                         ` bash-3.1-7 BUG Christopher Layne
  2 siblings, 1 reply; 25+ messages in thread
From: Dave Korn @ 2006-09-14 17:15 UTC (permalink / raw)
  To: cygwin

On 14 September 2006 17:59, Volker Quetschke wrote:

> Hi!

> (snip)
>> +#ifdef __CYGWIN__
>> +  /* lseek'ing on text files is problematic; lseek reports the true
>> +     file offset, but read collapses \r\n and returns a character
>> +     count.  We cannot reliably seek backwards if nr is smaller than
>> +     the seek offset encountered during the read, and must instead
>> +     treat the stream as unbuffered.  */
>> +  if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
> ------------------------^^^^^^^^^^^^^^^^^      ^^^^^^
> part of the patch looks suspicious to me. You probably just want to test
> if the LHS expression is true.

  You reckon?  That looks to me like a test for B_TEXT is set *and* B_UNBUFF
is cleared.  Since the action we're going to take if this test succeeds is to
set the stream unbuffered, there's no need to do it for a stream that already
/is/ unbuffered.  That's how it looks to me at first glance, anyway.

  I agree it's a /slightly/ unclear construct that should /perhaps/ have an
explicit comment to clarify the way in which it's been worded.
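
  The mask test can be checked by hand for all four flag combinations.  A
small sketch, using the B_UNBUFF and B_TEXT values from the input.h hunk;
the two helper names are made up for illustration and do not appear in bash:

```c
#include <assert.h>

#define B_UNBUFF 0x04
#define B_TEXT   0x10

/* The patch's test: true only when B_TEXT is set and B_UNBUFF is clear. */
int
text_and_still_buffered (int b_flag)
{
  assert ((B_TEXT & B_UNBUFF) == 0);   /* the flags must be distinct bits */
  return (b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT;
}

/* The "just test the LHS" reading: true when either flag is set. */
int
either_flag_set (int b_flag)
{
  return (b_flag & (B_TEXT | B_UNBUFF)) != 0;
}
```

  Only B_TEXT on its own satisfies the first test.  The second is also true
for a stream that is already unbuffered, which is exactly the case the
patch wants to skip.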


>   Volker
> 
>> +    {
>> +      off_t offset = lseek (bp->b_fd, 0, SEEK_CUR);

  ... Could have made good use of another one of your "(snip)"s here...!

[ ... snip ... ]


    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
  2006-09-14 17:15                         ` bash-3.1-7 BUG Dave Korn
@ 2006-09-14 17:22                           ` Volker Quetschke
  0 siblings, 0 replies; 25+ messages in thread
From: Volker Quetschke @ 2006-09-14 17:22 UTC (permalink / raw)
  To: cygwin

Dave Korn wrote:
> On 14 September 2006 17:59, Volker Quetschke wrote:
> 
>> Hi!
> 
>> (snip)
>>> +#ifdef __CYGWIN__
>>> +  /* lseek'ing on text files is problematic; lseek reports the true
>>> +     file offset, but read collapses \r\n and returns a character
>>> +     count.  We cannot reliably seek backwards if nr is smaller than
>>> +     the seek offset encountered during the read, and must instead
>>> +     treat the stream as unbuffered.  */
>>> +  if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
>> ------------------------^^^^^^^^^^^^^^^^^      ^^^^^^
>> part of the patch looks suspicious to me. You probably just want to test
>> if the LHS expression is true.
> 
>   You reckon?  That looks to me like a test for B_TEXT is set *and* B_UNBUFF
> is cleared.  Since the action we're going to take if this test succeeds is to
> set the stream unbuffered, there's no need to do it for a stream that already
> /is/ unbuffered.  That's how it looks to me at first glance, anyway.

See, it was the lack of coffee ;) Sorry for the noise.

  Volker

(snap)

-- 
PGP/GPG key  (ID: 0x9F8A785D)  available  from  wwwkeys.de.pgp.net
key-fingerprint 550D F17E B082 A3E9 F913  9E53 3D35 C9BA 9F8A 785D



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
  2006-09-14 16:58                       ` bash-3.1-7 BUG Volker Quetschke
  2006-09-14 17:15                         ` bash-3.1-7 BUG Dave Korn
@ 2006-09-14 17:26                         ` Eric Blake
  2006-09-21  3:50                         ` bash-3.1-7 BUG Christopher Layne
  2 siblings, 0 replies; 25+ messages in thread
From: Eric Blake @ 2006-09-14 17:26 UTC (permalink / raw)
  To: cygwin

Volker Quetschke <quetschke <at> scytek.de> writes:

> > 
> (snip)
> > +#ifdef __CYGWIN__
> > +  /* lseek'ing on text files is problematic; lseek reports the true
> > +     file offset, but read collapses \r\n and returns a character
> > +     count.  We cannot reliably seek backwards if nr is smaller than
> > +     the seek offset encountered during the read, and must instead
> > +     treat the stream as unbuffered.  */
> > +  if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
> ------------------------^^^^^^^^^^^^^^^^^      ^^^^^^
> part of the patch looks suspicious to me. You probably just want to test
> if the LHS expression is true.

That part is correct as presented - I really did mean to check with bitwise AND 
if we are dealing with a text file which has not previously been marked 
unbuffered...

> 
>   Volker
> 
> > +    {
> > +      off_t offset = lseek (bp->b_fd, 0, SEEK_CUR);
> > +      nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);

...as the condition to perform extra lseeks and make sure that lseek and the 
text-mode read are still consistent; if not...

> > +      if (nr > 0 && nr < lseek (bp->b_fd, 0, SEEK_CUR) - offset)
> > +       {
> > +         lseek (bp->b_fd, offset, SEEK_SET);
> > +         bp->b_flag |= B_UNBUFF;

... we change the flags to mark the stream unbuffered, and never fall into this 
if-block again for the rest of the life of the file.

> > +         nr = zread (bp->b_fd, bp->b_buffer, bp->b_size = 1);
> > +       }
> > +    }
> > +  else
> > +#endif
> >    nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);

And the else-block works equally well whether a file is non-text (reading 
multiple bytes), or is unbuffered (reading just one byte).  The other thing to 
remember is that a file will be marked unbuffered if you cannot seek on it (as 
in a pipe), or if it is a text file that failed the lseek consistency checks 
above.  It also means that even with \n line endings on a text mount, 
although the file is read in the same number of buffers as on the corresponding 
binary mount, the text mount is penalized with 2 additional lseeks per buffer; 
that is still a smaller penalty than doing one-byte reads.
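
As a rough cost model of the three cases described above, the sketch below 
counts system calls per file.  The helper names are hypothetical, the model 
ignores the final read that returns EOF, and the 8192-byte buffer used in 
the usage note is an assumption, not necessarily the size bash uses.

```c
#include <assert.h>
#include <stddef.h>

struct io_cost
{
  size_t reads;    /* number of read calls */
  size_t lseeks;   /* number of extra lseek calls */
};

/* Buffered stream on a binary mount: one read per buffer. */
struct io_cost
cost_binary (size_t file_bytes, size_t bufsize)
{
  struct io_cost c;
  assert (bufsize > 0);
  c.reads = (file_bytes + bufsize - 1) / bufsize;
  c.lseeks = 0;
  return c;
}

/* \n line endings on a text mount: same number of reads, plus one
   lseek before and one after each read for the consistency check.  */
struct io_cost
cost_text_lf (size_t file_bytes, size_t bufsize)
{
  struct io_cost c = cost_binary (file_bytes, bufsize);
  c.lseeks = 2 * c.reads;
  return c;
}

/* Stream marked unbuffered: one read per character delivered. */
struct io_cost
cost_unbuffered (size_t chars)
{
  struct io_cost c;
  c.reads = chars;
  c.lseeks = 0;
  return c;
}
```

For a 20000-byte script and an 8192-byte buffer, this model gives 3 reads 
on a binary mount, 3 reads plus 6 lseeks on a text mount with \n endings, 
and 20000 reads once the stream has been marked unbuffered.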

> >    if (nr <= 0)
> >      {

-- 
Eric Blake





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
  2006-09-14  0:30                 ` bash-3.1-7 BUG Larry Hall (Cygwin)
@ 2006-09-18  2:48                   ` Carlo Florendo
  2006-09-18  2:54                     ` bash-3.1-7 BUG Carlo Florendo
  0 siblings, 1 reply; 25+ messages in thread
From: Carlo Florendo @ 2006-09-18  2:48 UTC (permalink / raw)
  To: cygwin

Larry Hall (Cygwin) wrote:
>
> I think this is getting a little off-track.  Another option just means 
> another
> area for people who don't understand what's going on to trip and fall 
> and then
> come and bug the list as a result.  IMO, there's already such an 
> option, even
> without changing bash.  It's called 'd2u'.  Let's not over-think and 
> over-do
> things here.  There's pain involved in that too.  I'm not convinced 
> that it's
> worth the pain to all for the benefit of a few.  Of course, I firmly 
> believe
> that what can and will be done here is Eric's call.  He's the one that 
> will be
> maintaining it.
>

You've got the keyverb:  overdo

We've already got d2u.  There is a point in making bash scripts work 
with CRLF endings but at the expense of a slower and a little bit larger 
bash, I think.  (If I'm wrong here, I'm sorry, I haven't seen bash's code.)

d2u is much simpler.  For the sake of simplicity and elegance, let's go d2u.

Best Regards,

Carlo

-- 
Carlo Florendo
Network Administrator
Astra Philippines Inc. (www.astra.ph)
Member of the Astra Group (www.astra.co.jp)



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
  2006-09-18  2:48                   ` bash-3.1-7 BUG Carlo Florendo
@ 2006-09-18  2:54                     ` Carlo Florendo
  0 siblings, 0 replies; 25+ messages in thread
From: Carlo Florendo @ 2006-09-18  2:54 UTC (permalink / raw)
  To: cygwin

Carlo Florendo wrote:
> Larry Hall (Cygwin) wrote:
>>
>> I think this is getting a little off-track.  Another option just 
>> means another
>> area for people who don't understand what's going on to trip and fall 
>> and then
>> come and bug the list as a result.  IMO, there's already such an 
>> option, even
>> without changing bash.  It's called 'd2u'.  Let's not over-think and 
>> over-do
>> things here.  There's pain involved in that too.  I'm not convinced 
>> that it's
>> worth the pain to all for the benefit of a few.  Of course, I firmly 
>> believe
>> that what can and will be done here is Eric's call.  He's the one 
>> that will be
>> maintaining it.
>>
>
> You've got the keyverb:  overdo
>
> We've already got d2u.  There is a point in making bash scripts work 
> with CRLF endings but at the expense of a slower and a little bit 
> larger bash, I think.  (If I'm wrong here, I'm sorry, I haven't seen 
> bash's code.)
>
> d2u is much simpler.  For the sake of simplicity and elegance, let's 
> go d2u.
>
> Best Regards,
>
> Carlo
>
This was an old thread.  Sorry for the late reply.  I didn't check the 
dates.  

-- 
Carlo Florendo
Network Administrator
Astra Philippines Inc. (www.astra.ph)
Member of the Astra Group (www.astra.co.jp)



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
  2006-09-14  0:19               ` bash-3.1-7 BUG Christopher Faylor
  2006-09-14  1:09                 ` bash-3.1-7 BUG Volker Quetschke
@ 2006-09-21  3:37                 ` Christopher Layne
  1 sibling, 0 replies; 25+ messages in thread
From: Christopher Layne @ 2006-09-21  3:37 UTC (permalink / raw)
  To: cygwin

On Wed, Sep 13, 2006 at 08:19:02PM -0400, Christopher Faylor wrote:
> As long as you have Corinna or myself in charge we are going to stick
> with the whole "Linux on Windows" thing.  If bash doesn't like \r\n line
> endings on Linux, if we purposely recommend against using text mode
> files, and if we can see noticeable performance improvements in bash
> from not honoring \r\n line endings, then bash should definitely be
> using the "binmode" code.

Yep. Ridiculous performance improvements at that.

> 
> If Eric cares enough to make \r\n shell scripts work in bash, then more
> power to him.  If he doesn't then I have no problem with releasing a
> bash that hiccups on files which use \r\n and informing people that they
> should fix their scripts.
> 
> cgf

Totally agreed here too.

I've been using my splintered off bash build for weeks now. Haven't really
had any issues - however I also explicitly do not use text mode mounts or
any related nonsense.

-cl


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
  2006-09-14  2:07                   ` bash-3.1-7 BUG Christopher Faylor
  2006-09-14 11:13                     ` bash-3.1-7 BUG Eric Blake
  2006-09-14 15:21                     ` bash-3.1-7 bug mwoehlke
@ 2006-09-21  3:48                     ` Christopher Layne
  2 siblings, 0 replies; 25+ messages in thread
From: Christopher Layne @ 2006-09-21  3:48 UTC (permalink / raw)
  To: cygwin

> I suspect that the "Well, they can just install Linux (floppies, CDs,
> DVDs) if they feel like it" observation has been made several times a
> year for the last ten years.  It's obviously not a very powerful
> argument since Cygwin is still here and you can't really assert that the
> only reason it is here is because make understood MS-DOS paths or bash
> dealt properly with \r\n line endings.

I personally use Cygwin to make full use of a unix shell and environment
alongside Windows - typically with around 8-10 puttycygs open along with NX
clients to Linux machines running NX. Since it's inevitable that Windows will
creep into things, there's not much one can do to get around it - aside from
making a personal pact 'not going to use windows' - which we've all most likely
done and eventually failed at. That being said, if I didn't require just a few
Windows applications that I use aside from Cygwin, I'd be running KDE/Linux 2.6
on this box in a heartbeat. Still, I think Cygwin is great, and I find it
invaluable under Windows. I really couldn't imagine using MSW without
Cygwin.

> I doubt that Eric will want to deal with the fallout of having bash not
> understand \r\n line endings but, if he does, it would be his decision
> and, again, I would support it 100%.  I am very eager to see things like
> configure scripts work faster and if we have to drop a few scared or
> lazy people along the way to accomplish that goal, that's fine with me.
> I have no problem at all with being a part of a smaller community which
> doesn't need to use notepad to edit their bash scripts.

Yep - and in what section of the bell curve are the people who write bash/sh
*on* cygwin and *also* use notepad to edit the files? I would think 95% are
using vi/vim.

-cl


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
  2006-09-14 16:58                       ` bash-3.1-7 BUG Volker Quetschke
  2006-09-14 17:15                         ` bash-3.1-7 BUG Dave Korn
  2006-09-14 17:26                         ` bash-3.1-7 BUG Eric Blake
@ 2006-09-21  3:50                         ` Christopher Layne
  2 siblings, 0 replies; 25+ messages in thread
From: Christopher Layne @ 2006-09-21  3:50 UTC (permalink / raw)
  To: cygwin

On Thu, Sep 14, 2006 at 12:58:36PM -0400, Volker Quetschke wrote:
> > +#ifdef __CYGWIN__
> > +  /* lseek'ing on text files is problematic; lseek reports the true
> > +     file offset, but read collapses \r\n and returns a character
> > +     count.  We cannot reliably seek backwards if nr is smaller than
> > +     the seek offset encountered during the read, and must instead
> > +     treat the stream as unbuffered.  */
> > +  if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
> ------------------------^^^^^^^^^^^^^^^^^      ^^^^^^
> part of the patch looks suspicious to me. You probably just want to test
> if the LHS expression is true.
> 
>   Volker

It's called a mask.

-cl


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: bash-3.1-7 BUG
@ 2006-09-13  4:13 Eric Blake
  0 siblings, 0 replies; 25+ messages in thread
From: Eric Blake @ 2006-09-13  4:13 UTC (permalink / raw)
  To: Kazuyuki Hagiwara, cygwin

> I have extracted a very short script which fails with bash-3.1-7,
> while it runs successfully with 3.1-6, of course.
> 
> While the attached script zzz.sh has \r\n style end of line format,
> it should run normally since it is accessed through a "text mount" point.

Does it work if you convert it to \n line endings, using d2u?

> [~/work] mount | grep tmp
> D:\users\hagiwara\tmp on /tmp type user (binmode)
> D:\users\hagiwara\tmp on /tmp2 type system (textmode)

Why are you mounting the same physical directory to two
different posix paths with different options?  I'm not sure
that will always work the way you expect, although I doubt
it explains the problems you are seeing.

At any rate, thanks for narrowing down your application
to a smaller test case; I'll see what I can find with it.  Any
chance you can convince your mailer to send text files
with a text mime type?  In the meantime, since 3.1-7 is
marked experimental and was not working for you, you
are under no obligation to use it.

-- 
Eric Blake
volunteer cygwin bash maintainer


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2006-09-21  3:50 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-09-13  4:38 bash-3.1-7 BUG Eric Blake
2006-09-13  5:25 ` bash-3.1-7 BUG Christopher Faylor
2006-09-13 14:33   ` bash-3.1-7 BUG Eric Blake
2006-09-13 20:07     ` bash-3.1-7 BUG Shankar Unni
2006-09-13 20:37       ` bash-3.1-7 BUG mwoehlke
2006-09-13 21:48         ` bash-3.1-7 BUG Eric Blake
2006-09-13 22:08           ` bash-3.1-7 BUG mwoehlke
2006-09-13 23:46             ` bash-3.1-7 BUG Volker Quetschke
2006-09-13 23:58               ` bash-3.1-7 BUG David Rothenberger
2006-09-14  0:30                 ` bash-3.1-7 BUG Larry Hall (Cygwin)
2006-09-18  2:48                   ` bash-3.1-7 BUG Carlo Florendo
2006-09-18  2:54                     ` bash-3.1-7 BUG Carlo Florendo
2006-09-14  0:19               ` bash-3.1-7 BUG Christopher Faylor
2006-09-14  1:09                 ` bash-3.1-7 BUG Volker Quetschke
2006-09-14  2:07                   ` bash-3.1-7 BUG Christopher Faylor
2006-09-14 11:13                     ` bash-3.1-7 BUG Eric Blake
2006-09-14 16:58                       ` bash-3.1-7 BUG Volker Quetschke
2006-09-14 17:15                         ` bash-3.1-7 BUG Dave Korn
2006-09-14 17:22                           ` bash-3.1-7 BUG Volker Quetschke
2006-09-14 17:26                         ` bash-3.1-7 BUG Eric Blake
2006-09-21  3:50                         ` bash-3.1-7 BUG Christopher Layne
2006-09-14 15:21                     ` bash-3.1-7 bug mwoehlke
2006-09-21  3:48                     ` bash-3.1-7 BUG Christopher Layne
2006-09-21  3:37                 ` bash-3.1-7 BUG Christopher Layne
  -- strict thread matches above, loose matches on Subject: below --
2006-09-13  4:13 bash-3.1-7 BUG Eric Blake
