public inbox for cygwin@cygwin.com
* RE: Why text=binary mounts
@ 1998-01-08 17:20 Gary R. Van Sickle
  1998-01-09 13:40 ` Larry Hall (RFK Partners Inc)
  1998-01-09 13:40 ` Tomas Fasth
  0 siblings, 2 replies; 33+ messages in thread
From: Gary R. Van Sickle @ 1998-01-08 17:20 UTC (permalink / raw)
  To: gnu-win32

This whole UNIX/DOS/text/binary situation drives me nuts.  Why can't this 
problem be solved once and for all by everybody for all time?  We are 
talking about one '\r', for crissake.  What's wrong with this solution:

1. If your program is opening a file that you want to get lines of text 
from (eg a compiler opening a source file), give fopen a "t"
2. If your program is opening a file that you want the 'binary image of' 
(eg TAR opening its input files), give fopen a "b"
3. Any crusty old program that doesn't conform to 1 & 2 gets fixed, 
replaced, or canned
4. fread, fgets, fgetc, etc get written so that when used on a "t" mode 
file, they strip out '\r's before a '\n' and any ctrl-z at the end of the 
file.
5. fopen is written so that you *must* give it a "b" or "t" or it abort()s. 
 This weeds out the crusty old programs mentioned in 3.  (I know it isn't 
ANSI.  What have they done for us lately? :) )
6. cat to the screen or a printer is binary.  Someone writes a filter to 
convert from text to a format which will look right on the screen or 
printer and you have to 'cat stdout << filter << textfile.txt'.  (I'm 
obviously not up on my UNIX so please forgive me if this is laughably 
wrong)
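
Items 4 and 5 above can be sketched in C.  This is only an illustration
under stated assumptions: require_explicit_mode and text_getc are
hypothetical names that appear nowhere in this thread, the stream is
assumed to have been opened in binary mode so the sketch sees the raw
bytes, and Ctrl-Z (0x1A) is treated as end of file wherever it is read,
which is somewhat broader than "at the end of the file".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CTRL_Z 0x1A /* DOS end-of-file marker */

/* Item 5: abort() unless the mode string names exactly one of 't' or 'b'.
   Returns 1 for text, 0 for binary.  Hypothetical helper, not a real API. */
static int require_explicit_mode(const char *mode)
{
    assert(mode != NULL);
    int text = strchr(mode, 't') != NULL;
    int binary = strchr(mode, 'b') != NULL;
    if (text == binary) { /* neither flag, or both */
        fprintf(stderr, "mode \"%s\" must contain exactly one of 't' or 'b'\n",
                mode);
        abort();
    }
    return text;
}

/* Item 4: read one character from a raw (binary-mode) stream as text.
   A '\r' immediately before '\n' is dropped; Ctrl-Z is reported as EOF.
   The caller is expected to stop at the first EOF it receives. */
static int text_getc(FILE *fp)
{
    int c = getc(fp);
    if (c == '\r') {
        int next = getc(fp);
        if (next == '\n')
            return '\n';
        if (next != EOF)
            ungetc(next, fp); /* lone '\r': keep it, push back what followed */
        return '\r';
    }
    if (c == CTRL_Z)
        return EOF;
    return c;
}
```

With these two pieces, a reader sees the same characters for a \n file
and a \r\n file, while a program that asked for "b" would call getc()
directly and get every byte.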

With this solution you have two equally valid text file formats, one with 
\n indicating end-of-line, one with \r\n indicating EOL and ctrl-z possibly 
indicating EOF.  To the program reading lines of text, they both look the 
same.  To the program not reading lines of text, they don't care what the 
file looks like, and they get the whole 'binary image'.  No 'mount mode' is 
needed.

Let me address one sure-to-come-up complaint right now: the notion that it 
would be too much work to 'fix' all the existing code.  How much time and 
effort is wasted on 'working around' the current situation?  Certainly more 
time than it would take to search-and-replace "w" with "wt", etc.

Gary R. Van Sickle (tiberius@braemarinc.com)
Electrical Design Engineer
Braemar Inc.
11481 Rupp Dr.
Burnsville, MN 55337
(612) 890-5135 Ext. 144
Fax: (612) 882-6550


-----Original Message-----
From:	marcus@bighorn.dr.lucent.com [SMTP:marcus@bighorn.dr.lucent.com]
Sent:	Thursday, January 08, 1998 10:29 AM
To:	gnu-win32@cygnus.com
Subject:	Re: Why text=binary mounts

Jeff Fried writes:
> Porting code from Unix to the PC should NOT require the same line
> termination mode since most Unix code which reads text uses fread/getc
> which automatically handle the end-of-line.  And from the replies of most
> people i would argue that most of us would prefer to work in the native
> mode of the operating system in which we are running rather than having to
> constantly convert files between the two models simply because we use tools
> from both operating systems under NT/95.  For examples of this
> compatibility look at many of the GNU tools which handle text, the file
> handling will work under both operating systems without any change because
> they use text mode I/O which is platform independent once all files have
> been converted to the form of the native OS.

This is true as long as you are considering text files only.  The problem
comes in when you also want to deal with binary files.  On Unix systems,
of course, there is no difference in operations on either, so most Unix
programs open all files using the same open() or fopen() calls.  On systems
that differentiate between these files, it is important to add O_BINARY or
O_TEXT to the second argument of open(), and "b" for binary files to the
second argument of fopen().  This tells the underlying routines whether to
apply any translation to the file.  If nothing is specified, the OS must
choose whether or not to make translations, and that is where the text=/!=
binary mounting comes in, as this specifies the default mode.
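
A common portability idiom for the flags described above, sketched in C.
The main assumption is the fallback definition: on systems whose
<fcntl.h> has no O_BINARY (most Unix systems), defining it as 0 turns
the flag into a no-op.  The helper names open_binary_fd and
open_binary_stream are made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/* Systems that do not distinguish text from binary usually lack O_BINARY;
   defining it as 0 there lets the same source compile everywhere. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Open a file for reading with no end-of-line translation, using open().
   Returns a file descriptor, or -1 on failure. */
static int open_binary_fd(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY | O_BINARY);
}

/* The same request through stdio.  ANSI C requires "rb" to be accepted;
   it changes nothing on systems where text and binary files are identical. */
static FILE *open_binary_stream(const char *path)
{
    assert(path != NULL);
    return fopen(path, "rb");
}
```

A program written this way states its choice explicitly, so it behaves
the same whichever default the mount supplies.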

Now, there are some difficulties in this implementation.  First, since there
is no "t" that can be passed to fopen(), it is impossible to tell if a call
to fopen() wants a text mode open, or the default (blame POSIX/ANSI for that,
I guess).  If you know that all programs have consciously made a choice about
things, there would not be any need for a default, so we could assume that
the fopen() without a "b" wants a text mode open and mount things as
text!=binary.  However, if there exist Unix programs that call fopen() without
the "b" for binary files (since it isn't needed on Unix and was added to the
standard much later than the program may have been written), then these
programs won't run correctly without some additional porting effort.  The
same goes for programs that call open() without the O_BINARY bit set in the
second argument when opening binary files.

To compound this, there are times when it is extremely difficult to impossible
to tell if a file should be opened as text or binary.  For instance, should
TAR open the files that it is writing to an archive as binary or text files?
How can it determine which to use?

So, to avoid these issues, many people on this list try to avoid using anything
from the Microsoft world (except for NT/95 itself) and use only cygwin32
programs with text=binary so that any file is just like any other file just
like in Unix systems.  As a result, their text files are only marginally
exchangeable with other NT/95 users (or other NT/95 applications).  So, it
seems to me that this gives a slow, incomplete, and buggy (well, it is a Beta
release!) emulation of Unix with no advantages over Linux except that their
boss has declared that they must run NT (in true pointy-haired boss fashion).

Sure, it's fun to play with cygwin32, but to me it doesn't seem reasonable to
try to develop it as a Linux replacement.  I think that if it is to be truly
useful, cygwin32 must encourage interoperating with the native world that it
exists in.  Part of that is running well in a text!=binary mounted world.
Sure, that means that porting programs to Cygwin32 means that you have to
install an awareness of binary vs. text files, and that does mean more work
to port the programs, but it also produces more useful programs as well.

This discussion keeps coming up, which I believe supports my feeling that it
is a major issue with cygwin32.  I know that the previous iteration I ended
with just agreeing to disagree and I said that I wouldn't say any more on it,
but I just wanted to give some support to this side in this iteration and
that'll be it (this time around, at least).

marcus hall


  Unfortunately, there is no "t" that
can be supplied to fopen() to fully disambiguate the three cases that may
occur, so we have the following situation:
-
For help on using this list (especially unsubscribing), send a message to
"gnu-win32-request@cygnus.com" with one line of text: "help".


* Re: Why text=binary mounts
  1998-01-08 17:20 Why text=binary mounts Gary R. Van Sickle
  1998-01-09 13:40 ` Larry Hall (RFK Partners Inc)
@ 1998-01-09 13:40 ` Tomas Fasth
  1998-01-11 23:40   ` Fergus Henderson
  1998-01-12 17:26   ` Guy Gascoigne - Piggford
  1 sibling, 2 replies; 33+ messages in thread
From: Tomas Fasth @ 1998-01-09 13:40 UTC (permalink / raw)
  To: gnu-win32; +Cc: Gary R. Van Sickle

Gary R. Van Sickle wrote:
> 
> This whole UNIX/DOS/text/binary situation drives me nuts.  Why can't this
> problem be solved once and for all by everybody for all time?  We are
> talking about one '\r', for crissake.  What's wrong with this solution:

Gary,

Your solution is wrong because it promotes an outdated file system
concept that originated in digital stone-age operating systems. Back then
fopen("t") meant character (byte) oriented i/o, while fopen("b") meant
block oriented i/o. Back then the choice had an effect on i/o
performance. Now? Heck, modern i/o subsystems are _so_ much more
efficient and clever. Also, memory (good for i/o buffering among other
things) is nowadays so much cheaper, and virtual.

What's wrong with this solution: Do it the Unix way. A file is a file is
a file. Textual end-of-line is not a business of the i/o subsystem. In
Unix the character sequence for end-of-line ('\n' == 012 == 0x0A ==
0b00001010)
is nothing more exciting than a mutual agreement between tools that want
to share text information. How simple!

Ultimately, as a programmer you might want to use a library to share
commonly used text processing. As a bonus the details of certain strange
text processing characteristics (like what sequence of characters to
represent end-of-line) can be hidden from the programmer. Good!

So, a text processing library is exactly the right place for fixes of OS
design flaws and differences such as the use of an end-of-line sequence
taking twice as much space as necessary.

Voila! We're back to where we started. GnuWin32.

Please, please, please. Do what you like, but do NOT try to break the
Unix way of computing in the GnuWin32 distribution. If you do, then
what's the point of the whole project?

Maybe you're only interested in some groovy tools to filter your poor
DOS text files? My advice: get native ports of those tools. There is no
port? Sad, but you might have to live with that. If you can't, do the
port yourself or switch to a REAL OS (hint: ends with nix :-)

> Let me address one sure-to-come-up complaint right now: the notion that it
> would be too much work to 'fix' all the existing code.  How much time and
> effort is wasted on 'working around' the current situation?  Certainly more
> time than it would take to search-and-replace "w" with "wt", etc.

Oh no. Not in this universe. For reasons too many to list.
It may come as a surprise for you that the DOS way is not the right way.
Life can be cruel sometimes...

-- 
Tomas Fasth                     mailto:tomas.fasth@euronetics.com
EuroNetics Operation            http://euronetics.com
Mjärdevi Science Park           Office tel: +46 13 218 181
Teknikringen 1 E                Office fax: +46 13 218 182
58330 Linköping                 Mobile tel: +46 708 870 957
Sweden                          Mobile fax: +46 708 870 258

* RE: Why text=binary mounts
  1998-01-08 17:20 Why text=binary mounts Gary R. Van Sickle
@ 1998-01-09 13:40 ` Larry Hall (RFK Partners Inc)
  1998-01-09 13:40 ` Tomas Fasth
  1 sibling, 0 replies; 33+ messages in thread
From: Larry Hall (RFK Partners Inc) @ 1998-01-09 13:40 UTC (permalink / raw)
  To: Gary R. Van Sickle, gnu-win32

At 07:15 PM 1/8/98 -0600, Gary R. Van Sickle wrote:
>Let me address one sure-to-come-up complaint right now: the notion that it 
>would be too much work to 'fix' all the existing code.  How much time and 
>effort is wasted on 'working around' the current situation?  Certainly more 
>time than it would take to search-and-replace "w" with "wt", etc.

OK, you've come up with yet ANOTHER solution to this problem.  You wouldn't
be the first or probably the last.  So what are YOU going to do about it?  
You raise the issue that the "fix" is generally "dismissed" as a result of 
being too much "work".  From my perspective, the people with this view 
probably agree with the notion that the "fix" must occur but they have found 
work-arounds for their particular environments which are suitable and don't 
require much personal time investment.  Those people who have not and need 
something else complain loudly but seem just as unwilling to take on the 
herculean task.  For the moment, this is still largely a GNU-centric project 
in spirit, without a large, committed development team behind it.  That said,
let's all acknowledge that while the users of this software may agree in 
general that a particular course of action may be beneficial, unless people 
VOLUNTEER to undertake the task, changes are NOT going to happen quickly.  I
personally don't feel like I've invested any large amount of time to work-
around my text/binary issues, although I don't have many.  Certainly it 
doesn't add up to anywhere near the time investment that would be necessary 
for me to go into even some of the source of these tools and make changes to
alleviate the difficulties.  Certainly one could argue that collectively all
users have spent some significant time working out these issues for themselves
and that maybe if that time was spent, collectively and in an organized 
fashion, fixing the tools, we'd all benefit.  However, as I said, unless 
someone organizes a volunteer effort, things aren't going to change quickly. 
So, if someone wants to pick up and organize that effort, great.  MAYBE I'd
even be willing to help.  However, I'm not certain that having all sorts 
of individuals posting to this list with general algorithms for fixing the
problem is useful, unless the one who posts actually might entertain the 
thought of making the changes, organizing a group to do so, or maybe even
somehow sponsoring Cygnus to do it for them.  I'm not trying to downplay the
issue here and I certainly don't want to discourage people from discussing
issues and solutions.  But this particular issue comes up frequently and 
always ends up being debated in a vacuum.  Admonishing others on this list or
the list in general for not fixing problems one finds intolerable in the 
current software is unfair.  It's not productive.  Perhaps we can find a 
different approach in regard to dealing with the text/binary issue and 
others like it?  I can see some overall benefit from this.


Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      (781) 239-1053
8 Grove Street                          (781) 239-1655 - FAX
Wellesley, MA  02181                             

* Re: Why text=binary mounts
  1998-01-09 13:40 ` Tomas Fasth
@ 1998-01-11 23:40   ` Fergus Henderson
  1998-01-12  5:03     ` Tomas Fasth
  1998-01-12 17:26   ` Guy Gascoigne - Piggford
  1 sibling, 1 reply; 33+ messages in thread
From: Fergus Henderson @ 1998-01-11 23:40 UTC (permalink / raw)
  To: Tomas Fasth; +Cc: gnu-win32, Gary R. Van Sickle

On 09-Jan-1998, Tomas Fasth <tomas.fasth@twinspot.net> wrote:
> What's wrong with this solution: Do it the Unix way. A file is a file is
> a file. Textual end-of-line is not a business of the i/o subsystem. In
> Unix the character sequence for end-of-line ('\n' == 012 == 0x0A ==
> 0b00001010)
> is nothing more exciting than a mutual agreement between tools that want
> to share text information. How simple!

That solution would be fine, if you were designing a new OS.
But we're not!  We're trying to be compatible with an existing OS.

> Please, please, please. Do what you like, but do NOT try to break the
> Unix way of computing in the GnuWin32 distribution. If you do, then
> what's the point of the whole project?

I don't think anyone has suggested that support for binary-mode mounts
should be abandoned.  I would like to see *both* text-mode and binary-mode
mounts supported.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: < http://www.cs.mu.oz.au/~fjh >   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.

* Re: Why text=binary mounts
  1998-01-12  5:03     ` Tomas Fasth
@ 1998-01-12  4:45       ` Fergus Henderson
  0 siblings, 0 replies; 33+ messages in thread
From: Fergus Henderson @ 1998-01-12  4:45 UTC (permalink / raw)
  To: Tomas Fasth; +Cc: gnu-win32, Gary R. Van Sickle

On 12-Jan-1998, Tomas Fasth <tomas.fasth@twinspot.net> wrote:
> Fergus Henderson wrote:
> > That solution would be fine, if you were designing a new OS.
> > But we're not!  We're trying to be compatible with an existing OS.
> 
> Yes and no. As far as I understand, one of the goals of gnuwin32 is to
> minimize the burden of porting unix tools to windos.  From that
> perspective, gnuwin32 has to provide a unix-like way of computing for
> these tools.

Yes, indeed.  And it does.  The use of text=binary mode mounts does
minimize porting effort.  But it also harms interoperability with DOS
and Windows tools.

> There are a vast number of unix text tools that have '\n' hardcoded.
> I'm not saying this is a good thing, it's just how it is today.

Having '\n' hardcoded is not a problem -- the ANSI C standard requires
C implementations to convert "end-of-line", however it is represented
for that implementation, to '\n'.  The problem is slightly more
complicated: (1) missing "b" or O_BINARY flags in fopen() and open()
calls; and also occasionally (2) assuming that file offsets,
return values from read(), etc. for text files are equal to the
number of characters.
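
Point (2) can be illustrated with a small C sketch.  It assumes a
DOS-style file whose lines end in \r\n and a text-mode reader that
collapses each pair into a single '\n'; the function name
translated_length is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters a text-mode reader would deliver for n raw bytes,
   assuming every "\r\n" pair is collapsed into a single '\n'. */
static size_t translated_length(const char *raw, size_t n)
{
    assert(raw != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (raw[i] == '\r' && i + 1 < n && raw[i + 1] == '\n')
            continue; /* the '\r' vanishes; the '\n' is counted next round */
        count++;
    }
    return count;
}
```

For the eight bytes "ab\r\ncd\r\n" the function returns 6, so a program
that treats the number of characters it has read as a byte offset into
the file ends up two bytes short of the real position.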

> Further on, I guess we don't want to unnecessarily contaminate
> the stdio with a filthy text/binary i/o paradigm.

Both text and binary modes are necessary (ANSI C requires it), and both
text=binary and text!=binary mode mounts are necessary (to achieve
different aims).  Of course we don't want to _unnecessarily_ introduce
complexity, but in this case the complexity is necessary.  Without it,
there is no way to achieve the differing goals of gnu-win32: ease of
porting, and possibility of full interoperability.

> Therefore, if we want
> to have access to these tools on the windos platform, and have better
> things to do than rewriting existing unix tools, we have to find a way
> to serve these tools with under-the-cover end-of-line translations. The
> question is: HOW?

I think that we already have a design that can solve that problem...

> text=binary/text!=binary is one possible (and existing) solution.

Indeed.

> Another possible solution could be to map file extensions to the
> appropriate mode.

That doesn't work, because no database of file extensions can ever be
complete, because file extensions do not uniquely identify file types,
and because it is often necessary to distinguish between text and binary
mode even for files with no extension.
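
To make the objection concrete, here is a minimal C sketch of such an
extension table.  Everything in it is an assumption made for
illustration: the table entries, the enum, and the function name
guess_mode do not come from any real tool.

```c
#include <assert.h>
#include <string.h>

enum file_mode { MODE_TEXT, MODE_BINARY, MODE_UNKNOWN };

struct ext_rule {
    const char *ext;
    enum file_mode mode;
};

/* A deliberately tiny, hypothetical table; a real one would be longer
   but still incomplete. */
static const struct ext_rule rules[] = {
    { ".c",   MODE_TEXT   },
    { ".txt", MODE_TEXT   },
    { ".tar", MODE_BINARY },
    { ".exe", MODE_BINARY },
};

/* Guess the open mode from the file name alone. */
static enum file_mode guess_mode(const char *name)
{
    assert(name != NULL);
    const char *dot = strrchr(name, '.');
    if (dot == NULL)
        return MODE_UNKNOWN; /* "Makefile", "README": no extension at all */
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++)
        if (strcmp(dot, rules[i].ext) == 0)
            return rules[i].mode;
    return MODE_UNKNOWN; /* extension not in the table */
}
```

Names such as "Makefile" or "data.dat" come back as MODE_UNKNOWN, and
the caller is left needing a default anyway.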

> And I'm sure we all have our own favorite solution :)

Sure, but unless they are significantly better than the existing solution
of text=binary/text!=binary mode mounts (and I have not heard any such
solutions mentioned), they are just irrelevant distractions.

> Somewhere along the line someone has to decide what the primary task of
> gnuwin32 is. It cannot possibly solve every problem that occurs when
> trying to merge two incompatible programming platforms.

No, but the two (contradictory) goals of minimizing porting effort
and achieving full interoperability with native tools are both
important enough that they deserve to be supported.

> I was hoping
> that one primary goal of gnuwin32 would be to compile and run unix tools
> on the windos platform without modifications.

Yes, agreed.  That's why we should support text=binary mode mounts.
But we should also support text!=binary mode mounts, so that those
who are willing to put a bit more porting effort can achieve full
interoperability with native tools.

> We ought to be thankful to Cygnus for their honorable initiative, and to
> all of you contributing to making gnuwin32 even better.

Agreed.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: < http://www.cs.mu.oz.au/~fjh >   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.

* Re: Why text=binary mounts
  1998-01-11 23:40   ` Fergus Henderson
@ 1998-01-12  5:03     ` Tomas Fasth
  1998-01-12  4:45       ` Fergus Henderson
  0 siblings, 1 reply; 33+ messages in thread
From: Tomas Fasth @ 1998-01-12  5:03 UTC (permalink / raw)
  To: gnu-win32; +Cc: Fergus Henderson, Gary R. Van Sickle

Fergus Henderson wrote:
> That solution would be fine, if you were designing a new OS.
> But we're not!  We're trying to be compatible with an existing OS.

Yes and no. As far as I understand, one of the goals of gnuwin32 is to
minimize the burden of porting unix tools to windos. From that
perspective, gnuwin32 has to provide a unix-like way of computing for
these tools. There are a vast number of unix text tools that have '\n'
hardcoded. I'm not saying this is a good thing, it's just how it is
today. Further on, I guess we don't want to unnecessarily contaminate
the stdio with a filthy text/binary i/o paradigm. Therefore, if we want
to have access to these tools on the windos platform, and have better
things to do than rewriting existing unix tools, we have to find a way
to serve these tools with under-the-cover end-of-line translations. The
question is: HOW?

text=binary/text!=binary is one possible (and existing) solution.
Another possible solution could be to map file extensions to the
appropriate mode. And I'm sure we all have our own favorite solution :)

Somewhere along the line someone has to decide what the primary task of
gnuwin32 is. It cannot possibly solve every problem that occurs when
trying to merge two incompatible programming platforms. I was hoping
that one primary goal of gnuwin32 would be to compile and run unix tools
on the windos platform without modifications. That would be something,
and we're pretty close already!

We ought to be thankful to Cygnus for their honorable initiative, and to
all of you contributing to making gnuwin32 even better.

Live in peace!
-- 
Tomas Fasth                     mailto:tomas.fasth@euronetics.com
EuroNetics Operation            http://euronetics.com
Mjärdevi Science Park           Office tel: +46 13 218 181
Teknikringen 1 E                Office fax: +46 13 218 182
58330 Linköping                 Mobile tel: +46 708 870 957
Sweden                          Mobile fax: +46 708 870 258

* Re: Why text=binary mounts
  1998-01-09 13:40 ` Tomas Fasth
  1998-01-11 23:40   ` Fergus Henderson
@ 1998-01-12 17:26   ` Guy Gascoigne - Piggford
  1 sibling, 0 replies; 33+ messages in thread
From: Guy Gascoigne - Piggford @ 1998-01-12 17:26 UTC (permalink / raw)
  To: Tomas Fasth, gnu-win32

At 08:49 PM 1/9/98 +0100, you wrote:
>Oh no. Not in this universe. For reasons too many to list.
>It may come as a surprise for you that the DOS way is not the right way.
>Life can be cruel sometimes...

But the point is that the UNIX way isn't the one true way either, we have
to cope with both, period.  Ignoring the difference is certainly one
approach, but not necessarily the only one.

Guy

-- 
Guy Gascoigne - Piggford (ggp@informix.com)
Software Engineer, Informix Software, Inc. (Portland, Oregon)


* RE: Why text=binary mounts
  1998-01-16  2:56   ` Jeffrey C. Fried
@ 1998-01-19 10:32     ` Guy Gascoigne - Piggford
  0 siblings, 0 replies; 33+ messages in thread
From: Guy Gascoigne - Piggford @ 1998-01-19 10:32 UTC (permalink / raw)
  To: Jeffrey C. Fried, Richard Thomas, gnu-win32

At 11:55 PM 1/15/98 -0700, Jeffrey C. Fried wrote:
>That is, if we can agree that certain tools are always applied solely to
>text, then automatic conversion of the native EOL to the single '\n'
>character is the solution which will make these tools work in any
>environment.  If we don't agree, then we must always construct 'text' files
>according to the UNIX description of a text file when using the gnuwin32
>tools and so it is not easy to use tools of mixed origin on a non-UNIX
>system (for example emacs under NT will rewrite files with CRLF while bash
>will need these files in LF format).  I prefer to use the native format so
>that i don't have to remember which tools produce which results and when i
>have to convert between the two formats.

Good points, I agree with them completely.

Guy


-- 
Guy Gascoigne - Piggford (ggp@informix.com)
Software Engineer, Informix Software, Inc. (Portland, Oregon)


* Re: Why text=binary mounts
  1998-01-15 15:16 Richard Thomas
@ 1998-01-18 16:07 ` Steven R. Newcomb
  0 siblings, 0 replies; 33+ messages in thread
From: Steven R. Newcomb @ 1998-01-18 16:07 UTC (permalink / raw)
  To: richard; +Cc: gnu-win32

> >If you want to read a line of 
> >text, it seems to me that the most logical thing to do would be to use a 
> >library which gave you access to functions such as fscanf() etc. which have 
> >no meaning for generic (binary) files.  This library then would be the 
> >place to do things like making all text files look the same to the 
> >programmer whether they're DOS/UNIX/Mac/whatever, in the same way that a 
> >PCX library might 'gloss over' the differences between the different PCX 
> >versions.
> 
> Good point. It's also important to remember that not all text is ASCII or
> ANSI, there's EBCDIC and a whole bunch of others too. Maybe a decent
> text library could even handle Unicode files or something (I know nothing
> about Unicode so don't flame me please) as well. Personally, when I open a
> file, I expect to get what's there. That *should* be the default. A file is
> just a bunch of bytes and that's the way it should be treated. If you want
> some kind of filter or interpretation, get a library.
> 
> A well written text processing program should recognise any combination of
> <cr> and <lf> as an end-of-line marker and should write either the
> operating system default (But the OS should have no concept of "text"
> files) or ANSI standard (if there is such a beast) or maybe even a format
> selected by the user.
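
A line reader along the lines described in the quoted text might look
like the following C sketch.  It assumes the stream was opened in
binary mode, and the name read_line_any_eol is invented for the
example; it accepts \n, \r\n, or a lone \r as the end of a line.

```c
#include <assert.h>
#include <stdio.h>

/* Read one line into buf (at most size-1 characters plus '\0'), treating
   "\n", "\r\n", and a lone "\r" all as end-of-line.  The terminator is not
   stored.  Returns the stored length, or -1 at end of file with nothing read. */
static int read_line_any_eol(FILE *fp, char *buf, int size)
{
    assert(fp != NULL && buf != NULL && size > 0);
    int len = 0;
    int c;
    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            break;
        if (c == '\r') {
            int next = getc(fp);
            if (next != '\n' && next != EOF)
                ungetc(next, fp); /* lone '\r': next char starts a new line */
            break;
        }
        if (len < size - 1)
            buf[len++] = (char)c; /* characters beyond the buffer are dropped */
    }
    buf[len] = '\0';
    return (c == EOF && len == 0) ? -1 : len;
}
```

Which terminator to write back out (OS default, a fixed standard, or a
user choice) is a separate decision that this sketch leaves open.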

I like the way both of you think.  Sounds to me like you should both
take a look at SGML, ISO 8879:1986.  And particularly at the SGML
Extended Facilities found in ISO/IEC 10744:1997 (see
http://www.hytime.org for pointers including the standard itself).
You will be surprised and pleased, I think, to discover that there is
such a beast, and, marvelous to say, it's already internationally
standardized.  Of course, the paradigm assumes that there are
documents (SGML documents, of course) that declare the notations of
information resources, and that optionally declare the libraries
and/or applications that understand notations of resources.  There are
also storage manager declarations that handle such things as
encryption, sealing, alternative character sets, compression,
containerizations such as tar, multimedia interleaving, etc.  Check it
out.  Some of us, at least, think it's the future of content
management.


-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA

* Re: Why text=binary mounts
  1998-01-15 15:16 Gary R. Van Sickle
@ 1998-01-17 23:47 ` Fergus Henderson
  0 siblings, 0 replies; 33+ messages in thread
From: Fergus Henderson @ 1998-01-17 23:47 UTC (permalink / raw)
  To: Gary R. Van Sickle; +Cc: gnu-win32

On 15-Jan-1998, Gary R. Van Sickle <tiberius@braemarinc.com> wrote:
> > Have you tried reading the manual?
> > GNU make actually comes with quite good documentation.
> 
> Yeah, my problem was trying to get automatic dependency checking combined 
> with lack of patience.

To get automatic dependency checking using gcc and GNU Make,
all you need to do is to include the following in your Makefile:

	CC = gcc
	CFLAGS = -MD
	-include *.d

-- 
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: < http://www.cs.mu.oz.au/~fjh >   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.

* RE: Why text=binary mounts
  1998-01-13 10:42 Gary R. Van Sickle
  1998-01-15  7:26 ` Peter Dalgaard BSA
@ 1998-01-17 11:00 ` John A. Turner
  1 sibling, 0 replies; 33+ messages in thread
From: John A. Turner @ 1998-01-17 11:00 UTC (permalink / raw)
  To: gnu-win32

Gary R. Van Sickle writes:

 > Yeah, you don't have the GNU make utility (I'll be deep in the cold, cold 
 > ground before I figure out how to use that one, and not for lack of 
 > trying).  Yeah, you don't have EMACS (how do you open a file with it?). 
 >  Yeah, no vi.

All I can say is that I consider those events (learning to use make and
learning Emacs) two of the most important in my development.

In fact, aside from learning my first programming language, I'm having
trouble thinking of events of more significance.

--
John A. Turner                       mailto:turner@blueskystudios.com
Senior Research Associate            http://www.blueskystudios.com
Blue Sky | VIFX                      http://www.vifx.com
One South Road, Harrison, NY 10528   http://www.lanl.gov/home/turner
Phone: 914-381-8400                  Fax: 914-381-9790/1


* RE: Why text=binary mounts
  1998-01-14  3:39 ` Richard Thomas
@ 1998-01-16  2:56   ` Jeffrey C. Fried
  1998-01-19 10:32     ` Guy Gascoigne - Piggford
  0 siblings, 1 reply; 33+ messages in thread
From: Jeffrey C. Fried @ 1998-01-16  2:56 UTC (permalink / raw)
  To: Richard Thomas, gnu-win32

I understand your logic, however, it belies the fundamental problem, namely
that no one is writing their tools with such well constructed text
libraries.  And more practically speaking there are simpler solutions once
we recognize that most of us use the unix tools on files of specific types.
That is, i use tools like sed and awk only on text files.  I cannot think
of a specific example where i have mixed the file type (text vs. binary)
usage of any of the unix tools, except possibly cmp.  

While this may be a distinction which you don't feel i should have to draw,
practically speaking most of us do and it works well for us.  That is, when
i use tar, i use it on a file of a specific format which tar generates.
While i could use sed or awk on a tar file, i would never expect it to work
properly, simply because i don't expect the authors of these tools to have
given any thought to having sed or awk used on anything except a file which
has a notion of LINE delimited by some End-Of-Line character, or
characters.  This perspective, which you have pointed out should not be
necessary, in practice is necessary and practically speaking useful.  And
it is this perspective i am attempting to address when i discuss the notion
of native text mode handling.  

That is, if we can agree that certain tools are always applied solely to
text, then automatic conversion of the native EOL to the single '\n'
character is the solution which will make these tools work in any
environment.  If we don't agree, then we must always construct 'text' files
according to the UNIX description of a text file when using the gnuwin32
tools and so it is not easy to use tools of mixed origin on a non-UNIX
system (for example emacs under NT will rewrite files with CRLF while bash
will need these files in LF format).  I prefer to use the native format so
that i don't have to remember which tools produce which results and when i
have to convert between the two formats.

... jeff

At 11:39 AM 1/14/98 +0000, Richard Thomas wrote:
>>I guess the point I was trying to make is that it doesn't seem to me that 
>>there is a good argument for there to be text processing functionality in 
>>the fopen() family of functions (I know, it's a little late now!).  The 
>>difference I see between 'modes' and 'formats' is this:  we don't have a 
>>JPG mode or a WAV mode in fopen(), so why do we have a text mode?  When 
>>somebody wants to open up and manipulate a JPEG file, they use a JPEG 
>>library that gives them access to methods that are meaningful only on JPEG 
>>files.  I see text files in the same way.  If you want to read a line of 
>>text, it seems to me that the most logical thing to do would be to use a 
>>library which gave you access to functions such as fscanf() etc. which have 
>>no meaning for generic (binary) files.  This library then would be the 
>>place to do things like making all text files look the same to the 
>>programmer whether they're DOS/UNIX/Mac/whatever, in the same way that a 
>>PCX library might 'gloss over' the differences between the different PCX 
>>versions.
>
>Good point. It's also important to remember that not all text is ASCII or
>ANSI, there's EBCDIC and a whole bunch of others too. Maybe a decent
>text library could even handle Unicode files or something (I know nothing
>about Unicode so don't flame me please) as well. Personally, when I open a
>file, I expect to get what's there. That *should* be the default. A file is
>just a bunch of bytes and that's the way it should be treated. If you want
>some kind of filter or interpretation, get a library.
>
>A well-written text processing program should recognise any combination of
><cr> and <lf> as an end-of-line marker and should write either the
>operating system default (But the OS should have no concept of "text"
>files) or ansi standard (if there is such a beast) or maybe even a format
>selected by the user.
>
>Even better would be that your program could register a callback function
>with the text processing library allowing complete control. For example, I
>define a text file format with each line being a field of 81 characters,
>the first byte representing the length of text on the line, subsequent
>characters being represented by 2*the alphabet position +1 if uppercase
>(a=2, A=3, b=4, etc.). How does fopen(fname, "rt") handle this? It is
>a text file. It doesn't use ANSI characters but it could and it still
>wouldn't be handled correctly. So how is this "text mode"? It's not, it's
>"let's kludge the end-of-line" mode. Text mode should imply that there's no
>post-processing to be done on the input, you open the file with the proper
>format filter and treat it as text from there-on-in.
>
>Other advantages: Opening binary files with a hex-mode filter or even
>executables with a disassembly/assembly filter and, of course, using
>whichever editor you prefer as long as it's compiled with the text-library.
>
>Of course, it could be done by patching fopen but the behaviour for that is
>already standardised and it would be a kludge. What's needed is a properly
>designed library.
>
>Rich
--
Jeffrey C. Fried      Jeff@Fried.net

   Because a liar tells the truth does not mean the truth is a lie.

NOTICE: I charge $500.00 for each unsolicited advertisement i receive as email
to cover the cost of my time to review and possibly respond to your
advertisement.

* Re: Why text=binary mounts
  1998-01-12 14:09 Gary R. Van Sickle
  1998-01-14  3:39 ` Richard Thomas
@ 1998-01-16  2:56 ` Benjamin Riefenstahl
  1 sibling, 0 replies; 33+ messages in thread
From: Benjamin Riefenstahl @ 1998-01-16  2:56 UTC (permalink / raw)
  To: gnu-win32

Hi Gary,


Gary R. Van Sickle wrote:
> I guess the point I was trying to make is that it doesn't seem to me that
> there is a good argument for there to be text processing functionality in
> the fopen() family of functions (I know, it's a little late now!).

Yeah, that's history, isn't it? fopen() is used for text directly. That
seems to be part of the Unix heritage of C and I would think it does
have its advantages. After all the vast majority of Unix tools and
Unix-style tools are dealing with text only.

> How about this:  we say, "we're stuck with
> text mode, we'll do better next time, but to alleviate the current
> situation, we take the TAL and put it in the fopen() functions, so that
> when somebody does a fopen(???, "rt"), they are getting the TAL sitting on
> top of a binary stream"?  That way all the old programs which are
> ANSI-compliant (we've canned those that aren't) get the cool new 'any text
> file' functionality *automatically*.  Plus, since the TAL is portable, any
> new GNUWin's or DJGPP's or whatever also get it automatically.  This looks
> to me like it solves the problem in the way you were describing, but
> doesn't require the addition of another mode.

That's basically what I thought. I would try to keep this text mode
layer very simple and small though. When you talk about "TAL" in my mind
I get the image of some library that does all kind of things even
including hooks for stuff like HTML, SGML, and I-don't-know-what-else.
Nice, but I would not want it in the basic RTL. OTOH if we are talking
about ASCII text, with only support for Unix, DOS and Mac line ends,
that would be what I need for ported tools right now. And it's very easy
to implement in just a few lines.
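
As an illustration only (this is not code from the thread), a minimal
sketch of such a small text layer in C might look like the following. It
assumes the stream was opened in binary mode, and the name
read_line_any_eol is made up for the example; a real RTL change would sit
inside the stdio text-mode code rather than in a separate helper.

```c
#include <stdio.h>

/* Hypothetical helper, not part of any existing RTL.  Reads one line from
 * a stream opened in binary mode ("rb"), accepting LF (Unix), CR LF (DOS)
 * or a lone CR (Mac) as the line end.  Stores at most size-1 characters
 * in buf, without the line end.  Returns the number of characters stored,
 * or -1 at end of file when nothing was read. */
int read_line_any_eol(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return -1;

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            break;                      /* Unix line end */
        if (c == '\r') {
            int next = getc(fp);        /* DOS CR LF, or a lone Mac CR? */
            if (next != '\n' && next != EOF)
                ungetc(next, fp);       /* lone CR: give the next char back */
            break;
        }
        if (n + 1 < size)
            buf[n++] = (char)c;         /* overlong lines are truncated */
    }
    buf[n] = '\0';
    return (c == EOF && n == 0) ? -1 : (int)n;
}
```

A caller would loop until the function returns -1. Ctrl-Z handling and
writing are left out on purpose to keep the sketch short.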


so long, benny
======================================
Benjamin Riefenstahl (benny@crocodial.de)
Crocodial Communications EntwicklungsGmbH
Ophagen 16a, D-20257 Hamburg, Germany

* RE: Why text=binary mounts
@ 1998-01-15 15:16 Richard Thomas
  1998-01-18 16:07 ` Steven R. Newcomb
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Thomas @ 1998-01-15 15:16 UTC (permalink / raw)
  To: gnu-win32

>I guess the point I was trying to make is that it doesn't seem to me that 
>there is a good argument for there to be text processing functionality in 
>the fopen() family of functions (I know, it's a little late now!).  The 
>difference I see between 'modes' and 'formats' is this:  we don't have a 
>JPG mode or a WAV mode in fopen(), so why do we have a text mode?  When 
>somebody wants to open up and manipulate a JPEG file, they use a JPEG 
>library that gives them access to methods that are meaningful only on JPEG 
>files.  I see text files in the same way.  If you want to read a line of 
>text, it seems to me that the most logical thing to do would be to use a 
>library which gave you access to functions such as fscanf() etc. which have 
>no meaning for generic (binary) files.  This library then would be the 
>place to do things like making all text files look the same to the 
>programmer whether they're DOS/UNIX/Mac/whatever, in the same way that a 
>PCX library might 'gloss over' the differences between the different PCX 
>versions.

Good point. It's also important to remember that not all text is ASCII or
ANSI, there's EBCDIC and a whole bunch of others too. Maybe a decent
text library could even handle Unicode files or something (I know nothing
about Unicode so don't flame me please) as well. Personally, when I open a
file, I expect to get what's there. That *should* be the default. A file is
just a bunch of bytes and that's the way it should be treated. If you want
some kind of filter or interpretation, get a library.

A well-written text processing program should recognise any combination of
<cr> and <lf> as an end-of-line marker and should write either the
operating system default (But the OS should have no concept of "text"
files) or ansi standard (if there is such a beast) or maybe even a format
selected by the user.

Even better would be that your program could register a callback function
with the text processing library allowing complete control. For example, I
define a text file format with each line being a field of 81 characters,
the first byte representing the length of text on the line, subsequent
characters being represented by 2*the alphabet position +1 if uppercase
(a=2, A=3, b=4, etc.). How does fopen(fname, "rt") handle this? It is
a text file. It doesn't use ANSI characters but it could and it still
wouldn't be handled correctly. So how is this "text mode"? It's not, it's
"let's kludge the end-of-line" mode. Text mode should imply that there's no
post-processing to be done on the input, you open the file with the proper
format filter and treat it as text from there-on-in.

Other advantages: Opening binary files with a hex-mode filter or even
executables with a disassembly/assembly filter and, of course, using
whichever editor you prefer as long as it's compiled with the text-library.

Of course, it could be done by patching fopen but the behaviour for that is
already standardised and it would be a kludge. What's needed is a properly
designed library.

Rich


* RE: Why text=binary mounts
@ 1998-01-15 15:16 Gary R. Van Sickle
  1998-01-17 23:47 ` Fergus Henderson
  0 siblings, 1 reply; 33+ messages in thread
From: Gary R. Van Sickle @ 1998-01-15 15:16 UTC (permalink / raw)
  To: gnu-win32

Fergus Henderson wrote:
>> Yeah, you don't have the GNU make utility (I'll be deep in the cold, cold
>> ground before I figure out how to use that one, and not for lack of
>> trying).
>
> Have you tried reading the manual?
> GNU make actually comes with quite good documentation.

Yeah, my problem was trying to get automatic dependency checking combined 
with lack of patience.  I finally gave up and went to batch files.  I 
agree, the GNU make documentation is good. Maybe if I had a full Unix 
setup, everything would end up working better.

Gary R. Van Sickle (tiberius@braemarinc.com)
Electrical Design Engineer
Braemar Inc.
11481 Rupp Dr.
Burnsville, MN 55337


-----Original Message-----
From:	Fergus Henderson [SMTP:fjh@cs.mu.OZ.AU]
Sent:	Wednesday, January 14, 1998 8:56 PM
To:	Gary R. Van Sickle
Subject:	Re: Why text=binary mounts

On 13-Jan-1998, Gary R. Van Sickle <tiberius@braemarinc.com> wrote:
> Yeah, you don't have the GNU make utility (I'll be deep in the cold, cold 
> ground before I figure out how to use that one, and not for lack of
> trying).

Have you tried reading the manual?
GNU make actually comes with quite good documentation.

> So what is Unix'
> advantage over MS when it comes to software development?  Is it simply
> that the GNU stuff is available free?

The advantage is the number of existing tools and the ease of
building your own tools, by combining scripts, existing tools, etc.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.


* RE: Why text=binary mounts
@ 1998-01-15  7:26 Immanuel Litzroth
  0 siblings, 0 replies; 33+ messages in thread
From: Immanuel Litzroth @ 1998-01-15  7:26 UTC (permalink / raw)
  To: gnu-win32, tiberius

> Yeah, you don't have EMACS (how do you open a file with it?).

Opening a file in Emacs:
     For Windows users: Go to the Files menu (at the top of the Emacs
     screen) and click Open File.
     For Unix users: Press C-x C-f.
You have to know which file you want to open; the system has no way of
knowing which file you want to open. While this is arguably a shortcoming
of Emacs, it can be remedied by giving it a filename when it prompts you
for one.

> So what is Unix' advantage over MS when it comes to software
> development?

Those who can figure out how to open a file in Emacs are fit for Unix
development. Emacs is not an editor, it is just a test case for selecting
likely Unix developers.

> Is it simply that the GNU stuff is available free?

The fact that most of the GNU/Linux stuff is free, stable and well
documented does contribute to the partiality of "UNIX" developers to this
"stuff", although it is more likely they use it just to spite the normal
developers.
Immanuel






* Re: Why text=binary mounts
  1998-01-13 10:42 Gary R. Van Sickle
@ 1998-01-15  7:26 ` Peter Dalgaard BSA
  1998-01-17 11:00 ` John A. Turner
  1 sibling, 0 replies; 33+ messages in thread
From: Peter Dalgaard BSA @ 1998-01-15  7:26 UTC (permalink / raw)
  To: Gary R. Van Sickle; +Cc: gnu-win32

"Gary R. Van Sickle" <tiberius@braemarinc.com> writes:

> Yeah, you don't have the GNU make utility (I'll be deep in the cold, cold 
> ground before I figure out how to use that one, and not for lack of 
> trying).  Yeah, you don't have EMACS (how do you open a file with it?). 
>  Yeah, no vi.
> 
> My point here is that it seems to me that MS/Borland have the distinct 
> usability advantage, and like it or not, we're all users.  So what is Unix' 
> advantage over MS when it comes to software development?  Is it simply that 
> the GNU stuff is available free?

For large projects, and especially when porting Unix programs(!), the
advantage is quite striking. It's very much one of these "XX can do it
easier IF it can do it"-situations. "Make" can be used for many
things, not just programs but also documentation, testing, etc.
The whole Unix approach is summarized as "Just Type Make", i.e.
automate as much as you can, even if it costs a bit of head scratching,
because you'll only need to do it once.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907


* RE: Why text=binary mounts
  1998-01-12 14:09 Gary R. Van Sickle
@ 1998-01-14  3:39 ` Richard Thomas
  1998-01-16  2:56   ` Jeffrey C. Fried
  1998-01-16  2:56 ` Benjamin Riefenstahl
  1 sibling, 1 reply; 33+ messages in thread
From: Richard Thomas @ 1998-01-14  3:39 UTC (permalink / raw)
  To: gnu-win32

>I guess the point I was trying to make is that it doesn't seem to me that 
>there is a good argument for there to be text processing functionality in 
>the fopen() family of functions (I know, it's a little late now!).  The 
>difference I see between 'modes' and 'formats' is this:  we don't have a 
>JPG mode or a WAV mode in fopen(), so why do we have a text mode?  When 
>somebody wants to open up and manipulate a JPEG file, they use a JPEG 
>library that gives them access to methods that are meaningful only on JPEG 
>files.  I see text files in the same way.  If you want to read a line of 
>text, it seems to me that the most logical thing to do would be to use a 
>library which gave you access to functions such as fscanf() etc. which have 
>no meaning for generic (binary) files.  This library then would be the 
>place to do things like making all text files look the same to the 
>programmer whether they're DOS/UNIX/Mac/whatever, in the same way that a 
>PCX library might 'gloss over' the differences between the different PCX 
>versions.

Good point. It's also important to remember that not all text is ASCII or
ANSI, there's EBCDIC and a whole bunch of others too. Maybe a decent
text library could even handle Unicode files or something (I know nothing
about Unicode so don't flame me please) as well. Personally, when I open a
file, I expect to get what's there. That *should* be the default. A file is
just a bunch of bytes and that's the way it should be treated. If you want
some kind of filter or interpretation, get a library.

A well-written text processing program should recognise any combination of
<cr> and <lf> as an end-of-line marker and should write either the
operating system default (But the OS should have no concept of "text"
files) or ansi standard (if there is such a beast) or maybe even a format
selected by the user.

Even better would be that your program could register a callback function
with the text processing library allowing complete control. For example, I
define a text file format with each line being a field of 81 characters,
the first byte representing the length of text on the line, subsequent
characters being represented by 2*the alphabet position +1 if uppercase
(a=2, A=3, b=4, etc.). How does fopen(fname, "rt") handle this? It is
a text file. It doesn't use ANSI characters but it could and it still
wouldn't be handled correctly. So how is this "text mode"? It's not, it's
"let's kludge the end-of-line" mode. Text mode should imply that there's no
post-processing to be done on the input, you open the file with the proper
format filter and treat it as text from there-on-in.
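
To make the record format above concrete, here is a rough sketch in C of
how a caller-supplied decoding callback could work. Everything named here
(RECORD_SIZE, decode_fn, decode_alpha, decode_record) is invented for the
illustration and belongs to no existing library; the sketch also assumes an
ASCII execution character set, so that the letters are contiguous.

```c
#include <stddef.h>

#define RECORD_SIZE 81   /* one length byte followed by 80 encoded characters */

/* Hypothetical callback type: maps one stored byte to a character. */
typedef char (*decode_fn)(unsigned char code);

/* Decoder for the example encoding: code = 2 * alphabet position, plus 1
 * for upper case (a=2, A=3, b=4, ...).  Any other code becomes '?'.
 * Assumes ASCII, where 'a'..'z' and 'A'..'Z' are contiguous. */
char decode_alpha(unsigned char code)
{
    unsigned pos = code / 2;            /* 1..26 for valid codes */

    if (pos < 1 || pos > 26)
        return '?';
    return (char)(((code & 1) ? 'A' : 'a') + (pos - 1));
}

/* Decodes one fixed-size record into out, which must hold at least
 * RECORD_SIZE bytes, using the caller-supplied callback.
 * Returns the length of the decoded line. */
size_t decode_record(const unsigned char rec[RECORD_SIZE], decode_fn decode,
                     char *out)
{
    size_t len = rec[0];
    size_t i;

    if (len > RECORD_SIZE - 1)
        len = RECORD_SIZE - 1;          /* clamp a corrupt length byte */
    for (i = 0; i < len; i++)
        out[i] = decode(rec[1 + i]);
    out[len] = '\0';
    return len;
}
```

The point of passing the decoder as a parameter is that the record reader
never needs to know the character encoding; a different format would only
supply a different callback.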

Other advantages: Opening binary files with a hex-mode filter or even
executables with a disassembly/assembly filter and, of course, using
whichever editor you prefer as long as it's compiled with the text-library.

Of course, it could be done by patching fopen but the behaviour for that is
already standardised and it would be a kludge. What's needed is a properly
designed library.

Rich


* Re: Why text=binary mounts
  1998-01-12 10:42 marcus
@ 1998-01-13 10:53 ` Benjamin Riefenstahl
  0 siblings, 0 replies; 33+ messages in thread
From: Benjamin Riefenstahl @ 1998-01-13 10:53 UTC (permalink / raw)
  To: marcus; +Cc: gnu-win32

marcus@bighorn.dr.lucent.com wrote:
> ... I
> always thought that the original intent was that \n would represent whatever
> the local newline character was, whether it was LF or something else. I have
> never seen this pairing broken, and I'm sure it would wreak havoc on many
> programs if it was changed.

While we are at the topic of history, actually Apple's own C compiler
for MacOS switches those two so that in that dialect \n *is* the text
file line delimiter. And yes this *does* "wreak havoc on many programs".
======================================
Benjamin Riefenstahl (benny@crocodial.de)
Crocodial Communications EntwicklungsGmbH
Ophagen 16a, D-20257 Hamburg, Germany

* RE: Why text=binary mounts
@ 1998-01-13 10:42 Gary R. Van Sickle
  1998-01-15  7:26 ` Peter Dalgaard BSA
  1998-01-17 11:00 ` John A. Turner
  0 siblings, 2 replies; 33+ messages in thread
From: Gary R. Van Sickle @ 1998-01-13 10:42 UTC (permalink / raw)
  To: gnu-win32

Several people have made comments to the effect that MS development 
environments are somehow not as good as Unix for developing software.  I 
would be interested in hearing why that perception exists.  I do not 
believe that it is entirely accurate.  MS's Visual C++ 4.0 and 5.0 (the 
environments I use on a daily basis) are quite good, even for developing 
command-line programs, and I've heard a lot of people say Borland's is even 
better.

Yeah, you don't have the GNU make utility (I'll be deep in the cold, cold 
ground before I figure out how to use that one, and not for lack of 
trying).  Yeah, you don't have EMACS (how do you open a file with it?). 
 Yeah, no vi.

My point here is that it seems to me that MS/Borland have the distinct 
usability advantage, and like it or not, we're all users.  So what is Unix' 
advantage over MS when it comes to software development?  Is it simply that 
the GNU stuff is available free?

Gary R. Van Sickle (tiberius@braemarinc.com)
Electrical Design Engineer
Braemar Inc.
11481 Rupp Dr.
Burnsville, MN 55337
(612) 890-5135 Ext. 144
Fax: (612) 882-6550


-----Original Message-----
From:	marcus@bighorn.dr.lucent.com [SMTP:marcus@bighorn.dr.lucent.com]
Sent:	Monday, January 12, 1998 10:59 AM
To:	gnu-win32@cygnus.com
Subject:	Re: Why text=binary mounts

Tomas Fasth <tomas.fasth@twinspot.net> writes:
> Microsoft Windows dominates as a desktop platform. This does not
> necessarily mean that it's a preferred platform in general for program
> development. Note that I said 'preferred'. A great majority of the
> programmers today have to make their living developing for MS Windows.
> But that does not mean that all these programmers are putting their vote
> on it as the preferred environment for program development. As an
> example, the steady growth of installed Linux and FreeBSD systems says
> otherwise.

It is an unfortunate fact of life now that Microsoft software controls 85%
of the computers in the world.  That's pretty dominant, and unfortunately,
it will likely have a strong hold on that market share for quite some time
to come.  Unix may have had a chance back in the late 80s when there was
a big attempt to unify the Unix market to combat Microsoft, but alas the
effort was doomed to failure by the dark forces of entropy.  Since then,
the world has been being taken over by Microsoft...

If you are writing software that you want to sell to the largest group of
users, you have to address it to Microsoft.  Sure, the remaining 15% of the
market can buy some product, but not all that much, plus that 15% is split
between all of the non-MS world, so only a very small portion is actually
Unix, MacOS, etc.  So, if you want to appeal to the largest user base, you
really have to work in the MS world.  That's a pretty good reason to
"prefer" MS.  Not that it's the nicest environment to write code in, but
because it gives the better return on your programming investment.

> ... I'm talking about Mac ('\r'), Unix ('\n') and DOS ('\r\n'),
> where the Mac choice is by far most hostile to the programmer community,
> since it seem to have been made most recently...

I guess it is possible...  The MacOS CR delimiter certainly dates back to
the beginnings of AppleDOS on the Apple ][, around 1976 or so, I think.
The CR LF of DOS comes by way of CP/M, which came from RT-11 (and other
DEC PDP-11 OSs), so it came from the late 60s.  Unix actually seems to be
the odd-ball by using LF as the delimiter.  How many times have you gotten
into raw mode, where the return key no longer terminates your input lines?
Using CR as the delimiter seems most intuitive, since that's actually what
you usually type to terminate a line.  CR LF makes sense because that's
actually what you send to a TTY or CRT terminal to go to the next line.
There were some chain printers that used to use LF to move to the start of
the next line, but this was hardly common.

BTW, I think that it is Unix-centric to think of CR as \r and LF as \n.  I
always thought that the original intent was that \n would represent whatever
the local newline character was, whether it was LF or something else.  I
have never seen this pairing broken, and I'm sure it would wreak havoc on
many programs if it was changed.

Gary R. Van Sickle wrote:
} And while we're at it, is JPG the right way, or is PNG the right way? To
}people, text is text.  Why should it not be the same for a 300MHz Pentium
}II.  Or your SparcStation.  Or the Mac.

> Be real. If text is text, then why can't images just be images? You are
> comparing different end-of-line schemes with different formats of image
> encoding. I'm sure you can find imaging tools on DOS, Mac and Unix that
> do not support each other's formats of encoding.

That is his point, that he is comparing different formats of text encoding
with different formats of image encoding.

While I do think that Gary's view of a text file manipulation library is a
good one, it is not the path that POSIX/ANSI has taken.  I'm actually not
sure who originally invented the idea of adding "b" to fopen calls to select
binary files (and leave the default as a text mode open), but everybody
seems to have blessed it.

There might be more mileage in working up a word processing abstract format,
similar to BFD, but for text processing documents.  Of course, since I have
spent some time using BFD, maybe I will retract this suggestion...

marcus hall

* RE: Why text=binary mounts
  1998-01-09 18:22 Gary R. Van Sickle
@ 1998-01-12 20:11 ` Guy Gascoigne - Piggford
  0 siblings, 0 replies; 33+ messages in thread
From: Guy Gascoigne - Piggford @ 1998-01-12 20:11 UTC (permalink / raw)
  To: Gary R. Van Sickle, gnu-win32

At 10:45 AM 1/9/98 -0600, Gary R. Van Sickle wrote:
>Here is one unrelated issue that I would be willing to spend some time on: 
> What is with all the K&R-style functions I see in the GNU C library?  Is 
>there a reason for them not to be ANSI?  I can't believe there are a 
>significant number of non-ANSI compilers still in use for development, 
>maybe I'm wrong...

Actually, this is an area where much of the UNIX world is lagging behind
PCs, K&R is definitely alive and kicking and living on UNIX.  ANSI is there
as well, but by no means dominant. (Now whether this is a side effect of
how much more longevity of code there has been on UNIX when compared with
the PC I don't know).

Guy


* RE: Why text=binary mounts
@ 1998-01-12 14:09 Gary R. Van Sickle
  1998-01-14  3:39 ` Richard Thomas
  1998-01-16  2:56 ` Benjamin Riefenstahl
  0 siblings, 2 replies; 33+ messages in thread
From: Gary R. Van Sickle @ 1998-01-12 14:09 UTC (permalink / raw)
  To: gnu-win32

Hi Benjamin,

I guess the point I was trying to make is that it doesn't seem to me that 
there is a good argument for there to be text processing functionality in 
the fopen() family of functions (I know, it's a little late now!).  The 
difference I see between 'modes' and 'formats' is this:  we don't have a 
JPG mode or a WAV mode in fopen(), so why do we have a text mode?  When 
somebody wants to open up and manipulate a JPEG file, they use a JPEG 
library that gives them access to methods that are meaningful only on JPEG 
files.  I see text files in the same way.  If you want to read a line of 
text, it seems to me that the most logical thing to do would be to use a 
library which gave you access to functions such as fscanf() etc. which have 
no meaning for generic (binary) files.  This library then would be the 
place to do things like making all text files look the same to the 
programmer whether they're DOS/UNIX/Mac/whatever, in the same way that a 
PCX library might 'gloss over' the differences between the different PCX 
versions.

'Doesn't work' may have been a little harsh.  As you point out, it's been 
working for a lot of people for a long time now.  It just seems to me that 
there's got to be a better way, and that the GNU world is the only place 
where it could actually be done.  Just think of it - millions of 
programmers everywhere feeding \r\n's, \r's, \n's, HTML, Unicode, etc., 
into gcc's running on Linux, NT, Rhapsody, and yes, even DOS, and having no 
trouble in any configuration.  Sounds like a magical dream world.  Sorry, 
I'm getting carried away.

You've got me thinking, though.  How about this:  we say, "we're stuck with 
text mode, we'll do better next time, but to alleviate the current 
situation, we take the TAL and put it in the fopen() functions, so that 
when somebody does a fopen(???, "rt"), they are getting the TAL sitting on 
top of a binary stream"?  That way all the old programs which are 
ANSI-compliant (we've canned those that aren't) get the cool new 'any text 
file' functionality *automatically*.  Plus, since the TAL is portable, any 
new GNUWin's or DJGPP's or whatever also get it automatically.  This looks 
to me like it solves the problem in the way you were describing, but 
doesn't require the addition of another mode.

Gary R. Van Sickle (tiberius@braemarinc.com)
Electrical Design Engineer
Braemar Inc.
11481 Rupp Dr.
Burnsville, MN 55337
(612) 890-5135 Ext. 144
Fax: (612) 882-6550


-----Original Message-----
From:	Benjamin Riefenstahl [SMTP:benny@crocodial.de]
Sent:	Monday, January 12, 1998 3:54 AM
To:	gnu-win32@cygnus.com
Subject:	Re: Why text=binary mounts

Hi Gary,


Benjamin Riefenstahl wrote:
[proposal to add new modes to fopen(), open(), iostream to cater for
Unix/DOS/Mac text files]

Gary R. Van Sickle wrote:
> The real problem here is that files as they exist on disk don't have
> 'modes', they have formats.  Adding 'modes' to a system that really
> doesn't work already will only make the situation worse.

?? Sounds nice, but what does that really mean ;-) (no offence
intended). As I see it "file format" vs "stream translation mode" is
just a matter of point of view. Folks may not be used to file streams
doing translations on Unix but on other systems it's a fact of life and
seen as a feature.

I'm also not sure what "doesn't work" means. Text mode works very well
in DOS with other compilers. Porting simple Unix tools dealing with
binary files usually amounts to putting in the O_BINARY or "b" flags and
that's that. It's an easy patch (bug fix actually) and no problem at all
as far as I can see, except that text mode does not cover Mac files.
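
For readers who have not done such a port, the patch usually looks something
like the sketch below. It is only an illustration; the function name
copy_binary is made up, and the only thing that matters is the "b" in the
two mode strings.

```c
#include <stdio.h>

/* Copies src to dst byte for byte.  The "b" in both mode strings asks the
 * C library not to translate line ends or treat Ctrl-Z as end of file.
 * On Unix the flag has no effect, so the same source works in both worlds.
 * Returns 0 on success, -1 on any error. */
int copy_binary(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    FILE *out;
    char buf[4096];
    size_t n;
    int status = 0;

    if (in == NULL)
        return -1;
    out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;                /* short write */
            break;
        }
    }
    if (ferror(in))
        status = -1;
    fclose(in);
    if (fclose(out) != 0)
        status = -1;                    /* flush failed */
    return status;
}
```

With plain "r" and "w" the same code would still be correct on Unix, but
on a DOS-style text-mode RTL it could alter CR LF pairs or stop at a Ctrl-Z
byte.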

> What I think is really needed is a Text Access Library (TAL) that sits
> *on top* of a *binary* stdio file and reads and writes lines from UNIX,
> DOS, Mac, maybe HTML, etc., etc., text files.  Instead of fopen(???, "rt"),
> you'd use the library and then *not care* what the text file format is,
> only that it contains lines of text.  This TAL would become part of the
> standard C library (or the GNU library at least, which would make it a
> de-facto standard), all the tools that were dealing with text would use
> it, and eventually the "t" functionality of stdio would be deprecated and
> the problem would be solved.

There is already at least one library framework that can do this and
more, I'm thinking of I/O Streams. With I/O Streams I would just have to
define my special UniversalTextInputFilterStream, plug it into the
program where it used to declare its text input streams, and I would be
done. Yeah sure, it's C++ only but for a new design that would not
bother me.

The problem with both approaches (I/O Streams as well as your TAL) is
that they don't cover existing tools written in C on the basis of
fopen() or even open(). There would be a significant amount of work to be
done to port these. And that was where I thought a simple re-definition
of the existing library feature "text mode" (which I otherwise accept as
given) would help a lot to simplify this work.

BTW I'm not dismissing the idea of TAL in general. In theory I would be
interested in an extensible stream/filter library for C as a GNU project
or even in the C RTL. In practice though I program in C++ and already
have what I need there.


so long, benny
======================================
Benjamin Riefenstahl (benny@crocodial.de)
Crocodial Communications EntwicklungsGmbH
Ophagen 16a, D-20257 Hamburg, Germany
-
For help on using this list (especially unsubscribing), send a message to
"gnu-win32-request@cygnus.com" with one line of text: "help".


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Why text=binary mounts
@ 1998-01-12 10:42 marcus
  1998-01-13 10:53 ` Benjamin Riefenstahl
  0 siblings, 1 reply; 33+ messages in thread
From: marcus @ 1998-01-12 10:42 UTC (permalink / raw)
  To: gnu-win32

Tomas Fasth <tomas.fasth@twinspot.net> writes:
> Microsoft Windows dominates as a desktop platform. This does not
> necessarily mean that it's a preferred platform in general for program
> development. Note that I said 'preferred'. A great majority of the
> programmers today have to make their living developing for MS Windows.
> But that does not mean that all these programmers are putting their vote
> on it as the preferred environment for program development. As an
> example, the steady growth of installed Linux and FreeBSD systems says
> otherwise.

It is an unfortunate fact of life now that Microsoft software controls 85%
of the computers in the world.  That's pretty dominant, and unfortunately,
it will likely have a strong hold on that market share for quite some time
to come.  Unix may have had a chance back in the late 80s when there was
a big attempt to unify the Unix market to combat Microsoft, but alas the
effort was doomed to failure by the dark forces of entropy.  Since then, the
world has been being taken over by Microsoft...

If you are writing software that you want to sell to the largest group of
users, you have to address it to Microsoft.  Sure, the remaining 15% of the
market can buy some product, but not all that much, plus that 15% is split
between all of the non-MS world, so only a very small portion is actually
Unix, MacOS, etc.  So, if you want to appeal to the largest user base, you
really have to work in the MS world.  That's a pretty good reason to "prefer"
MS.  Not that it's the nicest environment to write code in, but because it
gives the better return on your programming investment.

> ... I'm talking about Mac ('\r'), Unix ('\n') and DOS ('\r\n'),
> where the Mac choice is by far most hostile to the programmer community,
> since it seems to have been made most recently...

I guess it is possible...  The MacOS CR delimiter certainly dates back to
the beginnings of AppleDOS on the Apple ][, around 1976 or so, I think.  The
CR LF of DOS comes by way of CP/M, which came from RT-11 (and other DEC PDP-11
OSs), so it came from the late 60s.  Unix actually seems to be the odd-ball
by using LF as the delimiter.  How many times have you gotten into raw mode,
where the return key no longer terminates your input lines?  Using CR as
the delimiter seems most intuitive, since that's actually what you usually
type to terminate a line.  CR LF makes sense because that's actually what
you send to a TTY or CRT terminal to go to the next line.  There were some
chain printers that used to use LF to move to the start of the next line, but
this was hardly common.

BTW, I think that it is Unix-centric to think of CR as \r and LF as \n.  I
always thought that the original intent was that \n would represent whatever
the local newline character was, whether it was LF or something else. I have
never seen this pairing broken, and I'm sure it would wreak havoc on many
programs if it were changed.

Gary R. Van Sickle wrote:
} And while we're at it, is JPG the right way, or is PNG the right way? To
}people, text is text.  Why should it not be the same for a 300MHz Pentium
}II.  Or your SparcStation.  Or the Mac.

> Be real. If text is text, then why can't images just be images? You are
> comparing different end-of-line schemes with different formats of image
> encoding. I'm sure you can find imaging tools on DOS, Mac and Unix that
> do not support each other's formats of encoding.

That is his point, that he is comparing different formats of text encoding
with different formats of image encoding.

While I do think that Gary's view of a text file manipulation library is a
good one, it is not the path that POSIX/ANSI has taken.  I'm actually not sure
who originally invented the idea of adding "b" to fopen calls to select
binary files (and leave the default as a text mode open), but everybody seems
to have blessed it.

There might be more mileage in working up a word processing abstract format,
similar to BFD, but for text processing documents.  Of course, since I have
spent some time using BFD, maybe I will retract this suggestion...

marcus hall

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Why text=binary mounts
@ 1998-01-12  9:17 marcus
  0 siblings, 0 replies; 33+ messages in thread
From: marcus @ 1998-01-12  9:17 UTC (permalink / raw)
  To: gnu-win32

?? originally wrote:
}Maybe the reading operations could track what kind of
}line termination is being used on a file, then succeeding write operations
}could use the same style (unless overridden by the open() flags)?  That
}sounds weird...and probably unworkable...just food for thought.

"Larry Hall (RFK Partners Inc)" <lhall@rfk.com> writes:
> Not necessarily.  Various programs do this, including vim and, I think,
>NTEmacs.

So long as the program knows that it is a text file (or that it is going
to treat it as such), then it could just as easily pass some indication of
that to the {f}open() routine.  Given a reliable indication of text/binary
content of a file, cygwin32 could do a fine job of translating line endings
from the native system to the \n termination expectation of the program.

The problem, though, seems to be in getting all the programs to reliably
pass this information to the {f}open() call, so cygwin32 does not know
if the file is binary or not.

Trying to tell which of several likely line endings is used in a file is
not too difficult, once you know it is a text file.  Trying to tell if
it is a text file or a binary file, however, is not nearly so easy.
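
To illustrate the easy half of that, here is a minimal sketch that reports which line ending appears first in a buffer already known to hold text.  The names eol_style and detect_eol are invented for this example.  It says nothing about the hard half, deciding whether the data is text at all.

```c
#include <stddef.h>

/* Hypothetical names, used only for this sketch. */
enum eol_style { EOL_NONE, EOL_LF, EOL_CRLF, EOL_CR };

/* Report the style of the first line terminator found in the first len
 * bytes of data.  The data must have been read in binary mode so that no
 * translation has already taken place.  Limitation: a '\r' that is the
 * very last byte of the buffer is reported as EOL_CR even if the next
 * chunk of the file would have started with '\n'. */
enum eol_style detect_eol(const char *data, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (data[i] == '\n')
            return EOL_LF;              /* UNIX */
        if (data[i] == '\r') {
            if (i + 1 < len && data[i + 1] == '\n')
                return EOL_CRLF;        /* DOS */
            return EOL_CR;              /* Mac */
        }
    }
    return EOL_NONE;                    /* no terminator seen */
}
```

A file that mixes styles would need a count of each kind rather than the first hit, which this sketch deliberately leaves out.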

marcus hall

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Why text=binary mounts
  1998-01-09 18:22 Gary R. Van Sickle
@ 1998-01-12  2:57 ` Benjamin Riefenstahl
  0 siblings, 0 replies; 33+ messages in thread
From: Benjamin Riefenstahl @ 1998-01-12  2:57 UTC (permalink / raw)
  To: gnu-win32

Hi Gary,


Benjamin Riefenstahl wrote:
[proposal to add new modes to fopen(), open(), iostream to cater for
Unix/DOS/Mac text files]

Gary R. Van Sickle wrote:
> The real problem here is that files as they exist on disk don't have
> 'modes', they have formats.  Adding 'modes' to a system that really doesn't
> work already will only make the situation worse.

?? Sounds nice, but what does that really mean ;-) (no offence
intended). As I see it "file format" vs "stream translation mode" is
just a matter of point of view. Folks may not be used to file streams
doing translations on Unix but on other systems it's a fact of life and
seen as a feature.

I'm also not sure what "doesn't work" means. Text mode works very well
in DOS with other compilers. Porting simple Unix tools dealing with
binary files usually amounts to putting in the O_BINARY or "b" flags and
that's that. It's an easy patch (bug fix actually) and no problem at all
as far as I can see, except that text mode does not cover Mac files.
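
For anyone who has not done such a port, the patch usually looks like the sketch below.  The function count_bytes is a made-up example; the only point is the "b" in the mode string.  On Unix the "b" is accepted and has no effect, while on a runtime that translates in text mode it switches the translation off, so the same source behaves the same on both.

```c
#include <stdio.h>

/* Made-up example: count the bytes of a file exactly as they are stored
 * on disk.  With "r" instead of "rb", a DOS-style runtime would collapse
 * each "\r\n" pair into one character and the count would come out
 * smaller than the real file size.
 * Returns -1 if the file cannot be opened. */
long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");       /* the "b" is the whole patch */
    long n = 0;

    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```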

> What I think is really needed is a Text Access Library (TAL) that sits *on
> top* of a *binary* stdio file and reads and writes lines from UNIX, DOS,
> Mac, maybe HTML, etc., etc., text files.  Instead of fopen(???, "rt"),
> you'd use the library and then *not care* what the text file format is,
> only that it contains lines of text.  This TAL would become part of the
> standard C library (or the GNU library at least, which would make it a
> de-facto standard), all the tools that were dealing with text would use it,
> and eventually the "t" functionality of stdio would be deprecated and the
> problem would be solved.

There is already at least one library framework that can do this and
more, I'm thinking of I/O Streams. With I/O Streams I would just have to
define my special UniversalTextInputFilterStream, plug it into the
program where it used to declare its text input streams, and I would be
done. Yeah sure, it's C++ only but for a new design that would not
bother me.

The problem with both approaches (I/O Streams as well as your TAL) is
that they don't cover existing tools written in C on the basis of
fopen() or even open(). There would be a significant amount of work to be
done to port these. And that was where I thought a simple re-definition
of the existing library feature "text mode" (which I otherwise accept as
given) would help a lot to simplify this work.

BTW I'm not dismissing the idea of TAL in general. In theory I would be
interested in an extensible stream/filter library for C as a GNU project
or even in the C RTL. In practice though I program in C++ and already
have what I need there.


so long, benny
======================================
Benjamin Riefenstahl (benny@crocodial.de)
Crocodial Communications EntwicklungsGmbH
Ophagen 16a, D-20257 Hamburg, Germany

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Why text=binary mounts
  1998-01-09 18:22 Gary R. Van Sickle
@ 1998-01-11 10:01 ` Tomas Fasth
  0 siblings, 0 replies; 33+ messages in thread
From: Tomas Fasth @ 1998-01-11 10:01 UTC (permalink / raw)
  To: gnu-win32; +Cc: Gary R. Van Sickle


Gary R. Van Sickle wrote:
> I don't want to start yet another religious war, but the 'get a real OS'
> refrain sounds a lot like the same refrain we used to hear from the Mac
> users.  And the Apple II users.  And the C64 users.  And the Amiga users.

Gary, I admit 'get a real OS' seldom is a meaningful argument. I just
couldn't refrain myself from using it. One of my weak spots I guess :)

> Microsoft is winning (in fact has already won) because people hear
> 'get a real OS' and immediately turn off.

What's all this talk about win and lose?
I say, the real winner in the long run is the Internet community.
Windows will pass, the Internet will remain. (My very own prediction ;-)

Microsoft Windows dominates as a desktop platform. This does not
necessarily mean that it's a preferred platform in general for program
development. Note that I said 'preferred'. A great majority of the
programmers today have to make their living developing for MS Windows.
But that does not mean that all these programmers are putting their vote
on it as the preferred environment for program development. As an
example, the steady growth of installed Linux and FreeBSD systems says
otherwise.

There are already many fine tools available for MS Windows that try to
overcome some of the design flaws in that OS. The point is: don't blame
Unix for making it difficult to port Unix tools to such OSes. Unix is
simply a great platform for program development. I guess that's why so
many great tools for program development come from Unix. And that's why
this issue is an issue. Programmers stuck with the poorer DOS
environment have recognized that the grass is greener on the Unix side.
They want greener grass too! Give us those tools now, they say. But
wait, why are those tools so stupid when applied on our precious DOS
text files? Fix it, they say.

Your frustration about textual end-of-line differences can easily be
morphed into a stunning fascination with the fact that the designers of
three major OSes each chose a different combination out of four
possible when picking a representation of the end-of-line character
sequence. I'm talking about Mac ('\r'), Unix ('\n') and DOS ('\r\n'),
where the Mac choice is by far the most hostile to the programmer community,
since it seems to have been made most recently. I haven't heard yet of an
OS using ('\n\r'), but it sure is a valid sequence so I wouldn't be
surprised if it has been used somewhere.

I think a reason for why 'get a REAL OS' turns people off is because
people in general don't give a damn whether programmers are in heaven or
hell when trying to do their job.
Also, companies developing proprietary costware have never cared much
about portability. Hell, they even actively oppose portability,
compatibility and interoperability as a marketing tool for competitor
oppression. At least in the past. The big success of the Internet
hopefully has forced those companies to rethink.

> Furthermore if UNIX is a 'real' OS, why can't the UNIX tools accommodate
> more than one text file format, including my 'poor DOS text files'?

Because Unix also means Simple (note that I didn't say easy :-). And a
Unix end-of-line is simply just a single character. And for the same
reason that many DOS programs can't understand Unix end-of-lines, why
should Unix programmers have cared about DOS peculiarities? You can't
think of everything, can you? ;-)

> What did I miss here?  GnuWin32 does not include any sort of text
> processing library as far as I know.  It deals with the problem by putting in a
> layer *between* the fopen stuff and the OS calls AFAIK, when the solution I
> think you are proposing is a layer *over* the fopen stuff.

Oh, I think I meant 'back to gnuwin32, which this list is about', or
something.
Anyway, a possible one-for-all solution, at least for future
developments, could be a text processing layer over stdio. Too bad this
would not help much when porting existing software, would it?

> I am in no way thinking of breaking the 'UNIX way of computing'!  I just
> don't see why a compiler/make util/etc. shouldn't be able to take any text
> file you throw at it, regardless of which operating system it is running
> on.

You have a point there. But wishes do not solve problems. Someone has
to come up with a viable solution. Changing stdio behavior is not a
viable solution IMHO. We need something more like a holy grail for
porting text processing tools.

> No surprise here.  But please tell me why the UNIX way is the right way.

Note that I didn't really say that. But Unix sure has some qualities
that DOS doesn't have.
One may have the opinion that Unix desktops suck. And that's probably
what really is missing in Unix, a standard state-of-the-art graphical
desktop. But desktops are not about OS functionality, they are about
application services, which preferably should be layered on top of OS
services, as X-Windows has proved to work. Microsoft evidently has
very skillful user interface engineers. What they have done on top of
DOS can also be done on top of Unix. A good example of such an effort is
the K desktop environment. Looks very promising indeed.

>  And while we're at it, is JPG the right way, or is PNG the right way? To
> people, text is text.  Why should it not be the same for a 300MHz Pentium
> II.  Or your SparcStation.  Or the Mac.

Be real. If text is text, then why can't images just be images? You are
comparing different end-of-line schemes with different formats of image
encoding. I'm sure you can find imaging tools on DOS, Mac and Unix that
do not support each other's formats of encoding.

Nevertheless, I think most programmers share your frustration.

> I propose that I will write a 'text access library' with the following
> features:

An honorable proposal indeed. I say; have peace in your mind Gary and go
ahead! I for one surely will make use of such a promising set of tools.
You can bet it will be appreciated in the programming community.

Are you sure such a library does not already exist somewhere out there? It
could be worth looking before you begin ...

-- 
Tomas Fasth                     mailto:tomas.fasth@euronetics.com
EuroNetics Operation            http://euronetics.com
Mjärdevi Science Park           Office tel: +46 13 218 181
Teknikringen 1 E                Office fax: +46 13 218 182
58330 Linköping                 Mobile tel: +46 708 870 957
Sweden                          Mobile fax: +46 708 870 258

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Why text=binary mounts
  1998-01-10  6:19 Tony Pires
@ 1998-01-10 13:49 ` Christopher Faylor
  0 siblings, 0 replies; 33+ messages in thread
From: Christopher Faylor @ 1998-01-10 13:49 UTC (permalink / raw)
  To: gnu-win32

In article < 1.5.4.16.19980110041337.3a17522c@fox.nstn.ca >,
Tony Pires  <t_pires@fox.nstn.ca> wrote:
>I would like to add my voice to the growing chorus.  Native mode text files
>is the only choice that makes sense.  I get the impression that the gurus
>hate to convert their text files before starting to do their magic.

It is not that we hate converting text files.  It is really that many of
us have an unaccountable fear of the \r character.

Brrrr...  I'm getting chills just typing it.
-- 
http://www.bbc.com/	cgf@bbc.com			"Strange how unreal
VMS=>UNIX Solutions	Boston Business Computing	 the real can be."

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Why text=binary mounts
@ 1998-01-10  6:19 Tony Pires
  1998-01-10 13:49 ` Christopher Faylor
  0 siblings, 1 reply; 33+ messages in thread
From: Tony Pires @ 1998-01-10  6:19 UTC (permalink / raw)
  To: gnu-win32

I would like to add my voice to the growing chorus.  Native mode text files
is the only choice that makes sense.  I get the impression that the gurus
hate to convert their text files before starting to do their magic.

At 09:45 PM 1/7/98 -0700, Jeffrey C. Fried wrote:

>Porting code from Unix to the PC should NOT require the same line
>termination mode since most Unix code which reads text uses fread/getc
>which automatically handle the end-of-line. 
snip
> And from the replies of most
> people i would argue that most of us would prefer to work in the native
> mode of the operating system in which we are running rather than having to
> constantly convert files between the two models simply because we use tools
> from both operating systems under NT/95.

Tony Pires,  Engineering Systems Supervisor
Kvaerner Chemetics Inc. Ltd, Vancouver, B.C.
Email: t_pires@fox.nstn.ca or pires.tony@traf.com   Fax: 604-737-0340


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: Why text=binary mounts
@ 1998-01-09 18:22 Gary R. Van Sickle
  1998-01-11 10:01 ` Tomas Fasth
  0 siblings, 1 reply; 33+ messages in thread
From: Gary R. Van Sickle @ 1998-01-09 18:22 UTC (permalink / raw)
  To: gnu-win32

Let me start by saying this:  It looks to me we agree almost 100%.  But let 
me address the 'almost':

>> What's wrong with this solution: Do it the Unix way. A file is a file is
a file. Textual end-of-line is not a business of the i/o subsystem. In
Unix the character sequence for end-of-line ('\n' == 012 == 0x0A ==
0b00001010)
is nothing more exciting than a mutual agreement between tools that want
to share text information. How simple!

I agree.  Let's do away with the "b" and the "t" entirely.  All files 
opened by fopen are binary - what you put in is what you get out.

>> Maybe you're only interested in some groovy tools to filter your poor
DOS text files? My advice: get native ports of those tools. There is no
port? Sad, but you might have to live with that. If you can't, do the
port yourself or switch to a REAL OS (hint: ends with nix :-)

I don't want to start yet another religious war, but the 'get a real OS' 
refrain sounds a lot like the same refrain we used to hear from the Mac 
users.  And the Apple II users.  And the C64 users.  And the Amiga users. 
 (NOTE:  I am not comparing UNIX to any of these systems).  Microsoft is 
winning (in fact has already won) because people hear 'get a real OS' and 
immediately turn off.  People love the underdog, and 'get a real OS' just 
helps the underdog image, regardless of the fact that Microsoft is in no 
real way an underdog.  Furthermore if UNIX is a 'real' OS, why can't the 
UNIX tools accommodate more than one text file format, including my 'poor 
DOS text files'?  PaintShop Pro can manipulate probably 50 different image 
file formats, from PCX for God's sake to PNG, and the user hardly knows the 
difference.

>> So, a text processing library is the exactly right place for fixes of os
design flaws and differences such as the use of end-of-line sequence
taking twice as much space as necessary.

Voila! We're back to where we started. GnuWin32.

What did I miss here?  GnuWin32 does not include any sort of text processing 
sing library as far as I know.  It deals with the problem by putting in a 
layer *between* the fopen stuff and the OS calls AFAIK, when the solution I 
think you are proposing is a layer *over* the fopen stuff.

>> Please, please, please. Do what you like, but do NOT try to break the
Unix way of computing in the GnuWin32 distribution. If you do, then
what's the point of the whole project?

I am in no way thinking of breaking the 'UNIX way of computing'!  I just 
don't see why a compiler/make util/etc. shouldn't be able to take any text 
file you throw at it, regardless of which operating system it is running 
on.

>> It may come as a surprise for you that the DOS way is not the right way.

No surprise here.  But please tell me why the UNIX way is the right way. 
 And while we're at it, is JPG the right way, or is PNG the right way? To 
people, text is text.  Why should it not be the same for a 300MHz Pentium 
II.  Or your SparcStation.  Or the Mac.

To badly paraphrase somebody, "The only requirement for evil to succeed is 
for good people to do nothing."  So, gosh darn it, I'm going to try to do 
something.  I propose that I will write a 'text access library' with the 
following features:

1. Written in portable ANSI C (no K&R compilers need apply)
2. Provides all the fscanf, fprintf, etc. (i.e. line reading and writing) 
functionality for text-containing files only
3. Provides some extended, cool features TBD
4. Reads and writes at least UNIX & DOS, maybe HTML, etc. formats later
5. Operates kind of in this wise:  Opens any supported format and they 
behave the same (i.e. 'read line', 'read next char', all retrieve the same 
text regardless of format),  writes
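
The write side of such a library could be as small as the sketch below.  The names tal_eol and tal_write_line are placeholders made up for this illustration, not a committed interface, and the stream is assumed to have been opened in binary mode ("wb") so that stdio adds no translation of its own.

```c
#include <stdio.h>

/* Placeholder names for this sketch only. */
enum tal_eol { TAL_EOL_UNIX, TAL_EOL_DOS, TAL_EOL_MAC };

/* Write one line of text followed by the terminator of the requested
 * format.  The line itself must not contain a terminator.
 * Returns 0 on success, -1 on a write error. */
int tal_write_line(FILE *fp, const char *line, enum tal_eol style)
{
    const char *eol = "\n";             /* UNIX is the default */

    if (style == TAL_EOL_DOS)
        eol = "\r\n";
    else if (style == TAL_EOL_MAC)
        eol = "\r";

    if (fputs(line, fp) == EOF || fputs(eol, fp) == EOF)
        return -1;
    return 0;
}
```

The calling program only ever deals in 'lines'; which bytes end up on disk is decided in one place.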

I obviously don't want to do this if nobody will use it, so would/could 
something like this get included in the GNU distribution?  Would somebody 
commit to porting the GNU tools to use it?  Could we get people to actually 
start to deprecate the use of the stdio text-file processing facilities and 
use this library?

Gary R. Van Sickle (tiberius@braemarinc.com)
Electrical Design Engineer
Braemar Inc.
11481 Rupp Dr.
Burnsville, MN 55337
(612) 890-5135 Ext. 144
Fax: (612) 882-6550


-----Original Message-----
From:	Tomas Fasth [SMTP:tomas.fasth@twinspot.net]
Sent:	Friday, January 09, 1998 1:50 PM
To:	gnu-win32@cygnus.com
Cc:	Gary R. Van Sickle
Subject:	Re: Why text=binary mounts

Gary R. Van Sickle wrote:
>
> This whole UNIX/DOS/text/binary situation drives me nuts.  Why can't this
> problem be solved once and for all by everybody for all time?  We are
> talking about one '\r', for crissake.  What's wrong with this solution:

Gary,

Your solution is wrong because it promotes an out-dated file system
concept originated from digital stoneage operating systems. Back then
fopen("t") meant character (byte) oriented i/o, while fopen("b") meant
block oriented i/o. Back then the choice had an effect on i/o
performance. Now? Heck, modern i/o subsystems are _so_ much more
efficient and clever. Also, memory (good for i/o buffering among other
things) is nowadays so much cheaper and virtual.

What's wrong with this solution: Do it the Unix way. A file is a file is
a file. Textual end-of-line is not a business of the i/o subsystem. In
Unix the character sequence for end-of-line ('\n' == 012 == 0x0A ==
0b00001010)
is nothing more exciting than a mutual agreement between tools that want
to share text information. How simple!

Ultimately, as a programmer you might want to use a library to share
commonly used text processing. As a bonus the details of certain strange
text processing characteristics (like what sequence of characters to
represent end-of-line) can be hidden from the programmer. Good!

So, a text processing library is the exactly right place for fixes of os
design flaws and differences such as the use of end-of-line sequence
taking twice as much space as necessary.

Voila! We're back to where we started. GnuWin32.

Please, please, please. Do what you like, but do NOT try to break the
Unix way of computing in the GnuWin32 distribution. If you do, then
what's the point of the whole project?

Maybe you're only interested in some groovy tools to filter your poor
DOS text files? My advice: get native ports of those tools. There is no
port? Sad, but you might have to live with that. If you can't, do the
port yourself or switch to a REAL OS (hint: ends with nix :-)

> Let me address one sure-to-come-up complaint right now: the notion that it
> would be too much work to 'fix' all the existing code.  How much time and
> effort is wasted on 'working around' the current situation?  Certainly more
> time than it would take to search-and-replace "w" with "wt", etc.

Oh no. Not in this universe. For reasons too many to list.
It may come as a surprise for you that the DOS way is not the right way.
Life can be cruel sometimes...

--
Tomas Fasth                     mailto:tomas.fasth@euronetics.com
EuroNetics Operation            http://euronetics.com
Mjardevi Science Park           Office tel: +46 13 218 181
Teknikringen 1 E                Office fax: +46 13 218 182
58330 Linkoping                 Mobile tel: +46 708 870 957
Sweden                          Mobile fax: +46 708 870 258


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: Why text=binary mounts
@ 1998-01-09 18:22 Gary R. Van Sickle
  1998-01-12 20:11 ` Guy Gascoigne - Piggford
  0 siblings, 1 reply; 33+ messages in thread
From: Gary R. Van Sickle @ 1998-01-09 18:22 UTC (permalink / raw)
  To: gnu-win32

Touche.  Let me clarify one point though: I had no intention to "admonish" 
anyone on this list or anybody else for that matter.  My post was simply an 
expression of the frustration I feel when simple, nagging problems go 
unsolved.

Here is one unrelated issue that I would be willing to spend some time on: 
 What is with all the K&R-style functions I see in the GNU C library?  Is 
there a reason for them not to be ANSI?  I can't believe there are a 
significant number of non-ANSI compilers still in use for development, 
maybe I'm wrong...

Gary R. Van Sickle (tiberius@braemarinc.com)
Electrical Design Engineer
Braemar Inc.
11481 Rupp Dr.
Burnsville, MN 55337
(612) 890-5135 Ext. 144
Fax: (612) 882-6550


-----Original Message-----
From:	Larry Hall (RFK Partners Inc) [SMTP:lhall@rfk.com]
Sent:	Friday, January 09, 1998 10:12 AM
To:	Gary R. Van Sickle; gnu-win32@cygnus.com
Subject:	RE: Why text=binary mounts

At 07:15 PM 1/8/98 -0600, Gary R. Van Sickle wrote:
>Let me address one sure-to-come-up complaint right now: the notion that it
>would be too much work to 'fix' all the existing code.  How much time and
>effort is wasted on 'working around' the current situation?  Certainly more
>time than it would take to search-and-replace "w" with "wt", etc.

OK, you've come up with yet ANOTHER solution to this problem.  You wouldn't
be the first or probably the last.  So what are YOU going to do about it?
You raise the issue that the "fix" is generally "dismissed" as a result of
being too much "work".  From my perspective, the people with this view
probably agree with the notion that the "fix" must occur but they have found
work-arounds for their particular environments which are suitable and don't
require much personal time investment.  Those people who have not and need
something else complain loudly but seem just as unwilling to take on the
herculean task.  For the moment, this is still largely a GNU-centric project
in spirit, without a large, committed development team behind it.  That said,
let's all acknowledge that while the users of this software may agree in
general that a particular course of action may be beneficial, unless people
VOLUNTEER to undertake the task, changes are NOT going to happen quickly.  I
personally don't feel like I've invested any large amount of time to work
around my text/binary issues, although I don't have many.  Certainly it
doesn't add up to anywhere near the time investment that would be necessary
for me to go into even some of the source of these tools and make changes to
alleviate the difficulties.  Certainly one could argue that collectively all
users have spent some significant time working out these issues for themselves
and that maybe if that time was spent, collectively and in an organized
fashion, fixing the tools, we'd all benefit.  However, as I said, unless
someone organizes a volunteer effort, things aren't going to change quickly.
So, if someone wants to pick up and organize that effort, great.  MAYBE I'd
even be willing to help.  However, I'm not certain that having all sorts
of individuals posting to this list with general algorithms for fixing the
problem is useful, unless the one who posts actually might entertain the
thought of making the changes, organizing a group to do so, or maybe even
somehow sponsoring Cygnus to do it for them.  I'm not trying to downplay the
issue here and I certainly don't want to discourage people from discussing
issues and solutions.  But this particular issue comes up frequently and
always ends up being debated in a vacuum.  Admonishing others on this list or
the list in general for not fixing problems one finds intolerable in the
current software is unfair.  It's not productive.  Perhaps we can find a
different approach in regard to dealing with the text/binary issue and
others like it?  I can see some overall benefit from this.


Larry Hall                              lhall@rfk.com
RFK Partners, Inc.                      (781) 239-1053
8 Grove Street                          (781) 239-1655 - FAX
Wellesley, MA  02181

-
For help on using this list (especially unsubscribing), send a message to
"gnu-win32-request@cygnus.com" with one line of text: "help".


* RE: Why text=binary mounts
@ 1998-01-09 18:22 Gary R. Van Sickle
  1998-01-12  2:57 ` Benjamin Riefenstahl
  0 siblings, 1 reply; 33+ messages in thread
From: Gary R. Van Sickle @ 1998-01-09 18:22 UTC (permalink / raw)
  To: gnu-win32

> I'm not sure how to do it though. One could just change the text mode.
> That would be o.k. for me but I'm not sure everybody would be happy with
> that. Another thought would be to invent another mode like "extended
> text mode" e.g. with an fopen() specifier "T", an open() flag O_ETEXT
> and an iostream mode ios::etext that could implement this. That way one
> could port tools to this mode by simply adding the flags just like you
> port binary tools now by adding O_BINARY, "b" or ios::bin.
>
> Does anybody else have an opinion on that problem?

The real problem here is that files as they exist on disk don't have 
'modes', they have formats.  Adding 'modes' to a system that really doesn't 
work already will only make the situation worse.

What I think is really needed is a Text Access Library (TAL) that sits *on 
top* of a *binary* stdio file and reads and writes lines from UNIX, DOS, 
Mac, maybe HTML, etc., etc., text files.  Instead of fopen(???, "rt"), 
you'd use the library and then *not care* what the text file format is, 
only that it contains lines of text.  This TAL would become part of the 
standard C library (or the GNU library at least, which would make it a 
de-facto standard), all the tools that were dealing with text would use it, 
and eventually the "t" functionality of stdio would be deprecated and the 
problem would be solved.

I volunteer to write this library if someone else volunteers to get the GNU 
tools to use it.  I propose the following features:

1. Written in portable ANSI C (no K&R compilers need apply)
2. Provides all the fscanf, fprintf, etc. (i.e. line reading and writing) 
functionality for text-containing files only
3. Provides some extended, cool features TBD
4. Reads and writes at least UNIX, DOS, and Mac, with maybe HTML, etc. 
formats coming later
5. Operates roughly like this: any supported format can be opened for
reading and they all behave the same (i.e. 'read line', 'read next char',
all retrieve the same text regardless of format); it writes in the format
selected by the programmer (i.e. the fopen equivalent would require a
format specifier if a file is opened for write)
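
To make the proposal concrete, here is a minimal sketch in C of what the core of such a library might look like. No TAL exists in this thread, so the names tal_format, tal_getline and tal_putline are purely hypothetical. It assumes the stream was opened in binary mode ("rb" or "wb") and ignores the ctrl-z end-of-file convention.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical output formats; the names are illustrative only. */
typedef enum { TAL_UNIX, TAL_DOS, TAL_MAC } tal_format;

/*
 * Read one line from a stream that was opened in binary mode.
 * LF, CR LF and a lone CR all end a line; the terminator is not stored.
 * Returns the line length, or -1 at end of file when nothing was read.
 * Lines longer than cap - 1 bytes are returned in several pieces.
 */
long tal_getline(FILE *fp, char *buf, size_t cap)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && buf != NULL);
    if (cap < 2)
        return -1;
    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            break;                      /* Unix line end */
        if (c == '\r') {
            int next = getc(fp);        /* DOS if LF follows, else Mac */
            if (next != '\n' && next != EOF)
                ungetc(next, fp);
            break;
        }
        if (n + 1 >= cap) {             /* buffer full: keep c for later */
            ungetc(c, fp);
            break;
        }
        buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return (c == EOF && n == 0) ? -1 : (long)n;
}

/*
 * Write one line followed by the terminator of the chosen format.
 * The stream must be opened in binary mode so nothing is translated twice.
 * Returns 0 on success, -1 on a write error.
 */
int tal_putline(FILE *fp, const char *line, tal_format fmt)
{
    static const char *const eol[] = { "\n", "\r\n", "\r" };

    assert(fp != NULL && line != NULL);
    assert(fmt == TAL_UNIX || fmt == TAL_DOS || fmt == TAL_MAC);
    if (fputs(line, fp) == EOF)
        return -1;
    return fputs(eol[fmt], fp) == EOF ? -1 : 0;
}
```

A caller would open the file with fopen(name, "rb"), loop on tal_getline() until it returns -1, and never need to know which of the three formats the file was in. Only the writer has to name a format, as feature 5 suggests.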

Gary R. Van Sickle (tiberius@braemarinc.com)
Electrical Design Engineer
Braemar Inc.
11481 Rupp Dr.
Burnsville, MN 55337
(612) 890-5135 Ext. 144
Fax: (612) 882-6550



* Re: Why text=binary mounts
  1998-01-08  8:31 marcus
@ 1998-01-09  5:51 ` Benjamin Riefenstahl
  0 siblings, 0 replies; 33+ messages in thread
From: Benjamin Riefenstahl @ 1998-01-09  5:51 UTC (permalink / raw)
  To: gnu-win32

Hi All,


I'm new here so please forgive me if I'm missing something. I also do not
yet have a lot of experience with gnu-win32. I do have some experience with
porting C and C++ and with the rules of these languages and how they
affect porting. So this post that I'm replying to got my attention.


marcus@bighorn.dr.lucent.com wrote:
> This is true as long as you are considering text files only.  The problem
> comes in when you also want to deal with binary files.  On Unix systems,
> of course, there is no difference in operations on either, so most Unix
> programs open all files using the same open() or fopen() calls.  On systems
> that differentiate between these files, it is important to add O_BINARY or
> O_TEXT to the second argument of open(), and "b" for binary files to the
> second argument of fopen().  This tells the underlying routines whether to
> apply any translation to the file.

So far I agree.

> If nothing is specified, the OS must
> choose whether or not to make translations, and that is where the text=/!=
> binary mounting comes in, as this specifies the default mode.

No. At least for fopen() there is no choice. If you don't specify "b"
you get text mode and that's that. An application that opens a binary
file without the "b" has a bug. I don't think that fiddling with this
(like "binary" mounts) actually helps. Fix the buggy source code
instead; that seems to me bound to be *much* more efficient in terms
of developer and user time spent on the problem. BTW on DOS-like systems
(DOS, Windows, OS/2) the RTL does the translation, not the OS. The OS
just sets the guidelines how text should be represented and of course
the OS tools enforce these guidelines.
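
As a minimal sketch of the fix being argued for here: a program that needs the exact bytes of a file asks for them with "b". The helper name count_bytes is hypothetical and only serves the illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Count the bytes of a file exactly as they are stored on disk.
 * The "b" matters on DOS-like runtimes, where a stream opened without it
 * is a text stream and CR LF pairs are translated by the library.
 * On Unix the "b" is accepted and has no effect.
 * Returns -1 if the file cannot be opened.
 */
long count_bytes(const char *path)
{
    FILE *fp;
    long n = 0;

    assert(path != NULL);
    fp = fopen(path, "rb");     /* "r" alone would request text mode */
    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

With "r" instead of "rb", the count on a DOS-like runtime would come out smaller for any file that happens to contain CR LF pairs, which is exactly the kind of bug described above.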

> Now, there are some difficulties in this implementation.  First, since there
> is no "t" that can be passed to fopen(), it is impossible to tell if a call
> to fopen() wants a text mode open, or the default (blame POSIX/ANSI for that,
> I guess).  

See above. The default is unambiguously specified as text mode by the ISO
C language standard.

> ... However, if there exist Unix programs that call fopen() without
> the "b" for binary files (since it isn't needed on Unix and was added to the
> standard much later than the program may have been written), then these
> programs won't run correctly without some additional porting effort.

I'd prefer to invest a little time in porting the code instead of
investing a lot of time in users tweaking their system.

> The
> same goes for programs that call open() without the O_BINARY bit set in the
> second argument when opening binary files.

Being that open() is a Unix call and Unix doesn't have the distinction
between text and binary, it can be argued that the rules for Unix
compatibility libraries can be made whatever one wants. It has been
common practice though - and with good reason - to go by the same rules
as C and C++ go with fopen() and iostreams: The default is text mode and
you need the extra O_BINARY flag to get binary mode. It is done this
way in all compilers that I know.
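
A common porting idiom for this, sketched under the assumption of POSIX-style headers as Cygwin provides them: O_BINARY is not part of POSIX, so code that must also build on plain Unix defines it as 0 when it is missing. The helper name read_binary is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* O_BINARY is an extension of DOS-like systems; on Unix make it a no-op. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/*
 * Read up to cap bytes of a file without any line-end translation.
 * Returns the number of bytes read, or -1 if the file cannot be opened
 * or a read fails.
 */
long read_binary(const char *path, void *buf, size_t cap)
{
    long total = 0;
    ssize_t got = 0;
    int fd;

    assert(path != NULL && buf != NULL);
    fd = open(path, O_RDONLY | O_BINARY);
    if (fd < 0)
        return -1;
    while ((size_t)total < cap &&
           (got = read(fd, (char *)buf + total, cap - (size_t)total)) > 0)
        total += (long)got;
    close(fd);
    return got < 0 ? -1 : total;
}
```

The same source then compiles unchanged on Unix, where the flag vanishes, and on a runtime that defaults open() to text mode, where the flag turns the translation off.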

> To compound this, there are times when it is extremely difficult to impossible
> to tell if a file should be opened as text or binary.  For instance, should
> TAR open the files that it is writing to an archive as binary or text files?
> How can it determine which to use?

Some applications have a design problem here. AFAIK most ports that are
designed for this allow the user to specify that all operations are to
be done in binary, which is what I prefer always. I can always convert
DOS text files to Unix text and back again. I can not convert a garbled
binary file back to its original form.

> Sure, it's fun to play with cygwin32, but to me it doesn't seem reasonable to
> try to develop it as a Linux replacement.  I think that if it is to be truly
> useful, cygwin32 must encourage interoperating with the native world that it
> exists in.  Part of that is running well in a text!=binary mounted world.
> Sure, that means that porting programs to Cygwin32 means that you have to
> install an awareness of binary vs. text files, and that does mean more work
> to port the programs, but it also produces more useful programs as well.

Here we agree again ;-)


Let me add another nit to the problem. I am actually using not only Unix
and DOS but also Mac files. This means another variation in line ends:
Unix uses <LF>, DOS uses <CR><LF> and Macs use <CR>. In my world these
are the prominent formats and most of my tools (editors, compilers and
other commercial tools) agree with that.

In DOS the translation for text mode is rather simple: on input all
<CR><LF> combinations are replaced by <LF> and on output all <LF> are
replaced with <CR><LF>. This means not only that DOS files are read
correctly but also that Unix files are automatically read correctly. The
coincidence is rather useful: in most simple tools one rarely ever needs
to translate explicitly from Unix to DOS, because most DOS tools get
along with Unix files just fine.
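
The two directions of that translation can be sketched on in-memory buffers as follows. This is only an illustration of the rule as stated above, not the code of any particular RTL; the function names crlf_to_lf and lf_to_crlf are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Input direction: copy src to dst, replacing each CR LF pair with LF.
 * A CR that is not followed by LF is copied unchanged.
 * dst needs room for len bytes. Returns the number of bytes written.
 */
size_t crlf_to_lf(const char *src, size_t len, char *dst)
{
    size_t i, out = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++) {
        if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
            continue;           /* drop the CR; the LF is copied next */
        dst[out++] = src[i];
    }
    return out;
}

/*
 * Output direction: copy src to dst, writing CR LF for every LF.
 * dst needs room for 2 * len bytes. Returns the number of bytes written.
 */
size_t lf_to_crlf(const char *src, size_t len, char *dst)
{
    size_t i, out = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++) {
        if (src[i] == '\n')
            dst[out++] = '\r';
        dst[out++] = src[i];
    }
    return out;
}
```

A Unix file passes through crlf_to_lf() unchanged, which is the coincidence mentioned above. Note also that "x<CR><LF>" and "x<LF>" both come out as "x<LF>", so the input direction cannot be undone; that is why a binary file pushed through it is garbled for good.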

For my own programs I often implement an extension to this behaviour.
Instead of treating only <LF> and <CR><LF> as line ends I also treat a
single <CR> the same. This means I lose the ability to use single <CR>s
for formatting, but then the only files thus formatted that I have are
those intended directly for a line printer. OTOH, as I said, I often
have Mac files and with this arrangement these are read correctly.

For my own programs this is done easily enough but when porting tools from
Unix it's a lot more difficult. Porting Unix tools to this mode would be
a lot easier if this behaviour could be somehow included in the RTL
itself (like ordinary text mode is now).

I'm not sure how to do it though. One could just change the text mode.
That would be o.k. for me but I'm not sure everybody would be happy with
that. Another thought would be to invent another mode like "extended
text mode" e.g. with an fopen() specifier "T", an open() flag O_ETEXT
and an iostream mode ios::etext that could implement this. That way one
could port tools to this mode by simply adding the flags just like you
port binary tools now by adding O_BINARY, "b" or ios::bin.

Does anybody else have an opinion on that problem?


so long, benny
======================================
Benjamin Riefenstahl (benny@crocodial.de)
Crocodial Communications EntwicklungsGmbH
Ophagen 16a, D-20257 Hamburg, Germany

* Re: Why text=binary mounts
@ 1998-01-08  8:31 marcus
  1998-01-09  5:51 ` Benjamin Riefenstahl
  0 siblings, 1 reply; 33+ messages in thread
From: marcus @ 1998-01-08  8:31 UTC (permalink / raw)
  To: gnu-win32

Jeff Fried writes:
> Porting code from Unix to the PC should NOT require the same line
> termination mode since most Unix code which reads text uses fread/getc
> which automatically handle the end-of-line.  And from the replies of most
> people i would argue that most of us would prefer to work in the native
> mode of the operating system in which we are running rather than having to
> constantly convert files between the two models simply because we use tools
> from both operating systems under NT/95.  For examples of this
> compatibility look at many of the GNU tools which handle text, the file
> handling will work under both operating systems without any change because
> they use text mode I/O which is platform independent once all files have
> been converted to the form of the native OS.

This is true as long as you are considering text files only.  The problem
comes in when you also want to deal with binary files.  On Unix systems,
of course, there is no difference in operations on either, so most Unix
programs open all files using the same open() or fopen() calls.  On systems
that differentiate between these files, it is important to add O_BINARY or
O_TEXT to the second argument of open(), and "b" for binary files to the
second argument of fopen().  This tells the underlying routines whether to
apply any translation to the file.  If nothing is specified, the OS must
choose whether or not to make translations, and that is where the text=/!=
binary mounting comes in, as this specifies the default mode.

Now, there are some difficulties in this implementation.  First, since there
is no "t" that can be passed to fopen(), it is impossible to tell if a call
to fopen() wants a text mode open, or the default (blame POSIX/ANSI for that,
I guess).  If you know that all programs have consciously made a choice about
things, there would not be any need for a default, so we could assume that
the fopen() without a "b" wants a text mode open and mount things as
text!=binary.  However, if there exist Unix programs that call fopen() without
the "b" for binary files (since it isn't needed on Unix and was added to the
standard much later than the program may have been written), then these
programs won't run correctly without some additional porting effort.  The
same goes for programs that call open() without the O_BINARY bit set in the
second argument when opening binary files.

To compound this, there are times when it is extremely difficult to impossible
to tell if a file should be opened as text or binary.  For instance, should
TAR open the files that it is writing to an archive as binary or text files?
How can it determine which to use?

So, to avoid these issues, many people on this list try to avoid using anything
from the Microsoft world (except for NT/95 itself) and use only cygwin32
programs with text=binary so that any file is just like any other file, just
like in Unix systems.  As a result, their text files are only marginally
exchangeable with other NT/95 users (or other NT/95 applications).  So, it
seems to me that this gives a slow, incomplete, and buggy (well, it is a Beta
release!) emulation of Unix with no advantages over Linux except that their
boss has declared that they must run NT (in true pointy-haired boss fashion).

Sure, it's fun to play with cygwin32, but to me it doesn't seem reasonable to
try to develop it as a Linux replacement.  I think that if it is to be truly
useful, cygwin32 must encourage interoperating with the native world that it
exists in.  Part of that is running well in a text!=binary mounted world.
Sure, that means that porting programs to Cygwin32 means that you have to
install an awareness of binary vs. text files, and that does mean more work
to port the programs, but it also produces more useful programs as well.

This discussion keeps coming up, which I believe supports my feeling that it
is a major issue with cygwin32.  I know that I ended the previous iteration
by just agreeing to disagree and said that I wouldn't say any more on it,
but I just wanted to give some support to this side in this iteration and
that'll be it (this time around, at least).

marcus hall


  Unfortunately, there is no "t" that
can be supplied to fopen() to fully disambiguate the three cases that may
occur, so we have the following situation:

end of thread, other threads:[~1998-01-19 10:32 UTC | newest]

Thread overview: 33+ messages
1998-01-08 17:20 Why text=binary mounts Gary R. Van Sickle
1998-01-09 13:40 ` Larry Hall (RFK Partners Inc)
1998-01-09 13:40 ` Tomas Fasth
1998-01-11 23:40   ` Fergus Henderson
1998-01-12  5:03     ` Tomas Fasth
1998-01-12  4:45       ` Fergus Henderson
1998-01-12 17:26   ` Guy Gascoigne - Piggford
  -- strict thread matches above, loose matches on Subject: below --
1998-01-15 15:16 Richard Thomas
1998-01-18 16:07 ` Steven R. Newcomb
1998-01-15 15:16 Gary R. Van Sickle
1998-01-17 23:47 ` Fergus Henderson
1998-01-15  7:26 Immanuel Litzroth
1998-01-13 10:42 Gary R. Van Sickle
1998-01-15  7:26 ` Peter Dalgaard BSA
1998-01-17 11:00 ` John A. Turner
1998-01-12 14:09 Gary R. Van Sickle
1998-01-14  3:39 ` Richard Thomas
1998-01-16  2:56   ` Jeffrey C. Fried
1998-01-19 10:32     ` Guy Gascoigne - Piggford
1998-01-16  2:56 ` Benjamin Riefenstahl
1998-01-12 10:42 marcus
1998-01-13 10:53 ` Benjamin Riefenstahl
1998-01-12  9:17 marcus
1998-01-10  6:19 Tony Pires
1998-01-10 13:49 ` Christopher Faylor
1998-01-09 18:22 Gary R. Van Sickle
1998-01-11 10:01 ` Tomas Fasth
1998-01-09 18:22 Gary R. Van Sickle
1998-01-12  2:57 ` Benjamin Riefenstahl
1998-01-09 18:22 Gary R. Van Sickle
1998-01-12 20:11 ` Guy Gascoigne - Piggford
1998-01-08  8:31 marcus
1998-01-09  5:51 ` Benjamin Riefenstahl
