public inbox for cygwin@cygwin.com
* filename with HASH
@ 2011-11-17  0:17 pen
  2011-11-17  0:29 ` pen
  0 siblings, 1 reply; 5+ messages in thread
From: pen @ 2011-11-17  0:17 UTC (permalink / raw)
  To: cygwin


I need help: lynx can't access an HTML file, even though other commands can
read it.

$ ls -l "test bay#, wwid" 
-rw-r--r--+ 1 test Users 50999 Nov 17 03:22 test bay#, wwid 
$ file "test bay#, wwid" 
test bay#, wwid: HTML document, ASCII text, with very long lines 
$ head -2 "test bay#, wwid" 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" 
> 
$ lynx -dump "test bay#, wwid" 

Can't Access `file://localhost/cygdrive/e/test%20bay#,%20wwid' 
Alert!: Unable to access document. 

lynx: Can't access startfile 

After some experimenting, I found that lynx can't open any file whose name
contains "#, ".
This is on Windows 7; I don't know whether lynx has problems with any other
special characters.
I can't even test for a valid filename in my script, since the shell commands
handle this file without trouble.

Appreciate your help.
-- 
View this message in context: http://old.nabble.com/filename-with-HASH-tp32858658p32858658.html
Sent from the Cygwin list mailing list archive at Nabble.com.


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: filename with HASH
  2011-11-17  0:17 filename with HASH pen
@ 2011-11-17  0:29 ` pen
  2011-11-17  9:57   ` Dave Korn
  0 siblings, 1 reply; 5+ messages in thread
From: pen @ 2011-11-17  0:29 UTC (permalink / raw)
  To: cygwin


A few more tests: it seems lynx doesn't like #

$ mv "test bay#, wwid" "test # abc"
$ lynx -dump "test # abc"

Can't Access `file://localhost/cygdrive/e/test%20#%20abc'
Alert!: Unable to access document.

lynx: Can't access startfile

$ mv  "test # abc" "test# a"
$ lynx -dump "test# a"

Looking up test
Making HTTP connection to test
Alert!: Unable to connect to remote host.  <<

lynx: Can't access startfile http://test/#%20a
$ head -1 "test# a"
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
>

So, just changing the filename gives completely different results.


* Re: filename with HASH
  2011-11-17  0:29 ` pen
@ 2011-11-17  9:57   ` Dave Korn
  2011-11-17 10:28     ` Corinna Vinschen
  2011-11-17 13:33     ` pen
  0 siblings, 2 replies; 5+ messages in thread
From: Dave Korn @ 2011-11-17  9:57 UTC (permalink / raw)
  To: cygwin

On 17/11/2011 00:29, pen wrote:
> Few more tests: seems lynx dont like #
> 
> $ mv "test bay#, wwid" "test # abc"
> $ lynx -dump "test # abc"
> 
> Can't Access `file://localhost/cygdrive/e/test%20#%20abc'
> Alert!: Unable to access document.
> 
> lynx: Can't access startfile
> 
> $ mv  "test # abc" "test# a"
> $ lynx -dump "test# a"
> 
> Looking up test
> Making HTTP connection to test
> Alert!: Unable to connect to remote host.  <<
> 
> lynx: Can't access startfile http://test/#%20a

  A "#" in a URL separates the address of the resource from the fragment, i.e.
the anchor within the page at which the display should start.  I think lynx is
applying the same syntax to local file URLs; for example, if you have a local
file "index.html", you can append any arbitrary # anchor to it:

> $ wget 'http://www.bbc.co.uk/'
> --2011-11-17 09:47:02--  http://www.bbc.co.uk/
> Resolving www.bbc.co.uk (www.bbc.co.uk)... 212.58.246.94
> Connecting to www.bbc.co.uk (www.bbc.co.uk)|212.58.246.94|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 135886 (133K) [text/html]
> Saving to: `index.html'
> 
> 100%[======================================>] 135,886      446K/s   in 0.3s
> 
> 2011-11-17 09:47:03 (446 KB/s) - `index.html' saved [135886/135886]
> 
> 
> $ ls index*
> index.html
> 
> $ lynx -dump "index.html#foobar"
> 
>    #[1]A to Z [2]BBC Help [3]Terms of Use
> 
>    [4]British Broadcasting Corporation BBC Home
               [ ... snip ... ]

  So I think it's a limitation of the URL format that it's ambiguous between a
filename with an actual # in it and a filename followed by "#" and an anchor,
and there's probably not much lynx can do about it.  Your best bet, if you
absolutely have to use lynx on files with hash signs in their names, would be
to use lynx's -stdin option and redirect the input from the file, so that lynx
doesn't ever see the filename at all:

> $ lynx -dump "index.html # abc"
> 
> Can't Access `file://localhost/tmp/lynx/index.html%20#%20abc'
> Alert!: Unable to access document.
> 
> lynx: Can't access startfile
> 
> $ lynx -stdin -dump < "index.html # abc"
> 
>    #[1]A to Z [2]BBC Help [3]Terms of Use
> 
>    [4]British Broadcasting Corporation BBC Home
               [ ... snip ... ]
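
  For use in a script, the redirect above can be wrapped in a small helper.
This is only a sketch: the function name dump_html is made up for
illustration, and it assumes a lynx build that supports -stdin, as in the
transcript above.

```shell
# dump_html FILE: render FILE as text without passing its name to lynx.
# (dump_html is a hypothetical name, not part of lynx.)
dump_html() {
    # lynx reads the document from stdin, so a "#" in the filename is never
    # interpreted as the start of a URL fragment.
    lynx -stdin -dump < "$1"
}

# Example: the call is the same for names with or without a hash sign.
# dump_html "index.html # abc"
```

  Because the shell opens the file, the function returns a non-zero status
if the file itself cannot be read, which a script can check directly.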


    cheers,
      DaveK



* Re: filename with HASH
  2011-11-17  9:57   ` Dave Korn
@ 2011-11-17 10:28     ` Corinna Vinschen
  2011-11-17 13:33     ` pen
  1 sibling, 0 replies; 5+ messages in thread
From: Corinna Vinschen @ 2011-11-17 10:28 UTC (permalink / raw)
  To: cygwin

On Nov 17 09:56, Dave Korn wrote:
> On 17/11/2011 00:29, pen wrote:
> > Few more tests: seems lynx dont like #
> > 
> > $ mv "test bay#, wwid" "test # abc"
> > $ lynx -dump "test # abc"
> > 
> > Can't Access `file://localhost/cygdrive/e/test%20#%20abc'
> > Alert!: Unable to access document.
> > 
> > lynx: Can't access startfile
> > 
> > $ mv  "test # abc" "test# a"
> > $ lynx -dump "test# a"
> > 
> > Looking up test
> > Making HTTP connection to test
> > Alert!: Unable to connect to remote host.  <<
> > 
> > lynx: Can't access startfile http://test/#%20a
> 
>   A "#" marks a separator in a URL between the URI part and the anchor within
> the page to load up the display at.  I think lynx is applying the same syntax
> to local file URLs; for example, if you have a local file "index.html", you
> can append any arbitrary # anchor to it:

Apart from that, all non-graphical characters in a URL are converted to
the %xx syntax.  Therefore, the URL "test # abc" is converted to
"file://[$PWD]/test%20#%20abc" *before* trying to find the file.  So
lynx tries to access the file "test%20#%20abc" in the current directory,
which obviously doesn't exist.
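
For comparison, a complete percent-encoding would also turn "#" into %23,
which the URL printed by lynx above does not do.  The following is a minimal
sketch in POSIX shell; urlencode_path is a hypothetical helper, not a lynx
feature, it assumes ASCII filenames, and whether lynx accepts the resulting
file: URL was not tested in this thread.

```shell
# urlencode_path PATH: percent-encode every character of PATH except
# unreserved characters and "/".  (Hypothetical helper, ASCII names only.)
urlencode_path() (
    # The function body runs in a subshell, so these variables stay local.
    LC_ALL=C
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        s=$rest
        case $c in
            [A-Za-z0-9._~/-]) out=$out$c ;;
            # "'X" makes printf use the numeric code of character X.
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
    done
    printf '%s\n' "$out"
)

urlencode_path "test # abc"    # prints test%20%23%20abc
```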


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: filename with HASH
  2011-11-17  9:57   ` Dave Korn
  2011-11-17 10:28     ` Corinna Vinschen
@ 2011-11-17 13:33     ` pen
  1 sibling, 0 replies; 5+ messages in thread
From: pen @ 2011-11-17 13:33 UTC (permalink / raw)
  To: cygwin


Thanks Dave. 
That was fantastic!!! Now my scripts will become much lighter.
=)


Dave Korn-9 wrote:
> 
> On 17/11/2011 00:29, pen wrote:
>> Few more tests: seems lynx dont like #
>> 
>> $ lynx -dump "index.html # abc"
>> 
>> Can't Access `file://localhost/tmp/lynx/index.html%20#%20abc'
>> Alert!: Unable to access document.
>> 
>> lynx: Can't access startfile
>> 
>> $ lynx -stdin -dump < "index.html # abc"
>> 
>>    #[1]A to Z [2]BBC Help [3]Terms of Use
>> 
>>    [4]British Broadcasting Corporation BBC Home
>                [ ... snip ... ]
> 
>     cheers,
>       DaveK
=)
