public inbox for cygwin@cygwin.com
* Re: wget not behaving correctly
       [not found] <999820851.27921.ezmlm@sources.redhat.com>
@ 2001-09-08  2:51 ` Hack Kampbjørn
  2001-09-08  8:47   ` Charles Wilson
  2001-09-08 13:42   ` C. Porter Bassett
  0 siblings, 2 replies; 7+ messages in thread
From: Hack Kampbjørn @ 2001-09-08  2:51 UTC
  To: C. Porter Bassett; +Cc: cygwin


Standard disclaimer about only reading the digest list ...


cygwin-digest-help@sources.redhat.com wrote:
> Subject: Re: wget not behaving correctly
> Date: Thu, 06 Sep 2001 15:41:18 -0400
> From: Charles Wilson <cwilson@ece.gatech.edu>
> To: "C. Porter Bassett" <cporter@byu.edu>
> CC: cygwin@cygwin.com
> 
> C. Porter Bassett wrote:
> 
> > When I run
> >  wget -r -A gif,jpg,png,bmp
> > http://fan.theonering.net/rolozo/galleries.php?mode=2
> > it cannot find the file
> > http://fan.theonering.net/rolozo/galleries.php?mode=2 , but it can when run
> > in linux.  Perhaps this has something to do with the ? in the URL?

Yes, this is the '?' in the URL. There can be two problems here, depending
on which shell you use. First, your shell may not pass the '?' to wget
but instead complain that it cannot expand the globbing character '?'
to a filename. I cannot tell whether this is the case, as you did not
include the full output of the wget command.
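
As a rough illustration of the first problem (a sketch only, assuming a POSIX-style shell such as bash; show_args is a made-up helper, not part of wget): quoting or escaping the URL ensures the '?' reaches the program untouched, whatever the shell's globbing rules are.

```shell
# Hypothetical helper: print each argument it receives on its own line,
# so you can see exactly what the shell hands to a program.
show_args() {
  printf '%s\n' "$@"
}

url='http://fan.theonering.net/rolozo/galleries.php?mode=2'

# Quoted: the shell passes the URL through as one unmodified argument.
show_args "$url"

# Backslash-escaped '?': also passed through literally.
show_args http://fan.theonering.net/rolozo/galleries.php\?mode=2
```

Both calls print the URL with the '?' intact. An unquoted, unescaped '?' is left to the shell's pattern matching, and shells differ in what they do when nothing matches.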

Second, wget will save the first retrieved page as
'galleries.php?mode=2', which is an illegal filename on Windows file
systems (because of the '?'). Again, I cannot tell if this is the case.
There are a couple of patches to work around this in the wget archives
( http://sunsite.dk/wget/ ), but none has been accepted by the Wget
maintainers (too Windows-specific), nor has the Cygwin maintainer used
one in the Cygwin version.

In the future, consider including full debug output (`wget -d ...`) when
reporting a problem.

> 
> I think it's a version thing.  Cygwin's wget is 1.6.1, the latest
> version is 1.7.  I'll update wget once cygwin-1.3.3 is out (and no, I
> don't know when that will be).

Chuck, is there any extra functionality in cygwin-1.3.3, or is this just
a resource problem (limited time)?
In the second case, should I take over maintainership of cygwin wget (a
version 1.7.1 is expected any time now)?

> 
> --Chuck
> 

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn               hack@hackdata.com
HackLine                     +45 2031 7799

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/


* Re: wget not behaving correctly
  2001-09-08  2:51 ` wget not behaving correctly Hack Kampbjørn
@ 2001-09-08  8:47   ` Charles Wilson
  2001-09-08 13:42   ` C. Porter Bassett
  1 sibling, 0 replies; 7+ messages in thread
From: Charles Wilson @ 2001-09-08  8:47 UTC
  To: Hack Kampbjørn; +Cc: C. Porter Bassett, cygwin


Hack Kampbjørn wrote:


> Second, wget will save the first retrieved page as
> 'galleries.php?mode=2', which is an illegal filename on Windows file
> systems (because of the '?'). Again, I cannot tell if this is the case.
> There are a couple of patches to work around this in the wget archives
> ( http://sunsite.dk/wget/ ), but none has been accepted by the Wget
> maintainers (too Windows-specific), nor has the Cygwin maintainer used
> one in the Cygwin version.


I didn't know they existed.  I'll take a look when I get ready to 
release the next cygwin-wget.

> Chuck, is there any extra functionality in cygwin-1.3.3 or is this just
> a resource problem (limited time) ?


Yeah, it's "just" a time thing; I got pretty bogged down in working on 
the latest cygwin (well, actually just *testing* cgf & corinna & etc's 
work.  I did no "real" work.)

> In the second case, should I take over maintainership of cygwin wget (a
> version 1.7.1 is expected any time now)?


I'd love to turn it over to somebody, if there are no objections.  One
worry, though: you aren't very active on the mailing list.  I count only
four messages from you over the last six months (but ALL were about
wget, so that's good) -- plus one message in May 2000 about CVS.  You do
seem to be better informed than I am about wget, where it's going, current
development, etc. -- but will you answer list questions about it and
stuff?  There's more to maintaining a cygwin package than just putting
out timely releases (that's good, because otherwise with my track
record, I'd be out of a job! <g>)

--Chuck




* Re: wget not behaving correctly
  2001-09-08  2:51 ` wget not behaving correctly Hack Kampbjørn
  2001-09-08  8:47   ` Charles Wilson
@ 2001-09-08 13:42   ` C. Porter Bassett
  1 sibling, 0 replies; 7+ messages in thread
From: C. Porter Bassett @ 2001-09-08 13:42 UTC
  To: cygwin


----- Original Message -----
From: "Hack Kampbjørn" <hack@hackdata.com>

>Yes this is the '?' in the URL. There can be two problems here depending
>on which shell you use. First your shell may not send the '?' to wget
>but directly complain that it cannot expand the globbing character '?'
>to a filename, I cannot tell if this is the case as you did not include
>full output of running the wget command.

OK, here it is:
1427:PWORK:~$ wget.exe -rd -A gif,jpg,png,bmp
http://fan.theonering.net/rolozo/galleries.php?mode=2
DEBUG output created by Wget 1.6 on cygwin32.

parseurl ("http://fan.theonering.net/rolozo/galleries.php?mode=2") -> host
fan.theonering.net -> opath rolozo/galleries.php?mode=2 -> dir rolozo ->
file galleries.php?mode=2 -> ndir rolozo
newpath: /rolozo/galleries.php?mode=2
Checking for fan.theonering.net.
This is the first time I hear about host fan.theonering.net by that name.
--14:27:54--  http://fan.theonering.net/rolozo/galleries.php?mode=2
           => `fan.theonering.net/rolozo/galleries.php?mode=2'
Connecting to fan.theonering.net:80... Created fd 3.
connected!
---request begin---
GET /rolozo/galleries.php?mode=2 HTTP/1.0
User-Agent: Wget/1.6
Host: fan.theonering.net
Accept: */*

---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Sat, 08 Sep 2001 20:35:39 GMT
Server: Apache/1.3.19 (Unix) PHP/4.0.4pl1 mod_perl/1.25
X-Powered-By: PHP/4.0.4pl1
Connection: close
Content-Type: text/html


Length: unspecified [text/html]
fan.theonering.net/rolozo/galleries.php?mode=2: No such file or directory
Closing fd 3

Cannot write to `fan.theonering.net/rolozo/galleries.php?mode=2' (No such
file or directory).

FINISHED --14:27:54--
Downloaded: 0 bytes in 0 files
1427:PWORK:~$





* Re: wget not behaving correctly
       [not found] <999982012.13196.ezmlm@sources.redhat.com>
@ 2001-09-16 15:21 ` Hack Kampbjørn
  0 siblings, 0 replies; 7+ messages in thread
From: Hack Kampbjørn @ 2001-09-16 15:21 UTC
  To: C. Porter Bassett; +Cc: cygwin


Sorry about the delay, have been too busy this week 8-(


cygwin-digest-help@sources.redhat.com wrote:
> Subject: Re: wget not behaving correctly
> Date: Sat, 8 Sep 2001 14:34:59 -0600
> From: "C. Porter Bassett" <porter@et.byu.edu>
> Reply-To: "C. Porter Bassett" <cporter@byu.edu>
> To: <cygwin@cygwin.com>
> 
> ----- Original Message -----
> From: "Hack Kampbjørn" <hack@hackdata.com>
> 
> >Yes this is the '?' in the URL. There can be two problems here depending
> >on which shell you use. First your shell may not send the '?' to wget
> >but directly complain that it cannot expand the globbing character '?'
> >to a filename, I cannot tell if this is the case as you did not include
> >full output of running the wget command.
> 
> OK, here it is:
> 1427:PWORK:~$ wget.exe -rd -A gif,jpg,png,bmp
> http://fan.theonering.net/rolozo/galleries.php?mode=2
> DEBUG output created by Wget 1.6 on cygwin32.
> 
> parseurl ("http://fan.theonering.net/rolozo/galleries.php?mode=2") -> host
> fan.theonering.net -> opath rolozo/galleries.php?mode=2 -> dir rolozo ->
> file galleries.php?mode=2 -> ndir rolozo
> newpath: /rolozo/galleries.php?mode=2
> Checking for fan.theonering.net.
> This is the first time I hear about host fan.theonering.net by that name.
> --14:27:54--  http://fan.theonering.net/rolozo/galleries.php?mode=2
>            => `fan.theonering.net/rolozo/galleries.php?mode=2'
> Connecting to fan.theonering.net:80... Created fd 3.
> connected!
> ---request begin---
> GET /rolozo/galleries.php?mode=2 HTTP/1.0
> User-Agent: Wget/1.6
> Host: fan.theonering.net
> Accept: */*
> 
> ---request end---
> HTTP request sent, awaiting response... HTTP/1.1 200 OK
> Date: Sat, 08 Sep 2001 20:35:39 GMT
> Server: Apache/1.3.19 (Unix) PHP/4.0.4pl1 mod_perl/1.25
> X-Powered-By: PHP/4.0.4pl1
> Connection: close
> Content-Type: text/html
> 
> Length: unspecified [text/html]
> fan.theonering.net/rolozo/galleries.php?mode=2: No such file or directory
> Closing fd 3
> 
> Cannot write to `fan.theonering.net/rolozo/galleries.php?mode=2' (No such
> file or directory).

OK, so it is the problem of '?' being an illegal character on Windows
file systems.

> 
> FINISHED --14:27:54--
> Downloaded: 0 bytes in 0 files
> 1427:PWORK:~$
> 

You can work around it with the --output-document option, but then you
lose the recursion.
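
For a single page, that workaround might look like the sketch below. The output name galleries.html is only an example, and the download is wrapped in a function (fetch_page, a made-up name) so that nothing is fetched until you call it.

```shell
url='http://fan.theonering.net/rolozo/galleries.php?mode=2'
out='galleries.html'   # example local name that contains no '?'

# Not run automatically: call fetch_page to perform the download.
# --output-document writes the retrieved page to "$out" instead of a
# name derived from the URL.
fetch_page() {
  wget --output-document="$out" "$url"
}
```

Quoting "$url" also keeps the shell from treating the '?' as a globbing character.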

What I use is this patch (it should apply cleanly to wget-1.6 too). Yes, it's
a crude hack, and it doesn't address all the problems with illegal
characters. Some of the code in wget doesn't expect the file to be
saved under a different name than on the web server, so you lose some
options too (IIRC --convert-links).

I keep forgetting that the wget-patch list isn't archived, and not every
patch is sent to the wget list. Sorry for sending you on a fruitless
search there 8-(

Index: src/url.c
===================================================================
RCS file: /pack/anoncvs/wget/src/url.c,v
retrieving revision 1.21.2.1
diff -u -r1.21.2.1 url.c
--- src/url.c   2000/12/17 19:28:20     1.21.2.1
+++ src/url.c   2001/02/03 15:53:24
@@ -1272,16 +1272,17 @@
          file = nfile;
        }
     }
-  /* DOS-ish file systems don't like `%' signs in them; we change it
-     to `@'.  */
-#ifdef WINDOWS
+  /* Windows file systems don't like `?' signs in them; we change it
+     to `@'. 
+     #### Note: nor are \ / : * " < > | allowed */
+#if defined(WINDOWS) || defined (__CYGWIN__)
   {
     char *p = file;
     for (p = file; *p; p++)
-      if (*p == '%')
+      if (*p == '?')
        *p = '@';
   }
-#endif /* WINDOWS */
+#endif /* WINDOWS or CYGWIN */
 
   /* Check the cases in which the unique extensions are not used:
      1) Clobbering is turned off (-nc).
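
The same substitution can be tried outside wget. The sketch below is not part of the patch; it is a hypothetical shell helper (sanitize_component is a made-up name) that maps '?' and the other characters listed in the #### note to '@', using tr. Because '/' is included, it is meant for a single file-name component, not a whole path.

```shell
# Hypothetical helper: replace characters that Windows file systems reject
# in a file-name component (\ / : * ? " < > |) with '@'.
# The second tr argument has one '@' per character in the first.
sanitize_component() {
  printf '%s\n' "$1" | tr '\\/:*?"<>|' '@@@@@@@@@'
}

sanitize_component 'galleries.php?mode=2'   # prints galleries.php@mode=2
```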


-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn               hack@hackdata.com
HackLine                     +45 2031 7799


* Re: wget not behaving correctly
  2001-09-06 12:35 C. Porter Bassett
  2001-09-06 12:41 ` Charles Wilson
@ 2001-09-06 14:30 ` Andrew Markebo
  1 sibling, 0 replies; 7+ messages in thread
From: Andrew Markebo @ 2001-09-06 14:30 UTC
  To: C. Porter Bassett; +Cc: cygwin

/ "C. Porter Bassett" <porter@et.byu.edu> wrote:
| When I run
|  wget -r -A gif,jpg,png,bmp
| http://fan.theonering.net/rolozo/galleries.php?mode=2
| it cannot find the file
| http://fan.theonering.net/rolozo/galleries.php?mode=2 , but it can when run
| in linux.  Perhaps this has something to do with the ? in the URL?

Could be the '?', yes. Try putting quotes (") around the URL:

 wget -r -A gif,jpg,png,bmp "http://fan.theonering.net/rolozo/galleries.php?mode=2"

or escape the '?' as \?

        /Andy

-- 
 The eye of the beholder rests on the beauty!


* Re: wget not behaving correctly
  2001-09-06 12:35 C. Porter Bassett
@ 2001-09-06 12:41 ` Charles Wilson
  2001-09-06 14:30 ` Andrew Markebo
  1 sibling, 0 replies; 7+ messages in thread
From: Charles Wilson @ 2001-09-06 12:41 UTC
  To: C. Porter Bassett; +Cc: cygwin

C. Porter Bassett wrote:

> When I run
>  wget -r -A gif,jpg,png,bmp
> http://fan.theonering.net/rolozo/galleries.php?mode=2
> it cannot find the file
> http://fan.theonering.net/rolozo/galleries.php?mode=2 , but it can when run
> in linux.  Perhaps this has something to do with the ? in the URL?


I think it's a version thing.  Cygwin's wget is 1.6.1, the latest 
version is 1.7.  I'll update wget once cygwin-1.3.3 is out (and no, I 
don't know when that will be).

--Chuck





* wget not behaving correctly
@ 2001-09-06 12:35 C. Porter Bassett
  2001-09-06 12:41 ` Charles Wilson
  2001-09-06 14:30 ` Andrew Markebo
  0 siblings, 2 replies; 7+ messages in thread
From: C. Porter Bassett @ 2001-09-06 12:35 UTC
  To: cygwin

When I run
 wget -r -A gif,jpg,png,bmp
http://fan.theonering.net/rolozo/galleries.php?mode=2
it cannot find the file
http://fan.theonering.net/rolozo/galleries.php?mode=2 , but it can when run
in linux.  Perhaps this has something to do with the ? in the URL?




