public inbox for cygwin-apps@cygwin.com
 help / color / mirror / Atom feed
* [PATCH] make setup mirror list more like web page not just urls
@ 2017-11-15 21:35 Brian Inglis
  2017-11-17 13:48 ` Ken Brown
  2017-11-17 13:53 ` Jon Turney
  0 siblings, 2 replies; 15+ messages in thread
From: Brian Inglis @ 2017-11-15 21:35 UTC (permalink / raw)
  To: cygwin-apps

[-- Attachment #1: Type: text/plain, Size: 2233 bytes --]

Hi folks,

[reposted without text attachment to see if patch gets thru]

Working on a FAST_CWD FAQ, looking at the Setup mirror site page and the web
page, I wondered if we could usefully add the mirrors.lst region and territory
(called area and location in the code) to the Setup display, to avoid users
having to refer to the web page to find out where mirrors are based geographically.

The first change was to prefix the displayed_url member in site.cc
site_list_type::init "constructor" with "area - location - ".

The sort key was the mirror url host name with the domain components in reverse
order to sort non-country TLDs together before CC TLDs.
The code tested for TLD length == 3 to distinguish between CC and non-CC TLDs,
as this code is over 10 years old and there were only the original com, edu,
gov, mil, net, org non-CC domains, before other gTLDs were added.
That test was changed to handle all gTLDs with length >= 3, to prefix area and
location to the sort key member, reverse the domain components in the servername
field instead of the url host, and suffix the url protocol prefix to keep the
key unique where sites offer both ftp and http mirrors, rather than the whole
url in the original.

For checking, a short script used awk to do the same when fed with mirrors.lst,
produce lines with the displayed url and sort key separated by a tab, and sort
by the key to produce the displayed urls in the same order as the code should.
The output of the script [was] in the first text attachment to show the
displayed urls in the sorted order.

When this was displayed in setup, some of the urls were cut off by the list box
border, estimated about 3 ems too narrow.
The width of the list box and related controls in res.rc was increased by about
30 pixels, and the position of the Add button moved over the same amount, to
give an acceptable display.

The required patch is attached for discussion: some may not like the display of
the default mirror from /etc/setup/setup.rc last_mirror, which appears in the
list as " - - url", as the other mirrors.lst fields are not currently saved
under last_mirror, and more work may be needed to improve this.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada


[-- Attachment #2: 0001-setup-site.cc-site_list_type-init-display-and-order.patch --]
[-- Type: text/plain, Size: 3509 bytes --]

diff --git a/res.rc b/res.rc
index 80d1bf1..fa90a65 100644
--- a/res.rc
+++ b/res.rc
@@ -135,10 +135,10 @@ CAPTION "Cygwin Setup - Choose Download Site(s)"
 FONT 8, "MS Shell Dlg"
 BEGIN
     ICON            IDI_CYGWIN,IDC_HEADICON,SETUP_HEADICON_X,0,21,20
-    LISTBOX         IDC_URL_LIST,66,45,185,110,LBS_NOINTEGRALHEIGHT | 
+    LISTBOX         IDC_URL_LIST,66,45,216,110,LBS_NOINTEGRALHEIGHT | 
                     LBS_EXTENDEDSEL | WS_VSCROLL | WS_HSCROLL | WS_GROUP | 
                     WS_TABSTOP
-    LTEXT           "Available Download Sites:",IDC_STATIC,66,34,183,8,NOT 
+    LTEXT           "Available Download Sites:",IDC_STATIC,66,34,216,8,NOT 
                     WS_GROUP
     CONTROL         "",IDC_HEADSEPARATOR,"Static",SS_BLACKFRAME | SS_SUNKEN,0,28,
                     SETUP_STANDARD_DIALOG_W,1
@@ -146,10 +146,10 @@ BEGIN
                     IDC_STATIC,21,9,239,16,NOT WS_GROUP
     LTEXT           "Choose A Download Site",IDC_STATIC_HEADER_TITLE,7,0,258,
                     8,NOT WS_GROUP
-    EDITTEXT        IDC_EDIT_USER_URL,65,160,185,14,ES_AUTOHSCROLL | 
+    EDITTEXT        IDC_EDIT_USER_URL,65,160,216,14,ES_AUTOHSCROLL | 
                     WS_GROUP
     LTEXT           "User URL:",IDC_SITE_USERURL,15,162,45,8,NOT WS_GROUP
-    PUSHBUTTON      "Add",IDC_BUTTON_ADD_URL,255,160,50,14
+    PUSHBUTTON      "Add",IDC_BUTTON_ADD_URL,288,160,50,14
 END
 
 IDD_NET DIALOG DISCARDABLE  0, 0, SETUP_STANDARD_DIALOG_DIMS
diff --git a/setup.exe.manifest b/setup.exe.manifest
old mode 100755
new mode 100644
diff --git a/setup64.exe.manifest b/setup64.exe.manifest
old mode 100755
new mode 100644
diff --git a/site.cc b/site.cc
index c33da36..b8b988e 100644
--- a/site.cc
+++ b/site.cc
@@ -154,30 +154,30 @@ site_list_type::init (const string &_url, const string &_servername,
   if (url.at(url.length()-1) != '/')
     url.append("/");
 
-  /* displayed_url is protocol and site name part of url */
+  /* displayed_url is area, location, protocol part of url, and servername */
   string::size_type path_offset = url.find ("/", url.find ("//") + 2);
-  displayed_url = url.substr(0, path_offset);
-
-  /* the sorting key is hostname components in reverse order (to sort by country code)
-     plus the url (to ensure uniqueness) */
-  key = string();
-  string::size_type last_idx = displayed_url.length () - 1;
-  string::size_type idx = url.find_last_of("./", last_idx);
-  if (last_idx - idx == 3)
+  displayed_url = area + " - " + location + " - " + url.substr (0, path_offset);
+
+  /* the sorting key is area, location, servername components in reverse order
+   * (to sort by country code) plus the protocol (to ensure uniqueness) */
+  key = area + " " + location + " ";
+  string::size_type last_idx = servername.length ();
+  string::size_type idx = servername.find_last_of (".", last_idx);
+
+  if (last_idx - idx >= 3)
   {
-    /* Sort non-country TLDs (.com, .net, ...) together. */
+    /* Sort gTLDs (.com, .net, .info, ...) together. */
     key += " ";
   }
+
   do
   {
-    key += url.substr(idx + 1, last_idx - idx);
-    key += " ";
+    key += servername.substr (idx + 1, last_idx - idx) + " ";
     last_idx = idx - 1;
-    idx = url.find_last_of("./", last_idx);
-    if (idx == string::npos)
-      idx = 0;
-  } while (idx > 0);
-  key += url;
+    idx = servername.find_last_of (".", last_idx);
+  } while (idx != string::npos && idx > 0);
+
+  key += url.substr (0, url.find (':'));
 }
 
 site_list_type::site_list_type (const string &_url,


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-15 21:35 [PATCH] make setup mirror list more like web page not just urls Brian Inglis
@ 2017-11-17 13:48 ` Ken Brown
  2017-11-19 20:24   ` Ken Brown
  2017-11-17 13:53 ` Jon Turney
  1 sibling, 1 reply; 15+ messages in thread
From: Ken Brown @ 2017-11-17 13:48 UTC (permalink / raw)
  To: cygwin-apps

On 11/15/2017 4:35 PM, Brian Inglis wrote:
> Hi folks,
> 
> [reposted without text attachment to see if patch gets thru]
> 
> Working on a FAST_CWD FAQ, looking at the Setup mirror site page and the web
> page, I wondered if we could usefully add the mirrors.lst region and territory
> (called area and location in the code) to the Setup display, to avoid users
> having to refer to the web page to find out where mirrors are based geographically.

I think this is a good idea in principle, but it still needs some work.

> The first change was to prefix the displayed_url member in site.cc
> site_list_type::init "constructor" with "area - location - ".
> 
> The sort key was the mirror url host name with the domain components in reverse
> order to sort non-country TLDs together before CC TLDs.
> The code tested for TLD length == 3 to distinguish between CC and non-CC TLDs,
> as this code is over 10 years old and there were only the original com, edu,
> gov, mil, net, org non-CC domains, before other gTLDs were added.
> That test was changed to handle all gTLDs with length >= 3, to prefix area and
> location to the sort key member, reverse the domain components in the servername
> field instead of the url host, and suffix the url protocol prefix to keep the
> key unique where sites offer both ftp and http mirrors, rather than the whole
> url in the original.

Adding the protocol to the key instead of the full url isn't enough to 
guarantee uniqueness of the key.  Users can add their own sites for 
which that fails.  Also, you shouldn't be relying on the servername 
field being filled in.  For example, the function 
SiteSetting::registerSavedSite() explicitly constructs a site 'tempSite' 
that has an empty servername.

> The required patch is attached for discussion: some may not like the display of
> the default mirror from /etc/setup/setup.rc last_mirror, which appears in the
> list as " - - url", as the other mirrors.lst fields are not currently saved
> under last_mirror, and more work may be needed to improve this.

If the saved url corresponds to a site in the list, I think setup should 
recognize this rather than listing the same site again as " - - url".

Ken

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-15 21:35 [PATCH] make setup mirror list more like web page not just urls Brian Inglis
  2017-11-17 13:48 ` Ken Brown
@ 2017-11-17 13:53 ` Jon Turney
  2017-11-20 21:30   ` Brian Inglis
  1 sibling, 1 reply; 15+ messages in thread
From: Jon Turney @ 2017-11-17 13:53 UTC (permalink / raw)
  To: cygwin-apps; +Cc: Brian Inglis

On 15/11/2017 21:35, Brian Inglis wrote:
> I wondered if we could usefully add the mirrors.lst region and territory
> (called area and location in the code) to the Setup display, to avoid users
> having to refer to the web page to find out where mirrors are based geographically.

Thanks for the patch.

Nice idea.  I hadn't noticed that this information is in mirrors.lst, 
but we're failing to use it.

> The first change was to prefix the displayed_url member in site.cc
> site_list_type::init "constructor" with "area - location - ".

The 'real solution' to this would be to drag setup into the present 
century and use a grid control to present this information.

Lots of work, though :(

> The sort key was the mirror url host name with the domain components in reverse
> order to sort non-country TLDs together before CC TLDs.
> The code tested for TLD length == 3 to distinguish between CC and non-CC TLDs,
> as this code is over 10 years old and there were only the original com, edu,
> gov, mil, net, org non-CC domains, before other gTLDs were added.
> That test was changed to handle all gTLDs with length >= 3, to prefix area and
> location to the sort key member, reverse the domain components in the servername
> field instead of the url host, and suffix the url protocol prefix to keep the
> key unique where sites offer both ftp and http mirrors, rather than the whole
> url in the original.
[...]

This code needs to handle manually added URLs with non-FQDNs.

Although we give no indication of it in the UI, I think you can also 
just add a path.

Can you test your changes in those cases?

It's also a long-standing defect here that if you add two URLs with the 
same protocol/host, they aren't distinguishable [1]

[1] https://cygwin.com/ml/cygwin-apps/2010-11/msg00178.html

> When this was displayed in setup, some of the urls were cut off by the list box
> border, estimated about 3 ems too narrow.
> The width of the list box and related controls in res.rc was increased by about
> 30 pixels, and the position of the Add button moved over the same amount, to
> give an acceptable display.
> 
> The required patch is attached for discussion: some may not like the display of
> the default mirror from /etc/setup/setup.rc last_mirror, which appears in the
> list as " - - url", as the other mirrors.lst fields are not currently saved
> under last_mirror, and more work may be needed to improve this.

Yeah, that's going to need fixing.

- If the URL matches one in the list, that item should be selected.
- Store area and location so we can populate those fields if it doesn't.

Please, please use git-format-patch in future.

The patch also has some spurious changes to the mode of the manifest files.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-17 13:48 ` Ken Brown
@ 2017-11-19 20:24   ` Ken Brown
  0 siblings, 0 replies; 15+ messages in thread
From: Ken Brown @ 2017-11-19 20:24 UTC (permalink / raw)
  To: cygwin-apps

On 11/17/2017 8:48 AM, Ken Brown wrote:
> which that fails.  Also, you shouldn't be relying on the servername 
> field being filled in.

One further comment about the servername field: AFAICT, the only use 
that's made of this field is to distinguish "official" mirrors (those 
listed in mirrors.lst, either now or in the past) from sites added by 
the user.  See 'check_dropped_mirrors()' and 'write_cache_list()' in 
site.cc.

I find this confusing, and it apparently threw Brian off also. 
Moreover, it seems to be undocumented.  I suggest either documenting 
this or changing it.  My preference is to change it, but I'm still 
thinking about how to best do that.  I'll probably send a patch later.

Ken

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-17 13:53 ` Jon Turney
@ 2017-11-20 21:30   ` Brian Inglis
  2017-11-20 22:59     ` Ken Brown
  0 siblings, 1 reply; 15+ messages in thread
From: Brian Inglis @ 2017-11-20 21:30 UTC (permalink / raw)
  To: cygwin-apps

[-- Attachment #1: Type: text/plain, Size: 6322 bytes --]

On 2017-11-17 06:53, Jon Turney wrote:
On 11/17/2017 8:48 AM, Ken Brown wrote:
> On 15/11/2017 21:35, Brian Inglis wrote:

Thanks for your comments, I understand all your points, and agree with them.
I merged the posts and responses as they addressed the same issues.
I also applied Ken's setup patch and rejigged mine where required.

KB> Ken Brown
JT> Jon Turney

KB> I think this is a good idea in principle, but it still needs some work.

JT> use a grid control to present this information.

Don't think I want to learn that much about Windows controls!
The width of the longest fields in each column would need to adjust the grid
and dialogue box widths, and Add button position, dynamically, which I did
manually, and the internal blank space may be a bit harder to scan for users
who are interested only in their geographic region, country, or territory
- these are pretty well aligned as strings.

KB> Adding the protocol to the key instead of the full url isn't enough to
KB> guarantee uniqueness of the key.
KB> Users can add their own sites for which that fails.
KB> Also, you shouldn't be relying on the servername field being filled in.
KB> For example, the function SiteSetting::registerSavedSite() explicitly
KB> constructs a site 'tempSite' that has an empty servername.

JT> This code needs to handle manually added URLs with non-FQDNs.
JT> - - url...'s going to need fixing.

If host name provided, use as servername, or default to localhost.

If at least two domain components:
if CC TLD, use as area in uppercase, or
if gTLD, use as area with init cap;
if 2-3 char TLD and 2-3 char 2nd level domain, may be subTLD, so if 3rd level
domain exists and does not start with (or contain?) "cygwin", "ftp", "mirror",
"www", add space plus 2-3 char 2nd level domain with init cap to area.
Use next level domain as location with init cap and uppercase after any "-".
Otherwise default area to Local, and use servername as location with init cap.
This adhoc approach seems to work well enough and avoids having to drag in and
process the rules in the Public Suffix List (libpsl5 and
publicsuffix-list-dafsa).

Examples:

Input	->
    List
    Displayed
    Key

http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com		->
http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/;mirror.cpsc.ucalgary.ca;CA;Ucalgary
CA - Ucalgary - http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com
CA Ucalgary ca ucalgary cpsc mirror
http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/

[cyg]file://server/c$/usr/local/cygwin/var/cache/setup/packages	->
[cyg]file://server/c$/usr/local/cygwin/var/cache/setup/packages/;server;Local;Server
Local - Server - file://server/c$/usr/local/cygwin/var/cache/setup/packages
Local Server server file://server/c$/usr/local/cygwin/var/cache/setup/packages/

cygfile:///var/cache/setup/packages		->
cygfile:///var/cache/setup/packages/;localhost;Local;Localhost
Local - Localhost - cygfile://var/cache/setup/packages
Local Localhost localhost cygfile://var/cache/setup/packages/

file:///c|/usr/local/cygwin/var/cache/setup/packages		->
file:///c|/usr/local/cygwin/var/cache/setup/packages/;localhost;Local;Localhost
Local - Localhost - file:///c|/usr/local/cygwin/var/cache/setup/packages
Local Localhost localhost file:///c|/usr/local/cygwin/var/cache/setup/packages/

Cygwin file:// does not appear to use /// or support anything but local files,
so should always be treated as localhost; or does it support servers, but they
can be treated as if directories under Windows, so we need to do a server or
directory check, or that check is done at a lower level?

JT> you can also just add a path.

If UNC path with server, use as above, but include share name in display,
otherwise default as above.

Examples:

\\server\c$\usr\local\cygwin\var\cache\setup\packages		->
\\server\c$\usr\local\cygwin\var\cache\setup\packages/;server;Local;Server
Local - Server - \\server\c$
Local Server server \\server\c$\usr\local\cygwin\var\cache\setup\packages/

c:\usr\local\cygwin\var\cache\setup\packages			->
c:\usr\local\cygwin\var\cache\setup\packages/;localhost;Local;Localhost
Local - Localhost - c:\usr\local\cygwin\var\cache\setup\packages
Local Localhost localhost c:\usr\local\cygwin\var\cache\setup\packages/

Where are the cached sites stashed - setup keeps showing my old test entries?

JT> if you add two URLs with the same protocol/host, they aren't distinguishable

Check for and eliminate duplicates after sort if same site_list fields.
Handled by KB patch.
Note on flag added by that patch - would is_current, is_good, or is_known be a
better reflection of what that flag means?
Should we add ==(string) to compare only the url member?
Should we add a constructor(string,is_official) to refactor string to site
conversions, and the url additions?

KB> If the saved url corresponds to a site in the list, I think setup should
KB> recognize this rather than listing the same site again as " - - url".

JT> If the URL matches one in the list, that item should be selected.

Check for URL match with site list and use match as site:
added for tempSite in RegisterSavedSites and newsite on Add;
required as key now includes area and location, not just URL.
Could we add a check to a string contructor to return only a reference to an
existing entry, if the url matches, to refactor the above changes?

JT> Store area and location so we can populate those fields if it doesn't.

See above.

JT> Please, please use git-format-patch in future.

I can't tell from looking, does it differ from diff output, and is there a
reason to use git-format-patch for review?
I use diffs for feedback until it DTRT.
I can manage one commit and one format-patch on a good day.
I should just stick to using what Tortoise Git supports.
My record with git includes a lot of rm -rf and re-clone, as the files, trees,
and indices get out of whack when I forget some incantation and try to fix it!
Very brittle and unfriendly - if it's going to bork itself, just don't do that!
I never had a single problem with CVS or Hg in many years ;^>

JT> The patch also has some spurious changes to the mode of the manifest files.

Realized I shouldn't tidy up unnecessary x modes under git control - habit
- gets me into trouble and I learn new things - checked out masters.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

[-- Attachment #2: 0001-setup-mirror-site-list-default-members-display-and-order-by-area-and-location.patch --]
[-- Type: text/plain, Size: 8935 bytes --]

diff --git a/res.rc b/res.rc
index 80d1bf1..fa90a65 100644
--- a/res.rc
+++ b/res.rc
@@ -135,10 +135,10 @@ CAPTION "Cygwin Setup - Choose Download Site(s)"
 FONT 8, "MS Shell Dlg"
 BEGIN
     ICON            IDI_CYGWIN,IDC_HEADICON,SETUP_HEADICON_X,0,21,20
-    LISTBOX         IDC_URL_LIST,66,45,185,110,LBS_NOINTEGRALHEIGHT | 
+    LISTBOX         IDC_URL_LIST,66,45,216,110,LBS_NOINTEGRALHEIGHT | 
                     LBS_EXTENDEDSEL | WS_VSCROLL | WS_HSCROLL | WS_GROUP | 
                     WS_TABSTOP
-    LTEXT           "Available Download Sites:",IDC_STATIC,66,34,183,8,NOT 
+    LTEXT           "Available Download Sites:",IDC_STATIC,66,34,216,8,NOT 
                     WS_GROUP
     CONTROL         "",IDC_HEADSEPARATOR,"Static",SS_BLACKFRAME | SS_SUNKEN,0,28,
                     SETUP_STANDARD_DIALOG_W,1
@@ -146,10 +146,10 @@ BEGIN
                     IDC_STATIC,21,9,239,16,NOT WS_GROUP
     LTEXT           "Choose A Download Site",IDC_STATIC_HEADER_TITLE,7,0,258,
                     8,NOT WS_GROUP
-    EDITTEXT        IDC_EDIT_USER_URL,65,160,185,14,ES_AUTOHSCROLL | 
+    EDITTEXT        IDC_EDIT_USER_URL,65,160,216,14,ES_AUTOHSCROLL | 
                     WS_GROUP
     LTEXT           "User URL:",IDC_SITE_USERURL,15,162,45,8,NOT WS_GROUP
-    PUSHBUTTON      "Add",IDC_BUTTON_ADD_URL,255,160,50,14
+    PUSHBUTTON      "Add",IDC_BUTTON_ADD_URL,288,160,50,14
 END
 
 IDD_NET DIALOG DISCARDABLE  0, 0, SETUP_STANDARD_DIALOG_DIMS
diff --git a/site.cc b/site.cc
index c9fac12..a2f9406 100644
--- a/site.cc
+++ b/site.cc
@@ -18,6 +18,7 @@
 
 #include <string>
 #include <algorithm>
+#include <cctype>
 
 #include "site.h"
 #include "win32.h"
@@ -141,6 +142,130 @@ SiteSetting::~SiteSetting ()
     save ();
 }
 
+static bool
+common (const string &domain)
+{
+    const string prefixes[] = { "cygwin", "ftp", "mirror", "www" };
+
+    for (auto& s: prefixes)
+    {
+      if (s.length() <= domain.length() &&
+	  mismatch(s.begin(), s.end(), domain.begin()).first == s.end())
+	return true;
+    }
+
+    return false;
+}
+
+static string
+area_default (const string &servername)
+{
+  string::size_type last_idx = servername.length () - 1;
+  string::size_type idx = servername.find_last_of (".", last_idx);
+
+  /* no dots => local server */
+  if (idx == string::npos)
+  {
+    return "Local";
+  }
+
+  string area = servername.substr (idx + 1, last_idx - idx);
+  /* gTLD - initial cap. */
+  area[0] = toupper (area[0]);
+
+  /* length 2 TLD => CC TLD - uppercase. */
+  if (last_idx - idx == 2)
+  {
+    area[1] = toupper (area[1]);
+  }
+
+  /* length 2-3 TLD => may have subTLD - check. */
+  if (last_idx - idx <= 3)
+  {
+    last_idx = idx - 1;
+    idx = servername.find_last_of (".", last_idx);
+
+    /* domain below length 2-3 subTLD? */
+    if (idx != string::npos && last_idx - idx <= 3)
+    {
+      string::size_type next_idx = servername.find_last_of (".", idx - 1);
+
+      /* no more dots - begins at start */
+      if (next_idx == string::npos)
+      {
+	next_idx = -1;
+      }
+
+      /* check for common domain prefixes to ignore as location */
+      if (!common (servername.substr (next_idx + 1, idx - next_idx - 1)))
+      {
+	/* not common domain - add subTLD to area with initial cap. */
+	area += " " + toupper (servername[idx + 1]) +
+			servername.substr (idx + 2, last_idx - idx - 1);
+      }
+    }
+  }
+
+  return area;
+}
+
+static string
+location_default (const string &servername)
+{
+  string::size_type last_idx = servername.length () - 1;
+  string::size_type idx = servername.find_last_of (".", last_idx);
+  string location = servername;
+  /* initial cap. */
+  location[0] = toupper (location[0]);
+
+  /* no dots => local server */
+  if (idx == string::npos)
+  {
+    return location;
+  }
+
+  /* length 2-3 TLD => may have subTLD - check. */
+  if (last_idx - idx <= 3)
+  {
+    last_idx = idx - 1;
+    idx = servername.find_last_of (".", last_idx);
+
+    /* domain below length 2-3 subTLD? */
+    if (idx != string::npos && last_idx - idx <= 3)
+    {
+      string::size_type next_idx = servername.find_last_of (".", idx - 1);
+
+      /* no more dots - begins at start */
+      if (next_idx == string::npos)
+      {
+	next_idx = -1;
+      }
+
+      /* check for common domain prefixes to ignore as location */
+      if (!common (servername.substr (next_idx + 1, idx - next_idx - 1)))
+      {
+	/* not common domain - use that as location. */
+	last_idx = idx - 1;
+	idx = next_idx;
+      }
+    }
+  }
+
+  location = servername.substr (idx + 1, last_idx - idx);
+
+  /* initial cap */
+  location[0] = toupper (location[0]);
+
+  idx = 0;
+  while ((idx = location.find( '-', idx)) != string::npos)
+  {
+    ++idx;
+    location[idx] = toupper (location[idx]);
+  }
+
+  return location;
+}
+
 site_list_type::site_list_type (const string &_url,
 				const string &_servername,
 				const string &_area,
@@ -157,30 +282,72 @@ site_list_type::site_list_type (const string &_url,
   if (url.at(url.length()-1) != '/')
     url.append("/");
 
-  /* displayed_url is protocol and site name part of url */
-  string::size_type path_offset = url.find ("/", url.find ("//") + 2);
-  displayed_url = url.substr(0, path_offset);
+  /* get protocol, host flag, and site name parts of url */
+  string::size_type host_flag_posn = url.find ("//");
+  string::size_type path_offset = (host_flag_posn == string::npos)
+					? string::npos
+					: url.find ("/", host_flag_posn + 2);
+  /* not url-like, check for Windows UNC paths */
+  if (host_flag_posn == string::npos)
+  {
+    if ('\\' == url[0] && '\\' == url[1])
+    {
+      /* UNC path - get server */
+      host_flag_posn = 0;
+      /* terminal '/' ensures result */
+      path_offset = url.find_first_of ("\\/", host_flag_posn + 2);
+    }
+  }
+  /* default servername to url host or localhost */
+  if (servername.length() == 0) {
+      servername = (path_offset != string::npos &&
+		    host_flag_posn + 2 < path_offset)
+			? url.substr (host_flag_posn + 2,
+					path_offset - (host_flag_posn + 2))
+			: "localhost";
+  }
+  /* default area to TLD and maybe short 2nd level domain */
+  if (area.length() == 0) {
+      area = area_default (servername);
+  }
+  /* default location to next level domain */
+  if (location.length() == 0) {
+      location = location_default (servername);
+  }
+
+  /* the sorting key is area, location, domain name components in reverse order
+   * (to group by country code) plus url (to ensure uniqueness) */
+  key = area + " " + location + " ";
+  string::size_type last_idx = path_offset - 1;
+  string::size_type idx = url.find_last_of ("./", last_idx);
 
-  /* the sorting key is hostname components in reverse order (to sort by country code)
-     plus the url (to ensure uniqueness) */
-  key = string();
-  string::size_type last_idx = displayed_url.length () - 1;
-  string::size_type idx = url.find_last_of("./", last_idx);
-  if (last_idx - idx == 3)
+  if (3 <= last_idx - idx)
   {
-    /* Sort non-country TLDs (.com, .net, ...) together. */
+    /* Sort gTLDs (.com, .net, .info, ...) together. */
     key += " ";
   }
-  do
+
+  while (idx < last_idx)
   {
-    key += url.substr(idx + 1, last_idx - idx);
+    key += url.substr (idx + 1, last_idx - idx);
     key += " ";
     last_idx = idx - 1;
-    idx = url.find_last_of("./", last_idx);
-    if (idx == string::npos)
-      idx = 0;
-  } while (idx > 0);
+    idx = url.find_last_of ("./\\", last_idx);
+  }
+
   key += url;
+
+  /* if double (back)slashes at start of url */
+  if (host_flag_posn == 0)
+  {
+    /* check for next path delimiter */
+    string::size_type share_end = url.find_first_of ("\\/", path_offset + 1);
+    /* include UNC server share name */
+    if (share_end != string::npos)
+      path_offset = share_end;
+  }
+  /* displayed_url is area, location, protocol and site name part of url */
+  displayed_url = area + " - " + location + " - " + url.substr (0, path_offset);
 }
 
 site_list_type::site_list_type (site_list_type const &rhs)
@@ -379,6 +546,15 @@ void
 SiteSetting::registerSavedSite (const char * site)
 {
   site_list_type tempSite(site, "", "", "", false);
+
+  /* use site in list if matching url */
+  for (auto& s: all_site_list)
+    if (site == s.url)
+    {
+      tempSite = s;
+      break;
+    }
+
   SiteList::iterator i = find (all_site_list.begin(),
 			       all_site_list.end(), tempSite);
   if (i == all_site_list.end())
@@ -699,6 +875,15 @@ bool SitePage::OnMessageCmd (int id, HWND hwndctl, UINT code)
 	    if (other_url.size())
 	    {
 	    site_list_type newsite (other_url, "", "", "", false);
+
+	    /* use site in list if matching url */
+	    for (auto& s: all_site_list)
+	      if (other_url == s.url)
+	      {
+		newsite = s;
+		break;
+	      }
+
 	    SiteList::iterator i = find (all_site_list.begin(),
 					 all_site_list.end(), newsite);
 	    if (i == all_site_list.end())

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-20 21:30   ` Brian Inglis
@ 2017-11-20 22:59     ` Ken Brown
  2017-11-20 23:02       ` Ken Brown
  2017-11-22 16:42       ` Brian Inglis
  0 siblings, 2 replies; 15+ messages in thread
From: Ken Brown @ 2017-11-20 22:59 UTC (permalink / raw)
  To: cygwin-apps

On 11/20/2017 4:30 PM, Brian Inglis wrote:
> On 2017-11-17 06:53, Jon Turney wrote:
> On 11/17/2017 8:48 AM, Ken Brown wrote:
>> On 15/11/2017 21:35, Brian Inglis wrote:
> 
> Thanks for your comments, I understand all your points, and agree with them.
> I merged the posts and responses as they addressed the same issues.
> I also applied Ken's setup patch and rejigged mine where required.
> 
> KB> Ken Brown
> JT> Jon Turney
> 
> KB> I think this is a good idea in principle, but it still needs some work.
> 
> JT> use a grid control to present this information.
> 
> Don't think I want to learn that much about Windows controls!
> The width of the longest fields in each column would need to adjust the grid
> and dialogue box widths, and Add button position, dynamically, which I did
> manually, and the internal blank space may be a bit harder to scan for users
> who are interested only in their geographic region, country, or territory
> - these are pretty well aligned as strings.
> 
> KB> Adding the protocol to the key instead of the full url isn't enough to
> KB> guarantee uniqueness of the key.
> KB> Users can add their own sites for which that fails.
> KB> Also, you shouldn't be relying on the servername field being filled in.
> KB> For example, the function SiteSetting::registerSavedSite() explicitly
> KB> constructs a site 'tempSite' that has an empty servername.
> 
> JT> This code needs to handle manually added URLs with non-FQDNs.
> JT> - - url...'s going to need fixing.
> 
> If host name provided, use as servername, or default to localhost.
> 
> If at least two domain components:
> if CC TLD, use as area in uppercase, or
> if gTLD, use as area with init cap;
> if 2-3 char TLD and 2-3 char 2nd level domain, may be subTLD, so if 3rd level
> domain exists and does not start with (or contain?) "cygwin", "ftp", "mirror",
> "www", add space plus 2-3 char 2nd level domain with init cap to area.
> Use next level domain as location with init cap and uppercase after any "-".
> Otherwise default area to Local, and use servername as location with init cap.
> This adhoc approach seems to work well enough and avoids having to drag in and
> process the rules in the Public Suffix List (libpsl5 and
> publicsuffix-list-dafsa).
> 
> Examples:
> 
> Input	->
>      List
>      Displayed
>      Key
> 
> http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com		->
> http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/;mirror.cpsc.ucalgary.ca;CA;Ucalgary
> CA - Ucalgary - http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com
> CA Ucalgary ca ucalgary cpsc mirror
> http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/
> 
> [cyg]file://server/c$/usr/local/cygwin/var/cache/setup/packages	->
> [cyg]file://server/c$/usr/local/cygwin/var/cache/setup/packages/;server;Local;Server
> Local - Server - file://server/c$/usr/local/cygwin/var/cache/setup/packages
> Local Server server file://server/c$/usr/local/cygwin/var/cache/setup/packages/
> 
> cygfile:///var/cache/setup/packages		->
> cygfile:///var/cache/setup/packages/;localhost;Local;Localhost
> Local - Localhost - cygfile://var/cache/setup/packages
> Local Localhost localhost cygfile://var/cache/setup/packages/
> 
> file:///c|/usr/local/cygwin/var/cache/setup/packages		->
> file:///c|/usr/local/cygwin/var/cache/setup/packages/;localhost;Local;Localhost
> Local - Localhost - file:///c|/usr/local/cygwin/var/cache/setup/packages
> Local Localhost localhost file:///c|/usr/local/cygwin/var/cache/setup/packages/
> 
> Cygwin file:// does not appear to use /// or support anything but local files,
> so should always be treated as localhost; or does it support servers, but they
> can be treated as if directories under Windows, so we need to do a server or
> directory check, or that check is done at a lower level?
> 
> JT> you can also just add a path.
> 
> If UNC path with server, use as above, but include share name in display,
> otherwise default as above.
> 
> Examples:
> 
> \\server\c$\usr\local\cygwin\var\cache\setup\packages		->
> \\server\c$\usr\local\cygwin\var\cache\setup\packages/;server;Local;Server
> Local - Server - \\server\c$
> Local Server server \\server\c$\usr\local\cygwin\var\cache\setup\packages/
> 
> c:\usr\local\cygwin\var\cache\setup\packages			->
> c:\usr\local\cygwin\var\cache\setup\packages/;localhost;Local;Localhost
> Local - Localhost - c:\usr\local\cygwin\var\cache\setup\packages
> Local Localhost localhost c:\usr\local\cygwin\var\cache\setup\packages/
> 
> Where are the cached sites stashed - setup keeps showing my old test entries?
> 
> JT> if you add two URLs with the same protocol/host, they aren't distinguishable
> 
> Check for and eliminate duplicates after sort if same site_list fields.
> Handled by KB patch.
> Note on flag added by that patch - would is_current, is_good, or is_known be a
> better reflection of what that flag means?
> Should we add ==(string) to compare only the url member?
> Should we add a constructor(string,is_official) to refactor string to site
> conversions, and the url additions?
> 
> KB> If the saved url corresponds to a site in the list, I think setup should
> KB> recognize this rather than listing the same site again as " - - url".
> 
> JT> If the URL matches one in the list, that item should be selected.
> 
> Check for URL match with site list and use match as site:
> added for tempSite in RegisterSavedSites and newsite on Add;
> required as key now includes area and location, not just URL.
> Could we add a check to a string contructor to return only a reference to an
> existing entry, if the url matches, to refactor the above changes?
> 
> JT> Store area and location so we can populate those fields if it doesn't.
> 
> See above.
> 
> JT> Please, please use git-format-patch in future.
> 
> I can't tell from looking, does it differ from diff output, and is there a
> reason to use git-format-patch for review?
> I use diffs for feedback until it DTRT.
> I can manage one commit and one format-patch on a good day.
> I should just stick to using what Tortoise Git supports.
> My record with git includes a lot of rm -rf and re-clone, as the files, trees,
> and indices get out of whack when I forget some incantation and try to fix it!
> Very brittle and unfriendly - if it's going to bork itself, just don't do that!
> I never had a single problem with CVS or Hg in many years ;^>
> 
> JT> The patch also has some spurious changes to the mode of the manifest files.
> 
> Realized I shouldn't tidy up unnecessary x modes under git control - habit
> - gets me into trouble and I learn new things - checked out masters.

Hi Brian,

The issue of recognizing a URL that's already in the list is not fixed. 
Here's what I tried:

I ran setup with the last mirror (as stored in /etc/setup/setup.rc) 
being http://mirrors.kernel.org/sourceware/cygwin/.  The resulting 
mirror list display contained the following, highlighted:

   Org - Kernel - http://mirrors.kernel.org

So setup didn't recognize that 
http://mirrors.kernel.org/sourceware/cygwin/ was already in the list, 
displayed as

   United States - California - http://mirrors.kernel.org

Also, I don't like the fake 'area' and 'location' fields for user-added 
URLs.  For example, "Org" and "Kernel" are startling in the display 
above and add no information to the plain URL.  Another example is my 
own personal repository, http://sanibeltranquility.com/cygwin, which 
displays as

   Com - Sanibeltranquility - http://sanibeltranquility.com

I would rather see the area and location left blank, or perhaps replaced 
by "Unknown".  A third alternative would be to replace "Area - Location" 
by "User-added" or something similar.

Ken

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-20 22:59     ` Ken Brown
@ 2017-11-20 23:02       ` Ken Brown
  2017-11-22 16:48         ` Brian Inglis
  2017-11-22 16:42       ` Brian Inglis
  1 sibling, 1 reply; 15+ messages in thread
From: Ken Brown @ 2017-11-20 23:02 UTC (permalink / raw)
  To: cygwin-apps

On 11/20/2017 5:59 PM, Ken Brown wrote:
> On 11/20/2017 4:30 PM, Brian Inglis wrote:
>> On 2017-11-17 06:53, Jon Turney wrote:
>> On 11/17/2017 8:48 AM, Ken Brown wrote:
>>> On 15/11/2017 21:35, Brian Inglis wrote:
>>
>> Thanks for your comments, I understand all your points, and agree with 
>> them.
>> I merged the posts and responses as they addressed the same issues.
>> I also applied Ken's setup patch and rejigged mine where required.
>>
>> KB> Ken Brown
>> JT> Jon Turney
>>
>> KB> I think this is a good idea in principle, but it still needs some 
>> work.
>>
>> JT> use a grid control to present this information.
>>
>> Don't think I want to learn that much about Windows controls!
>> The width of the longest fields in each column would need to adjust 
>> the grid
>> and dialogue box widths, and Add button position, dynamically, which I 
>> did
>> manually, and the internal blank space may be a bit harder to scan for 
>> users
>> who are interested only in their geographic region, country, or territory
>> - these are pretty well aligned as strings.
>>
>> KB> Adding the protocol to the key instead of the full url isn't 
>> enough to
>> KB> guarantee uniqueness of the key.
>> KB> Users can add their own sites for which that fails.
>> KB> Also, you shouldn't be relying on the servername field being 
>> filled in.
>> KB> For example, the function SiteSetting::registerSavedSite() explicitly
>> KB> constructs a site 'tempSite' that has an empty servername.
>>
>> JT> This code needs to handle manually added URLs with non-FQDNs.
>> JT> - - url...'s going to need fixing.
>>
>> If host name provided, use as servername, or default to localhost.
>>
>> If at least two domain components:
>> if CC TLD, use as area in uppercase, or
>> if gTLD, use as area with init cap;
>> if 2-3 char TLD and 2-3 char 2nd level domain, may be subTLD, so if 
>> 3rd level
>> domain exists and does not start with (or contain?) "cygwin", "ftp", 
>> "mirror",
>> "www", add space plus 2-3 char 2nd level domain with init cap to area.
>> Use next level domain as location with init cap and uppercase after 
>> any "-".
>> Otherwise default area to Local, and use servername as location with 
>> init cap.
>> This adhoc approach seems to work well enough and avoids having to 
>> drag in and
>> process the rules in the Public Suffix List (libpsl5 and
>> publicsuffix-list-dafsa).
>>
>> Examples:
>>
>> Input    ->
>>      List
>>      Displayed
>>      Key
>>
>> http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com        ->
>> http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/;mirror.cpsc.ucalgary.ca;CA;Ucalgary 
>>
>> CA - Ucalgary - http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com
>> CA Ucalgary ca ucalgary cpsc mirror
>> http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/
>>
>> [cyg]file://server/c$/usr/local/cygwin/var/cache/setup/packages    ->
>> [cyg]file://server/c$/usr/local/cygwin/var/cache/setup/packages/;server;Local;Server 
>>
>> Local - Server - 
>> file://server/c$/usr/local/cygwin/var/cache/setup/packages
>> Local Server server 
>> file://server/c$/usr/local/cygwin/var/cache/setup/packages/
>>
>> cygfile:///var/cache/setup/packages        ->
>> cygfile:///var/cache/setup/packages/;localhost;Local;Localhost
>> Local - Localhost - cygfile://var/cache/setup/packages
>> Local Localhost localhost cygfile://var/cache/setup/packages/
>>
>> file:///c|/usr/local/cygwin/var/cache/setup/packages        ->
>> file:///c|/usr/local/cygwin/var/cache/setup/packages/;localhost;Local;Localhost 
>>
>> Local - Localhost - file:///c|/usr/local/cygwin/var/cache/setup/packages
>> Local Localhost localhost 
>> file:///c|/usr/local/cygwin/var/cache/setup/packages/
>>
>> Cygwin file:// does not appear to use /// or support anything but 
>> local files,
>> so should always be treated as localhost; or does it support servers, 
>> but they
>> can be treated as if directories under Windows, so we need to do a 
>> server or
>> directory check, or that check is done at a lower level?
>>
>> JT> you can also just add a path.
>>
>> If UNC path with server, use as above, but include share name in display,
>> otherwise default as above.
>>
>> Examples:
>>
>> \\server\c$\usr\local\cygwin\var\cache\setup\packages        ->
>> \\server\c$\usr\local\cygwin\var\cache\setup\packages/;server;Local;Server 
>>
>> Local - Server - \\server\c$
>> Local Server server 
>> \\server\c$\usr\local\cygwin\var\cache\setup\packages/
>>
>> c:\usr\local\cygwin\var\cache\setup\packages            ->
>> c:\usr\local\cygwin\var\cache\setup\packages/;localhost;Local;Localhost
>> Local - Localhost - c:\usr\local\cygwin\var\cache\setup\packages
>> Local Localhost localhost c:\usr\local\cygwin\var\cache\setup\packages/
>>
>> Where are the cached sites stashed - setup keeps showing my old test 
>> entries?
>>
>> JT> if you add two URLs with the same protocol/host, they aren't 
>> distinguishable
>>
>> Check for and eliminate duplicates after sort if same site_list fields.
>> Handled by KB patch.
>> Note on flag added by that patch - would is_current, is_good, or 
>> is_known be a
>> better reflection of what that flag means?
>> Should we add ==(string) to compare only the url member?
>> Should we add a constructor(string,is_official) to refactor string to 
>> site
>> conversions, and the url additions?
>>
>> KB> If the saved url corresponds to a site in the list, I think setup 
>> should
>> KB> recognize this rather than listing the same site again as " - - url".
>>
>> JT> If the URL matches one in the list, that item should be selected.
>>
>> Check for URL match with site list and use match as site:
>> added for tempSite in RegisterSavedSites and newsite on Add;
>> required as key now includes area and location, not just URL.
>> Could we add a check to a string contructor to return only a reference 
>> to an
>> existing entry, if the url matches, to refactor the above changes?
>>
>> JT> Store area and location so we can populate those fields if it 
>> doesn't.
>>
>> See above.
>>
>> JT> Please, please use git-format-patch in future.
>>
>> I can't tell from looking, does it differ from diff output, and is 
>> there a
>> reason to use git-format-patch for review?
>> I use diffs for feedback until it DTRT.
>> I can manage one commit and one format-patch on a good day.
>> I should just stick to using what Tortoise Git supports.
>> My record with git includes a lot of rm -rf and re-clone, as the 
>> files, trees,
>> and indices get out of whack when I forget some incantation and try to 
>> fix it!
>> Very brittle and unfriendly - if it's going to bork itself, just don't 
>> do that!
>> I never had a single problem with CVS or Hg in many years ;^>
>>
>> JT> The patch also has some spurious changes to the mode of the 
>> manifest files.
>>
>> Realized I shouldn't tidy up unnecessary x modes under git control - 
>> habit
>> - gets me into trouble and I learn new things - checked out masters.
> 
> Hi Brian,
> 
> The issue of recognizing a URL that's already in the list is not fixed. 
> Here's what I tried:
> 
> I ran setup with the last mirror (as stored in /etc/setup/setup.rc) 
> being http://mirrors.kernel.org/sourceware/cygwin/.  The resulting 
> mirror list display contained the following, highlighted:
> 
>    Org - Kernel - http://mirrors.kernel.org
> 
> So setup didn't recognize that 
> http://mirrors.kernel.org/sourceware/cygwin/ was already in the list, 
> displayed as
> 
>    United States - California - http://mirrors.kernel.org
> 
> Also, I don't like the fake 'area' and 'location' fields for user-added 
> URLs.  For example, "Org" and "Kernel" are startling in the display 
> above and add no information to the plain URL.  Another example is my 
> own personal repository, http://sanibeltranquility.com/cygwin, which 
> displays as
> 
>    Com - Sanibeltranquility - http://sanibeltranquility.com
> 
> I would rather see the area and location left blank, or perhaps replaced 
> by "Unknown".  A third alternative would be to replace "Area - Location" 
> by "User-added" or something similar.

One other thing I just noticed: If I cancel the installation while the 
mirror list is displayed, the wrong URL(s) is/are written into 
/etc/setup/setup.rc under "last-mirror".  I don't recall this ever 
happening before, but maybe I just never noticed.

Ken

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-20 22:59     ` Ken Brown
  2017-11-20 23:02       ` Ken Brown
@ 2017-11-22 16:42       ` Brian Inglis
  2017-11-23 21:07         ` Ken Brown
  1 sibling, 1 reply; 15+ messages in thread
From: Brian Inglis @ 2017-11-22 16:42 UTC (permalink / raw)
  To: cygwin-apps

[-- Attachment #1: Type: text/plain, Size: 2625 bytes --]

On 2017-11-20 15:59, Ken Brown wrote:
> On 11/20/2017 4:30 PM, Brian Inglis wrote:
>> On 2017-11-17 06:53, Jon Turney wrote:
>> On 11/17/2017 8:48 AM, Ken Brown wrote:
>>> On 15/11/2017 21:35, Brian Inglis wrote:
> The issue of recognizing a URL that's already in the list is not fixed. Here's
> what I tried:
> 
> I ran setup with the last mirror (as stored in /etc/setup/setup.rc) being
> http://mirrors.kernel.org/sourceware/cygwin/.  The resulting mirror list display
> contained the following, highlighted:
>
>   Org - Kernel - http://mirrors.kernel.org
> 
> So setup didn't recognize that http://mirrors.kernel.org/sourceware/cygwin/ was
> already in the list, displayed as
> 
>   United States - California - http://mirrors.kernel.org

Should have been working so I added more LOG_BABBLE and it looks like setup.rc
is processed at the start as you would expect, but it is merged into empty site
lists, as mirrors.lst is downloaded only after you say you can, and proxy
settings, in a separate download thread: see attached.
And do you know where the cached list comes from - that seems to keep old
mirrors around from testing, but I can't find any cached list with the test
mirrors appearing in the list - I would expect that to come from setup.rc - but
appears to be loaded when mirrors.lst is.
So we need to defer adding the last mirror with some kind of lazy evaluation
hooked after cached mirrors and/or mirrors.lst is available.
If we have a setup.rc, I would expect the cached mirrors site list loaded from
there at startup, before the last mirror is merged (and found!).
Can make those changes but would like ny assumptions questioned, corrected, or
validated!

> Also, I don't like the fake 'area' and 'location' fields for user-added URLs. 
> For example, "Org" and "Kernel" are startling in the display above and add no
> information to the plain URL.  Another example is my own personal repository,
> http://sanibeltranquility.com/cygwin, which displays as
> 
>   Com - Sanibeltranquility - http://sanibeltranquility.com

Removed and defaulted both to " Unknown " - would sort to top as " Unknown  -
Unknown  - ...".

> I would rather see the area and location left blank, or perhaps replaced by
> "Unknown".  A third alternative would be to replace "Area - Location" by
> "User-added" or something similar.

Changed to " Saved " and " Mirror " and " User " and " Added " - sorts to top as
" Saved  -  Mirror  - ..." and " User  -  Added  - ..." - could make them a
single entry, but want to get mirror site list handling working first.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

[-- Attachment #2: setup.log.full.txt --]
[-- Type: text/plain, Size: 803 bytes --]

2017/11/22 01:09:17 Starting cygwin install, version 2.882-4-gdcbcbe
2017/11/22 01:09:17 User has backup/restore rights
Checking site: http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/
Merging site: http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/
2017/11/22 01:09:17 Current Directory: C:/usr/local/cygwin64/var/cache/setup/packages
2017/11/22 01:09:17 Could not open service McShield for query, start and stop. McAfee may not be installed, or we don't have access.
2017/11/22 01:09:19 source: network install
2017/11/22 01:09:20 root: C:\usr\local\cygwin64 system
2017/11/22 01:09:20 Selected local directory: C:/usr/local/cygwin64/var/cache/setup/packages
2017/11/22 01:09:21 net: Direct
Loaded cached mirror list
Fetched URL: http://cygwin.com/mirrors.lst
2017/11/22 01:09:25 Ending cygwin install

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-20 23:02       ` Ken Brown
@ 2017-11-22 16:48         ` Brian Inglis
  0 siblings, 0 replies; 15+ messages in thread
From: Brian Inglis @ 2017-11-22 16:48 UTC (permalink / raw)
  To: cygwin-apps

On 2017-11-20 16:02, Ken Brown wrote:
> On 11/20/2017 5:59 PM, Ken Brown wrote:
>> On 11/20/2017 4:30 PM, Brian Inglis wrote:
>>> On 2017-11-17 06:53, Jon Turney wrote:
>>> On 11/17/2017 8:48 AM, Ken Brown wrote:
>>>> On 15/11/2017 21:35, Brian Inglis wrote:
> One other thing I just noticed: If I cancel the installation while the mirror
> list is displayed, the wrong URL(s) is/are written into /etc/setup/setup.rc
> under "last-mirror".  I don't recall this ever happening before, but maybe I
> just never noticed.

Seems that's updated as soon as you select any list entry.
If you cursor scroll down, I expect each would be set and updated.
If you only ever hit Escape or press Cancel, it's left unchanged.
Never seen "wrong URLs" added - just what I selected!

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-22 16:42       ` Brian Inglis
@ 2017-11-23 21:07         ` Ken Brown
  2017-11-24  6:23           ` Brian Inglis
  0 siblings, 1 reply; 15+ messages in thread
From: Ken Brown @ 2017-11-23 21:07 UTC (permalink / raw)
  To: cygwin-apps

On 11/22/2017 11:42 AM, Brian Inglis wrote:
> On 2017-11-20 15:59, Ken Brown wrote:
>> On 11/20/2017 4:30 PM, Brian Inglis wrote:
>>> On 2017-11-17 06:53, Jon Turney wrote:
>>> On 11/17/2017 8:48 AM, Ken Brown wrote:
>>>> On 15/11/2017 21:35, Brian Inglis wrote:
>> The issue of recognizing a URL that's already in the list is not fixed. Here's
>> what I tried:
>>
>> I ran setup with the last mirror (as stored in /etc/setup/setup.rc) being
>> http://mirrors.kernel.org/sourceware/cygwin/.  The resulting mirror list display
>> contained the following, highlighted:
>>
>>    Org - Kernel - http://mirrors.kernel.org
>>
>> So setup didn't recognize that http://mirrors.kernel.org/sourceware/cygwin/ was
>> already in the list, displayed as
>>
>>    United States - California - http://mirrors.kernel.org
> 
> Should have been working so I added more LOG_BABBLE and it looks like setup.rc
> is processed at the start as you would expect, but it is merged into empty site
> lists, as mirrors.lst is downloaded only after you say you can, and proxy
> settings, in a separate download thread: see attached.
> And do you know where the cached list comes from - that seems to keep old
> mirrors around from testing, but I can't find any cached list with the test
> mirrors appearing in the list - I would expect that to come from setup.rc - but
> appears to be loaded when mirrors.lst is.
> So we need to defer adding the last mirror with some kind of lazy evaluation
> hooked after cached mirrors and/or mirrors.lst is available.
> If we have a setup.rc, I would expect the cached mirrors site list loaded from
> there at startup, before the last mirror is merged (and found!).
> Can make those changes but would like ny assumptions questioned, corrected, or
> validated!

I'm not aware of any place previously used sites could come from other 
than setup.rc.  Are you sure your test mirrors aren't listed there?

The fact that mirrors.lst is merged into all_site_list after the 
last-mirror list shouldn't prevent a site in mirrors.lst from replacing 
one with the same URL.  load_site_list() tries to do exactly that.  I 
think the problem is that 'find (theSites.begin(), theSites.end(), 
newsite)' is using an operator== based on the key instead of the URL. 
Changing that as you suggested in a different message might fix the 
problem, but I haven't thought about whether it could cause problems 
elsewhere.

Ken

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-23 21:07         ` Ken Brown
@ 2017-11-24  6:23           ` Brian Inglis
  2017-11-24 16:21             ` Ken Brown
  0 siblings, 1 reply; 15+ messages in thread
From: Brian Inglis @ 2017-11-24  6:23 UTC (permalink / raw)
  To: cygwin-apps

On 2017-11-23 14:07, Ken Brown wrote:
> On 11/22/2017 11:42 AM, Brian Inglis wrote:
>> On 2017-11-20 15:59, Ken Brown wrote:
>>> On 11/20/2017 4:30 PM, Brian Inglis wrote:
>>>> On 2017-11-17 06:53, Jon Turney wrote:
>>>> On 11/17/2017 8:48 AM, Ken Brown wrote:
>>>>> On 15/11/2017 21:35, Brian Inglis wrote:
>>> The issue of recognizing a URL that's already in the list is not fixed. Here's
>>> what I tried:
>>>
>>> I ran setup with the last mirror (as stored in /etc/setup/setup.rc) being
>>> http://mirrors.kernel.org/sourceware/cygwin/.  The resulting mirror list display
>>> contained the following, highlighted:
>>>
>>>    Org - Kernel - http://mirrors.kernel.org
>>>
>>> So setup didn't recognize that http://mirrors.kernel.org/sourceware/cygwin/ was
>>> already in the list, displayed as
>>>
>>>    United States - California - http://mirrors.kernel.org
>>
>> Should have been working so I added more LOG_BABBLE and it looks like setup.rc
>> is processed at the start as you would expect, but it is merged into empty site
>> lists, as mirrors.lst is downloaded only after you say you can, and proxy
>> settings, in a separate download thread: see attached.
>> And do you know where the cached list comes from - that seems to keep old
>> mirrors around from testing, but I can't find any cached list with the test
>> mirrors appearing in the list - I would expect that to come from setup.rc - but
>> appears to be loaded when mirrors.lst is.
>> So we need to defer adding the last mirror with some kind of lazy evaluation
>> hooked after cached mirrors and/or mirrors.lst is available.
>> If we have a setup.rc, I would expect the cached mirrors site list loaded from
>> there at startup, before the last mirror is merged (and found!).
>> Can make those changes but would like ny assumptions questioned, corrected, or
>> validated!
> 
> I'm not aware of any place previously used sites could come from other than
> setup.rc.  Are you sure your test mirrors aren't listed there?

No - that's why I've been scratching or beating my head about cache location!

> The fact that mirrors.lst is merged into all_site_list after the last-mirror
> list shouldn't prevent a site in mirrors.lst from replacing one with the same
> URL.  load_site_list() tries to do exactly that.  I think the problem is that
> 'find (theSites.begin(), theSites.end(), newsite)' is using an operator== based
> on the key instead of the URL. Changing that as you suggested in a different
> message might fix the problem, but I haven't thought about whether it could
> cause problems elsewhere.

That's why I suggested (string) constructor and/or operator== (string)
overloads, to allow very selective behaviour change, without changing other
uses. I have not used non-C-like OO C++ much, with too many casts required, but
find that approach very natural using JS object prototypes to augment behaviour.

Currently both the cached and downloaded mirror lists are set up after the
download. It may be more obvious to set up the cached list (if there is one,
which will be almost always, except brand new setups) at the start, before
"merging" the last mirror, which currently just adds that entry to an empty
cached list. If there is no cached list, there will be no last mirror to merge.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-24  6:23           ` Brian Inglis
@ 2017-11-24 16:21             ` Ken Brown
  2017-11-24 21:48               ` Brian Inglis
  0 siblings, 1 reply; 15+ messages in thread
From: Ken Brown @ 2017-11-24 16:21 UTC (permalink / raw)
  To: cygwin-apps

On 11/24/2017 1:23 AM, Brian Inglis wrote:
> On 2017-11-23 14:07, Ken Brown wrote:
>> On 11/22/2017 11:42 AM, Brian Inglis wrote:
>>> On 2017-11-20 15:59, Ken Brown wrote:
>>>> On 11/20/2017 4:30 PM, Brian Inglis wrote:
>>>>> On 2017-11-17 06:53, Jon Turney wrote:
>>>>> On 11/17/2017 8:48 AM, Ken Brown wrote:
>>>>>> On 15/11/2017 21:35, Brian Inglis wrote:
>>>> The issue of recognizing a URL that's already in the list is not fixed. Here's
>>>> what I tried:
>>>>
>>>> I ran setup with the last mirror (as stored in /etc/setup/setup.rc) being
>>>> http://mirrors.kernel.org/sourceware/cygwin/.  The resulting mirror list display
>>>> contained the following, highlighted:
>>>>
>>>>     Org - Kernel - http://mirrors.kernel.org
>>>>
>>>> So setup didn't recognize that http://mirrors.kernel.org/sourceware/cygwin/ was
>>>> already in the list, displayed as
>>>>
>>>>     United States - California - http://mirrors.kernel.org
>>>
>>> Should have been working so I added more LOG_BABBLE and it looks like setup.rc
>>> is processed at the start as you would expect, but it is merged into empty site
>>> lists, as mirrors.lst is downloaded only after you say you can, and proxy
>>> settings, in a separate download thread: see attached.
>>> And do you know where the cached list comes from - that seems to keep old
>>> mirrors around from testing, but I can't find any cached list with the test
>>> mirrors appearing in the list - I would expect that to come from setup.rc - but
>>> appears to be loaded when mirrors.lst is.
>>> So we need to defer adding the last mirror with some kind of lazy evaluation
>>> hooked after cached mirrors and/or mirrors.lst is available.
>>> If we have a setup.rc, I would expect the cached mirrors site list loaded from
>>> there at startup, before the last mirror is merged (and found!).
>>> Can make those changes but would like ny assumptions questioned, corrected, or
>>> validated!
>>
>> I'm not aware of any place previously used sites could come from other than
>> setup.rc.  Are you sure your test mirrors aren't listed there?
> 
> No - that's why I've been scratching or beating my head about cache location!
> 
>> The fact that mirrors.lst is merged into all_site_list after the last-mirror
>> list shouldn't prevent a site in mirrors.lst from replacing one with the same
>> URL.  load_site_list() tries to do exactly that.  I think the problem is that
>> 'find (theSites.begin(), theSites.end(), newsite)' is using an operator== based
>> on the key instead of the URL. Changing that as you suggested in a different
>> message might fix the problem, but I haven't thought about whether it could
>> cause problems elsewhere.
> 
> That's why I suggested (string) constructor and/or operator== (string)
> overloads, to allow very selective behaviour change, without changing other
> uses. I have not used non-C-like OO C++ much, with too many casts required, but
> find that approach very natural using JS object prototypes to augment behaviour.
> 
> Currently both the cached and downloaded mirror lists are set up after the
> download. It may be more obvious to set up the cached list (if there is one,
> which will be almost always, except brand new setups) at the start, before
> "merging" the last mirror, which currently just adds that entry to an empty
> cached list. If there is no cached list, there will be no last mirror to merge.

Sorry, but I'm having trouble following what you're saying.  By "cached 
list", are you talking about the variable 'cached_site_list'?  This is 
populated with the mirrors listed under 'mirrors-lst' in setup.rc.  The 
URLs listed under 'last-mirror' don't go into that list.  They go into 
'site_list' and 'all_site_list'.  See site.cc starting around line 85, 
where these variables are defined with explanatory comments.

Ken

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-24 16:21             ` Ken Brown
@ 2017-11-24 21:48               ` Brian Inglis
  2017-11-25 16:49                 ` Ken Brown
  0 siblings, 1 reply; 15+ messages in thread
From: Brian Inglis @ 2017-11-24 21:48 UTC (permalink / raw)
  To: cygwin-apps

On 2017-11-24 09:21, Ken Brown wrote:
> On 11/24/2017 1:23 AM, Brian Inglis wrote:
>> On 2017-11-23 14:07, Ken Brown wrote:
>>> On 11/22/2017 11:42 AM, Brian Inglis wrote:
>>>> On 2017-11-20 15:59, Ken Brown wrote:
>>>>> On 11/20/2017 4:30 PM, Brian Inglis wrote:
>>>>>> On 2017-11-17 06:53, Jon Turney wrote:
>>>>>> On 11/17/2017 8:48 AM, Ken Brown wrote:
>>>>>>> On 15/11/2017 21:35, Brian Inglis wrote:
>>>>> The issue of recognizing a URL that's already in the list is not fixed. Here's
>>>>> what I tried:
>>>>>
>>>>> I ran setup with the last mirror (as stored in /etc/setup/setup.rc) being
>>>>> http://mirrors.kernel.org/sourceware/cygwin/.  The resulting mirror list
>>>>> display
>>>>> contained the following, highlighted:
>>>>>
>>>>>     Org - Kernel - http://mirrors.kernel.org
>>>>>
>>>>> So setup didn't recognize that http://mirrors.kernel.org/sourceware/cygwin/
>>>>> was
>>>>> already in the list, displayed as
>>>>>
>>>>>     United States - California - http://mirrors.kernel.org
>>>>
>>>> Should have been working so I added more LOG_BABBLE and it looks like setup.rc
>>>> is processed at the start as you would expect, but it is merged into empty site
>>>> lists, as mirrors.lst is downloaded only after you say you can, and proxy
>>>> settings, in a separate download thread: see attached.
>>>> And do you know where the cached list comes from - that seems to keep old
>>>> mirrors around from testing, but I can't find any cached list with the test
>>>> mirrors appearing in the list - I would expect that to come from setup.rc - but
>>>> appears to be loaded when mirrors.lst is.
>>>> So we need to defer adding the last mirror with some kind of lazy evaluation
>>>> hooked after cached mirrors and/or mirrors.lst is available.
>>>> If we have a setup.rc, I would expect the cached mirrors site list loaded from
>>>> there at startup, before the last mirror is merged (and found!).
>>>> Can make those changes but would like ny assumptions questioned, corrected, or
>>>> validated!
>>>
>>> I'm not aware of any place previously used sites could come from other than
>>> setup.rc.  Are you sure your test mirrors aren't listed there?
>>
>> No - that's why I've been scratching or beating my head about cache location!
>>
>>> The fact that mirrors.lst is merged into all_site_list after the last-mirror
>>> list shouldn't prevent a site in mirrors.lst from replacing one with the same
>>> URL.  load_site_list() tries to do exactly that.  I think the problem is that
>>> 'find (theSites.begin(), theSites.end(), newsite)' is using an operator== based
>>> on the key instead of the URL. Changing that as you suggested in a different
>>> message might fix the problem, but I haven't thought about whether it could
>>> cause problems elsewhere.

Comparison of only the url is the same as done currently with they which is only
the url, so should have no adverse effect: the other members are used only for
sorting and display, not I/O.

>> That's why I suggested (string) constructor and/or operator== (string)
>> overloads, to allow very selective behaviour change, without changing other
>> uses. I have not used non-C-like OO C++ much, with too many casts required, but
>> find that approach very natural using JS object prototypes to augment behaviour.
>>
>> Currently both the cached and downloaded mirror lists are set up after the
>> download. It may be more obvious to set up the cached list (if there is one,
>> which will be almost always, except brand new setups) at the start, before
>> "merging" the last mirror, which currently just adds that entry to an empty
>> cached list. If there is no cached list, there will be no last mirror to merge.
> 
> Sorry, but I'm having trouble following what you're saying.  By "cached list",
> are you talking about the variable 'cached_site_list'?  This is populated with
> the mirrors listed under 'mirrors-lst' in setup.rc.  The URLs listed under
> 'last-mirror' don't go into that list.  They go into 'site_list' and
> 'all_site_list'.  See site.cc starting around line 85, where these variables are
> defined with explanatory comments.

Sorry last line first word cached should be removed - I was unclear about what
is and what could be done.
I am suggesting we split get_site_list to create the setup.rc cached list at the
start, flag entries false, merge the last-mirror entry into that, checking the
url for a match to use the list entry instead of the last mirror entry, then
after the mirrors.lst download, compare the cached entries with the mirrors list
entries, checking the url for a match to use the current mirrors list entry
instead of the cached entry, which would be flagged true in the cached list, and
proceeding with the rest of get_site_list and then check_dropped_mirrors.

(Also noticed, the site list search and merge operations in registerSavedSite
and load_site_list could be refactored into a separate site_list::merge() member
function, by allowing for an optional exclusion site list.)

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-24 21:48               ` Brian Inglis
@ 2017-11-25 16:49                 ` Ken Brown
  2017-11-25 17:44                   ` Brian Inglis
  0 siblings, 1 reply; 15+ messages in thread
From: Ken Brown @ 2017-11-25 16:49 UTC (permalink / raw)
  To: cygwin-apps

On 11/24/2017 4:47 PM, Brian Inglis wrote:
> I am suggesting we split get_site_list to create the setup.rc cached list at the
> start, flag entries false,

So you're throwing away the information that the entries in the cached 
site list came from mirrors.lst in a previous setup run?

> merge the last-mirror entry into that,

The sites listed under "last-mirror" are those that were selected during 
the last setup run.  They can include user-added URLs as well as mirrors 
from mirrors.lst.  They shouldn't be merged into the cached site list.

> checking the
> url for a match to use the list entry instead of the last mirror entry, then
> after the mirrors.lst download, compare the cached entries with the mirrors list
> entries, checking the url for a match to use the current mirrors list entry
> instead of the cached entry,

I'm not sure what you mean here.  Are you proposing merging the cached 
site list into all_site_list along with the sites from the current 
mirrors.lst?  I don't think that's a good idea.  It would mean 
deliberately showing the user old mirrors that are no longer in 
mirrors.lst.  Currently that only happens if such a mirror came from the 
"last-mirror" list, and it generates a warning when it happens.

> which would be flagged true in the cached list, and
> proceeding with the rest of get_site_list and then check_dropped_mirrors.
> 
> (Also noticed, the site list search and merge operations in registerSavedSite
> and load_site_list could be refactored into a separate site_list::merge() member
> function, by allowing for an optional exclusion site list.)

Ken

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] make setup mirror list more like web page not just urls
  2017-11-25 16:49                 ` Ken Brown
@ 2017-11-25 17:44                   ` Brian Inglis
  0 siblings, 0 replies; 15+ messages in thread
From: Brian Inglis @ 2017-11-25 17:44 UTC (permalink / raw)
  To: cygwin-apps

On 2017-11-25 09:49, Ken Brown wrote:
> On 11/24/2017 4:47 PM, Brian Inglis wrote:
>> I am suggesting we split get_site_list to create the setup.rc cached list at the
>> start, flag entries false,
> 
> So you're throwing away the information that the entries in the cached site list
> came from mirrors.lst in a previous setup run?
> 
>> merge the last-mirror entry into that,
> 
> The sites listed under "last-mirror" are those that were selected during the
> last setup run.  They can include user-added URLs as well as mirrors from
> mirrors.lst.  They shouldn't be merged into the cached site list.
> 
>> checking the
>> url for a match to use the list entry instead of the last mirror entry, then
>> after the mirrors.lst download, compare the cached entries with the mirrors list
>> entries, checking the url for a match to use the current mirrors list entry
>> instead of the cached entry,
> 
> I'm not sure what you mean here.  Are you proposing merging the cached site list
> into all_site_list along with the sites from the current mirrors.lst?  I don't
> think that's a good idea.  It would mean deliberately showing the user old
> mirrors that are no longer in mirrors.lst.  Currently that only happens if such
> a mirror came from the "last-mirror" list, and it generates a warning when it
> happens.

Merge in the sense of going down two lists with two cursors, using the known
current mirrors list to check what's okay in the cached setup.rc list, perhaps
replace the cached entry with the mirrors entry if found, or set the new known
current flag, and if not found, and it's in the last used list, add to the list
to complain about and confirm, as I said: ↓

>> which would be flagged true in the cached list, and
>> proceeding with the rest of get_site_list and then check_dropped_mirrors.
>>
>> (Also noticed, the site list search and merge operations in registerSavedSite
>> and load_site_list could be refactored into a separate site_list::merge() member
>> function, by allowing for an optional exclusion site list.)

My last used mirror is always my local university's public mirror which has
never been in the mirrors list, unadvertised and unmonitored, but local, fast,
convenient, and current (and also has mirrors I use for other distros).

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-11-25 17:44 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-15 21:35 [PATCH] make setup mirror list more like web page not just urls Brian Inglis
2017-11-17 13:48 ` Ken Brown
2017-11-19 20:24   ` Ken Brown
2017-11-17 13:53 ` Jon Turney
2017-11-20 21:30   ` Brian Inglis
2017-11-20 22:59     ` Ken Brown
2017-11-20 23:02       ` Ken Brown
2017-11-22 16:48         ` Brian Inglis
2017-11-22 16:42       ` Brian Inglis
2017-11-23 21:07         ` Ken Brown
2017-11-24  6:23           ` Brian Inglis
2017-11-24 16:21             ` Ken Brown
2017-11-24 21:48               ` Brian Inglis
2017-11-25 16:49                 ` Ken Brown
2017-11-25 17:44                   ` Brian Inglis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).