From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 108994 invoked by alias); 20 Nov 2017 21:30:31 -0000 Mailing-List: contact cygwin-apps-help@cygwin.com; run by ezmlm Precedence: bulk Sender: cygwin-apps-owner@cygwin.com List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Mail-Followup-To: cygwin-apps@cygwin.com Received: (qmail 108892 invoked by uid 89); 20 Nov 2017 21:30:24 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-24.3 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_LAZY_DOMAIN_SECURITY,KB_WAM_FROM_NAME_SINGLEWORD,RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.2 spammy=Canada, posts, cyg, Git X-HELO: smtp-out-no.shaw.ca Received: from smtp-out-no.shaw.ca (HELO smtp-out-no.shaw.ca) (64.59.134.13) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 20 Nov 2017 21:30:20 +0000 Received: from [192.168.1.100] ([24.64.240.204]) by shaw.ca with SMTP id GtdlejSkwRDG7Gtdme1eNs; Mon, 20 Nov 2017 14:30:07 -0700 X-Authority-Analysis: v=2.2 cv=b+PC2pOx c=1 sm=1 tr=0 a=MVEHjbUiAHxQW0jfcDq5EA==:117 a=MVEHjbUiAHxQW0jfcDq5EA==:17 a=r77TgQKjGQsHNAKrUKIA:9 a=1opBbpVxAAAA:8 a=w_pzkKWiAAAA:8 a=miUDL3Gluw6XiOjQS0IA:9 a=pILNOxqGKmIA:10 a=r_ptfxz7CoaZzq2Z1U4A:9 a=mWb7svc9HhpswGHa:21 a=QVQFOBihcYrwVO6n:21 a=QEXdDO2ut3YA:10 a=CdiWusdWvyIA:10 a=yVTSMUFZvGLFwOAdSMmD:22 a=sRI3_1zDfAgwuvI8zelB:22 Reply-To: Brian.Inglis@SystematicSw.ab.ca Subject: Re: [PATCH] make setup mirror list more like web page not just urls References: <09c0dc5b-f975-42a9-a204-b02edef6a3ad@SystematicSw.ab.ca> To: cygwin-apps@cygwin.com From: Brian Inglis Message-ID: <53a8001e-10ae-6318-ba3d-31d0cfa18f14@SystematicSw.ab.ca> Date: Mon, 20 Nov 2017 21:30:00 -0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------697424DD7D0078531B617290" X-CMAE-Envelope: MS4wfP3a6Ian/F6NbJgXvsqXVtUPKJaoNKwQaTgqOkJYh+5yMw57Fiw1vCXfKPGcd6V0qSYve6LSWOSj6hjGOn0eRDjPSpMpMHWJOcuGtaAJr1oztx77Q+xU kcTSs57PBpOm03zWhgfVH7rSPYwLCLl4tW7tOCbU7lWFZ8b73YHSwWHU378Ii73FHxRUSBT0LTOKYA== X-IsSubscribed: yes X-SW-Source: 2017-11/txt/msg00089.txt.bz2 This is a multi-part message in MIME format. --------------697424DD7D0078531B617290 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Content-length: 6322 On 2017-11-17 06:53, Jon Turney wrote: On 11/17/2017 8:48 AM, Ken Brown wrote: > On 15/11/2017 21:35, Brian Inglis wrote: Thanks for your comments, I understand all your points, and agree with them. I merged the posts and responses as they addressed the same issues. I also applied Ken's setup patch and rejigged mine where required. KB> Ken Brown JT> Jon Turney KB> I think this is a good idea in principle, but it still needs some work. JT> use a grid control to present this information. Don't think I want to learn that much about Windows controls! The width of the longest fields in each column would need to adjust the grid and dialogue box widths, and Add button position, dynamically, which I did manually, and the internal blank space may be a bit harder to scan for users who are interested only in their geographic region, country, or territory - these are pretty well aligned as strings. KB> Adding the protocol to the key instead of the full url isn't enough to KB> guarantee uniqueness of the key. KB> Users can add their own sites for which that fails. KB> Also, you shouldn't be relying on the servername field being filled in. KB> For example, the function SiteSetting::registerSavedSite() explicitly KB> constructs a site 'tempSite' that has an empty servername. JT> This code needs to handle manually added URLs with non-FQDNs. JT> - - url...'s going to need fixing. If host name provided, use as servername, or default to localhost. If at least two domain components: if CC TLD, use as area in uppercase, or if gTLD, use as area with init cap; if 2-3 char TLD and 2-3 char 2nd level domain, may be subTLD, so if 3rd level domain exists and does not start with (or contain?) "cygwin", "ftp", "mirror", "www", add space plus 2-3 char 2nd level domain with init cap to area. Use next level domain as location with init cap and uppercase after any "-". Otherwise default area to Local, and use servername as location with init cap. This adhoc approach seems to work well enough and avoids having to drag in and process the rules in the Public Suffix List (libpsl5 and publicsuffix-list-dafsa). Examples: Input -> List Displayed Key http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com -> http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/;mirror.cpsc.ucalgary.ca;CA;Ucalgary CA - Ucalgary - http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com CA Ucalgary ca ucalgary cpsc mirror http://mirror.cpsc.ucalgary.ca/mirror/cygwin.com/ [cyg]file://server/c$/usr/local/cygwin/var/cache/setup/packages -> [cyg]file://server/c$/usr/local/cygwin/var/cache/setup/packages/;server;Local;Server Local - Server - file://server/c$/usr/local/cygwin/var/cache/setup/packages Local Server server file://server/c$/usr/local/cygwin/var/cache/setup/packages/ cygfile:///var/cache/setup/packages -> cygfile:///var/cache/setup/packages/;localhost;Local;Localhost Local - Localhost - cygfile://var/cache/setup/packages Local Localhost localhost cygfile://var/cache/setup/packages/ file:///c|/usr/local/cygwin/var/cache/setup/packages -> file:///c|/usr/local/cygwin/var/cache/setup/packages/;localhost;Local;Localhost Local - Localhost - file:///c|/usr/local/cygwin/var/cache/setup/packages Local Localhost localhost file:///c|/usr/local/cygwin/var/cache/setup/packages/ Cygwin file:// does not appear to use /// or support anything but local files, so should always be treated as localhost; or does it support servers, but they can be treated as if directories under Windows, so we need to do a server or directory check, or that check is done at a lower level? JT> you can also just add a path. If UNC path with server, use as above, but include share name in display, otherwise default as above. Examples: \\server\c$\usr\local\cygwin\var\cache\setup\packages -> \\server\c$\usr\local\cygwin\var\cache\setup\packages/;server;Local;Server Local - Server - \\server\c$ Local Server server \\server\c$\usr\local\cygwin\var\cache\setup\packages/ c:\usr\local\cygwin\var\cache\setup\packages -> c:\usr\local\cygwin\var\cache\setup\packages/;localhost;Local;Localhost Local - Localhost - c:\usr\local\cygwin\var\cache\setup\packages Local Localhost localhost c:\usr\local\cygwin\var\cache\setup\packages/ Where are the cached sites stashed - setup keeps showing my old test entries? JT> if you add two URLs with the same protocol/host, they aren't distinguishable Check for and eliminate duplicates after sort if same site_list fields. Handled by KB patch. Note on flag added by that patch - would is_current, is_good, or is_known be a better reflection of what that flag means? Should we add ==(string) to compare only the url member? Should we add a constructor(string,is_official) to refactor string to site conversions, and the url additions? KB> If the saved url corresponds to a site in the list, I think setup should KB> recognize this rather than listing the same site again as " - - url". JT> If the URL matches one in the list, that item should be selected. Check for URL match with site list and use match as site: added for tempSite in RegisterSavedSites and newsite on Add; required as key now includes area and location, not just URL. Could we add a check to a string contructor to return only a reference to an existing entry, if the url matches, to refactor the above changes? JT> Store area and location so we can populate those fields if it doesn't. See above. JT> Please, please use git-format-patch in future. I can't tell from looking, does it differ from diff output, and is there a reason to use git-format-patch for review? I use diffs for feedback until it DTRT. I can manage one commit and one format-patch on a good day. I should just stick to using what Tortoise Git supports. My record with git includes a lot of rm -rf and re-clone, as the files, trees, and indices get out of whack when I forget some incantation and try to fix it! Very brittle and unfriendly - if it's going to bork itself, just don't do that! I never had a single problem with CVS or Hg in many years ;^> JT> The patch also has some spurious changes to the mode of the manifest files. Realized I shouldn't tidy up unnecessary x modes under git control - habit - gets me into trouble and I learn new things - checked out masters. -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada --------------697424DD7D0078531B617290 Content-Type: text/plain; charset=UTF-8; name="0001-setup-mirror-site-list-default-members-display-and-order-by-area-and-location.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename*0="0001-setup-mirror-site-list-default-members-display-and-orde"; filename*1="r-by-area-and-location.patch" Content-length: 8935 diff --git a/res.rc b/res.rc index 80d1bf1..fa90a65 100644 --- a/res.rc +++ b/res.rc @@ -135,10 +135,10 @@ CAPTION "Cygwin Setup - Choose Download Site(s)" FONT 8, "MS Shell Dlg" BEGIN ICON IDI_CYGWIN,IDC_HEADICON,SETUP_HEADICON_X,0,21,20 - LISTBOX IDC_URL_LIST,66,45,185,110,LBS_NOINTEGRALHEIGHT | + LISTBOX IDC_URL_LIST,66,45,216,110,LBS_NOINTEGRALHEIGHT | LBS_EXTENDEDSEL | WS_VSCROLL | WS_HSCROLL | WS_GROUP | WS_TABSTOP - LTEXT "Available Download Sites:",IDC_STATIC,66,34,183,8,NOT + LTEXT "Available Download Sites:",IDC_STATIC,66,34,216,8,NOT WS_GROUP CONTROL "",IDC_HEADSEPARATOR,"Static",SS_BLACKFRAME | SS_SUNKEN,0,28, SETUP_STANDARD_DIALOG_W,1 @@ -146,10 +146,10 @@ BEGIN IDC_STATIC,21,9,239,16,NOT WS_GROUP LTEXT "Choose A Download Site",IDC_STATIC_HEADER_TITLE,7,0,258, 8,NOT WS_GROUP - EDITTEXT IDC_EDIT_USER_URL,65,160,185,14,ES_AUTOHSCROLL | + EDITTEXT IDC_EDIT_USER_URL,65,160,216,14,ES_AUTOHSCROLL | WS_GROUP LTEXT "User URL:",IDC_SITE_USERURL,15,162,45,8,NOT WS_GROUP - PUSHBUTTON "Add",IDC_BUTTON_ADD_URL,255,160,50,14 + PUSHBUTTON "Add",IDC_BUTTON_ADD_URL,288,160,50,14 END IDD_NET DIALOG DISCARDABLE 0, 0, SETUP_STANDARD_DIALOG_DIMS diff --git a/site.cc b/site.cc index c9fac12..a2f9406 100644 --- a/site.cc +++ b/site.cc @@ -18,6 +18,7 @@ #include #include +#include #include "site.h" #include "win32.h" @@ -141,6 +142,130 @@ SiteSetting::~SiteSetting () save (); } +static bool +common (const string &domain) +{ + const string prefixes[] = { "cygwin", "ftp", "mirror", "www" }; + + for (auto& s: prefixes) + { + if (s.length() <= domain.length() && + mismatch(s.begin(), s.end(), domain.begin()).first == s.end()) + return true; + } + + return false; +} + +static string +area_default (const string &servername) +{ + string::size_type last_idx = servername.length () - 1; + string::size_type idx = servername.find_last_of (".", last_idx); + + /* no dots => local server */ + if (idx == string::npos) + { + return "Local"; + } + + string area = servername.substr (idx + 1, last_idx - idx); + /* gTLD - initial cap. */ + area[0] = toupper (area[0]); + + /* length 2 TLD => CC TLD - uppercase. */ + if (last_idx - idx == 2) + { + area[1] = toupper (area[1]); + } + + /* length 2-3 TLD => may have subTLD - check. */ + if (last_idx - idx <= 3) + { + last_idx = idx - 1; + idx = servername.find_last_of (".", last_idx); + + /* domain below length 2-3 subTLD? */ + if (idx != string::npos && last_idx - idx <= 3) + { + string::size_type next_idx = servername.find_last_of (".", idx - 1); + + /* no more dots - begins at start */ + if (next_idx == string::npos) + { + next_idx = -1; + } + + /* check for common domain prefixes to ignore as location */ + if (!common (servername.substr (next_idx + 1, idx - next_idx - 1))) + { + /* not common domain - add subTLD to area with initial cap. */ + area += " " + toupper (servername[idx + 1]) + + servername.substr (idx + 2, last_idx - idx - 1); + } + } + } + + return area; +} + +static string +location_default (const string &servername) +{ + string::size_type last_idx = servername.length () - 1; + string::size_type idx = servername.find_last_of (".", last_idx); + string location = servername; + /* initial cap. */ + location[0] = toupper (location[0]); + + /* no dots => local server */ + if (idx == string::npos) + { + return location; + } + + /* length 2-3 TLD => may have subTLD - check. */ + if (last_idx - idx <= 3) + { + last_idx = idx - 1; + idx = servername.find_last_of (".", last_idx); + + /* domain below length 2-3 subTLD? */ + if (idx != string::npos && last_idx - idx <= 3) + { + string::size_type next_idx = servername.find_last_of (".", idx - 1); + + /* no more dots - begins at start */ + if (next_idx == string::npos) + { + next_idx = -1; + } + + /* check for common domain prefixes to ignore as location */ + if (!common (servername.substr (next_idx + 1, idx - next_idx - 1))) + { + /* not common domain - use that as location. */ + last_idx = idx - 1; + idx = next_idx; + } + } + } + + location = servername.substr (idx + 1, last_idx - idx); + + /* initial cap */ + location[0] = toupper (location[0]); + + idx = 0; + while ((idx = location.find( '-', idx)) != string::npos) + { + ++idx; + location[idx] = toupper (location[idx]); + } + + return location; +} + site_list_type::site_list_type (const string &_url, const string &_servername, const string &_area, @@ -157,30 +282,72 @@ site_list_type::site_list_type (const string &_url, if (url.at(url.length()-1) != '/') url.append("/"); - /* displayed_url is protocol and site name part of url */ - string::size_type path_offset = url.find ("/", url.find ("//") + 2); - displayed_url = url.substr(0, path_offset); + /* get protocol, host flag, and site name parts of url */ + string::size_type host_flag_posn = url.find ("//"); + string::size_type path_offset = (host_flag_posn == string::npos) + ? string::npos + : url.find ("/", host_flag_posn + 2); + /* not url-like, check for Windows UNC paths */ + if (host_flag_posn == string::npos) + { + if ('\\' == url[0] && '\\' == url[1]) + { + /* UNC path - get server */ + host_flag_posn = 0; + /* terminal '/' ensures result */ + path_offset = url.find_first_of ("\\/", host_flag_posn + 2); + } + } + /* default servername to url host or localhost */ + if (servername.length() == 0) { + servername = (path_offset != string::npos && + host_flag_posn + 2 < path_offset) + ? url.substr (host_flag_posn + 2, + path_offset - (host_flag_posn + 2)) + : "localhost"; + } + /* default area to TLD and maybe short 2nd level domain */ + if (area.length() == 0) { + area = area_default (servername); + } + /* default location to next level domain */ + if (location.length() == 0) { + location = location_default (servername); + } + + /* the sorting key is area, location, domain name components in reverse order + * (to group by country code) plus url (to ensure uniqueness) */ + key = area + " " + location + " "; + string::size_type last_idx = path_offset - 1; + string::size_type idx = url.find_last_of ("./", last_idx); - /* the sorting key is hostname components in reverse order (to sort by country code) - plus the url (to ensure uniqueness) */ - key = string(); - string::size_type last_idx = displayed_url.length () - 1; - string::size_type idx = url.find_last_of("./", last_idx); - if (last_idx - idx == 3) + if (3 <= last_idx - idx) { - /* Sort non-country TLDs (.com, .net, ...) together. */ + /* Sort gTLDs (.com, .net, .info, ...) together. */ key += " "; } - do + + while (idx < last_idx) { - key += url.substr(idx + 1, last_idx - idx); + key += url.substr (idx + 1, last_idx - idx); key += " "; last_idx = idx - 1; - idx = url.find_last_of("./", last_idx); - if (idx == string::npos) - idx = 0; - } while (idx > 0); + idx = url.find_last_of ("./\\", last_idx); + } + key += url; + + /* if double (back)slashes at start of url */ + if (host_flag_posn == 0) + { + /* check for next path delimiter */ + string::size_type share_end = url.find_first_of ("\\/", path_offset + 1); + /* include UNC server share name */ + if (share_end != string::npos) + path_offset = share_end; + } + /* displayed_url is area, location, protocol and site name part of url */ + displayed_url = area + " - " + location + " - " + url.substr (0, path_offset); } site_list_type::site_list_type (site_list_type const &rhs) @@ -379,6 +546,15 @@ void SiteSetting::registerSavedSite (const char * site) { site_list_type tempSite(site, "", "", "", false); + + /* use site in list if matching url */ + for (auto& s: all_site_list) + if (site == s.url) + { + tempSite = s; + break; + } + SiteList::iterator i = find (all_site_list.begin(), all_site_list.end(), tempSite); if (i == all_site_list.end()) @@ -699,6 +875,15 @@ bool SitePage::OnMessageCmd (int id, HWND hwndctl, UINT code) if (other_url.size()) { site_list_type newsite (other_url, "", "", "", false); + + /* use site in list if matching url */ + for (auto& s: all_site_list) + if (other_url == s.url) + { + newsite = s; + break; + } + SiteList::iterator i = find (all_site_list.begin(), all_site_list.end(), newsite); if (i == all_site_list.end()) --------------697424DD7D0078531B617290--