public inbox for cygwin-apps-cvs@sourceware.org
From: Jon TURNEY <jturney@sourceware.org>
To: cygwin-apps-cvs@sourceware.org
Subject: [calm - Cygwin server-side packaging maintenance script] branch master, updated. 20200401-6-g5db0ed1
Date: Sun, 12 Apr 2020 18:03:06 +0000 (GMT)
Message-ID: <20200412180306.86E0A385BF83@sourceware.org>

https://sourceware.org/git/gitweb.cgi?p=cygwin-apps/calm.git;h=5db0ed1406e2b0b0d24ea9bfe131b251e3254850

commit 5db0ed1406e2b0b0d24ea9bfe131b251e3254850
Author: Jon Turney <jon.turney@dronecode.org.uk>
Date:   Sun Apr 12 14:44:11 2020 +0100

    Check for redirects when fixing homepage: in src.hint
    
    Check for redirects when fixing homepage: in src.hint, and fix http: to
    https: redirects


Diff:
---
 calm/fixes.py | 84 +++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 58 insertions(+), 26 deletions(-)

diff --git a/calm/fixes.py b/calm/fixes.py
index be11b57..b07dccc 100644
--- a/calm/fixes.py
+++ b/calm/fixes.py
@@ -21,11 +21,14 @@
 # THE SOFTWARE.
 #
 
+import functools
 import logging
 import os
 import re
 import shutil
 import tarfile
+import urllib.request
+import urllib.error
 
 from . import hint
 
@@ -58,6 +61,25 @@ def read_cygport(dirpath, tf):
     return content
 
 
+class NoRedirection(urllib.request.HTTPErrorProcessor):
+    def http_response(self, request, response):
+        return response
+
+    https_response = http_response
+
+
+@functools.lru_cache(maxsize=None)
+def follow_redirect(homepage):
+    opener = urllib.request.build_opener(NoRedirection)
+    try:
+        response = opener.open(homepage)
+        if response.code == 301:
+            return response.headers['Location']
+    except (ConnectionResetError, ValueError, urllib.error.URLError) as e:
+        logging.warning('error %s checking homepage:%s' % (e, homepage))
+    return homepage
+
+
 def fix_homepage_src_hint(dirpath, hf, tf):
     pn = os.path.basename(dirpath)
     hintfile = os.path.join(dirpath, hf)
@@ -70,33 +92,43 @@ def fix_homepage_src_hint(dirpath, hf, tf):
 
     # already present?
     if 'homepage' in hints:
-        return
+        homepage = hints['homepage']
+    else:
+        # crack open corresponding -src.tar and parse homepage out from .cygport
+        logging.debug('examining %s' % tf)
+        content = read_cygport(dirpath, tf)
 
-    # crack open corresponding -src.tar and parse homepage out from .cygport
-    logging.debug('examining %s' % tf)
-    content = read_cygport(dirpath, tf)
-
-    homepage = None
-    if content:
-        for l in content.splitlines():
-            match = re.match(r'^\s*HOMEPAGE\s*=\s*("|)([^"].*)\1', l)
-            if match:
-                if homepage:
-                    logging.warning('multiple HOMEPAGE lines in .cygport in srcpkg %s', tf)
-                homepage = match.group(2)
-                homepage = re.sub(r'\$({|)(PN|ORIG_PN|NAME)(}|)', pn, homepage)
-
-    if homepage and '$' in homepage:
-        logging.warning('unknown shell parameter expansions in HOMEPAGE="%s" in .cygport in srcpkg %s' % (homepage, tf))
         homepage = None
-
-    if not homepage:
-        logging.info('cannot determine homepage: from srcpkg %s' % tf)
-        return
-
-    logging.info('adding homepage:%s to hints for srcpkg %s' % (homepage, tf))
+        if content:
+            for l in content.splitlines():
+                match = re.match(r'^\s*HOMEPAGE\s*=\s*("|)([^"].*)\1', l)
+                if match:
+                    if homepage:
+                        logging.warning('multiple HOMEPAGE lines in .cygport in srcpkg %s', tf)
+                    homepage = match.group(2)
+                    homepage = re.sub(r'\$({|)(PN|ORIG_PN|NAME)(}|)', pn, homepage)
+
+        if homepage and '$' in homepage:
+            logging.warning('unknown shell parameter expansions in HOMEPAGE="%s" in .cygport in srcpkg %s' % (homepage, tf))
+            homepage = None
+
+        if not homepage:
+            logging.info('cannot determine homepage: from srcpkg %s' % tf)
+            return
+
+        logging.info('adding homepage:%s to hints for srcpkg %s' % (homepage, tf))
+
+    # check for http -> https redirects
+    redirect_homepage = follow_redirect(homepage)
+    if redirect_homepage != homepage:
+        if redirect_homepage == homepage.replace('http://', 'https://'):
+            logging.warning('homepage:%s permanently redirects to %s, fixing' % (homepage, redirect_homepage))
+            homepage = redirect_homepage
+        else:
+            logging.warning('homepage:%s permanently redirects to %s' % (homepage, redirect_homepage))
 
     # write updated hints
-    hints['homepage'] = homepage
-    shutil.copy2(hintfile, hintfile + '.bak')
-    hint.hint_file_write(hintfile, hints)
+    if homepage != hints.get('homepage', None):
+        hints['homepage'] = homepage
+        shutil.copy2(hintfile, hintfile + '.bak')
+        hint.hint_file_write(hintfile, hints)
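
For readers who want to try the redirect check outside calm, here is a minimal standalone sketch of the same idea: open the URL with an opener that does not follow redirects, and report the Location of a 301 reply. It is not the code from the commit. The names probe_permanent_redirect, apply_redirect_policy and start_demo_server are hypothetical, as are the timeout argument and the fallback to the original URL when a 301 carries no Location header.

```python
import functools
import http.server
import logging
import threading
import urllib.request


class NoRedirection(urllib.request.HTTPErrorProcessor):
    # Return every response untouched, so 3xx replies are neither
    # followed nor turned into exceptions.
    def http_response(self, request, response):
        return response

    https_response = http_response


@functools.lru_cache(maxsize=None)
def probe_permanent_redirect(url, timeout=10):
    """Return the Location of a 301 reply, or url itself otherwise."""
    opener = urllib.request.build_opener(NoRedirection)
    try:
        with opener.open(url, timeout=timeout) as response:
            if response.status == 301:
                # Assumption: a 301 without Location counts as no redirect.
                return response.headers.get('Location') or url
    except (OSError, ValueError) as e:
        # URLError and ConnectionResetError are both OSError subclasses.
        logging.warning('error %s checking homepage:%s', e, url)
    return url


def apply_redirect_policy(homepage, redirect):
    """Adopt the target only when it is a pure http:// to https:// upgrade."""
    if redirect == homepage.replace('http://', 'https://'):
        return redirect
    return homepage


def start_demo_server(location):
    """Hypothetical helper: local server whose /old path answers 301."""
    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(301 if self.path == '/old' else 200)
            if self.path == '/old':
                self.send_header('Location', location)
            self.send_header('Content-Length', '0')
            self.end_headers()

        def log_message(self, *args):
            pass  # keep the demo quiet

    server = http.server.HTTPServer(('127.0.0.1', 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, 'http://127.0.0.1:%d' % server.server_address[1]
```

The demo server only exists so the probe can be exercised without touching the network. Call start_demo_server() with a target URL, probe the returned base URL plus /old, then shut the server down. Other redirects are left for a human to review, which matches the intent stated in the commit message of fixing only http: to https: redirects.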

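The HOMEPAGE extraction that this commit moves into the else: branch can also be tried in isolation. The sketch below reuses the two regular expressions from the diff as-is. The function name homepage_from_cygport and the sample .cygport text are assumptions made for illustration, and the logging done by the real code is omitted.

```python
import re

# Both patterns are copied from the diff above.
HOMEPAGE_RE = re.compile(r'^\s*HOMEPAGE\s*=\s*("|)([^"].*)\1')
PN_VAR_RE = re.compile(r'\$({|)(PN|ORIG_PN|NAME)(}|)')


def homepage_from_cygport(content, pn):
    """Return the HOMEPAGE value from .cygport text, or None.

    pn is the package name substituted for $PN, $ORIG_PN and $NAME.
    If several HOMEPAGE lines are present, the last one wins.
    """
    homepage = None
    for line in content.splitlines():
        match = HOMEPAGE_RE.match(line)
        if match:
            homepage = PN_VAR_RE.sub(pn, match.group(2))

    # Any '$' left over is a shell expansion we cannot resolve, so give up.
    if homepage and '$' in homepage:
        return None
    return homepage


# Made-up .cygport fragment, for illustration only.
sample = 'NAME="foo"\nVERSION=1.0\nHOMEPAGE="https://example.org/${PN}/"\n'
resolved = homepage_from_cygport(sample, 'foo')
```

With the sample above, resolved is https://example.org/foo/. A value such as HOMEPAGE="https://example.org/${PV}" yields None, because ${PV} is not one of the three variables the substitution knows about.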

