* [PATCH cygport] Add check of SPDX expression provided by LICENSE variable
@ 2024-04-30 17:45 Christian Franke
2024-04-30 18:10 ` Brian Inglis
0 siblings, 1 reply; 5+ messages in thread
From: Christian Franke @ 2024-04-30 17:45 UTC (permalink / raw)
To: cygwin-apps
[-- Attachment #1: Type: text/plain, Size: 829 bytes --]
Jon Turney via Cygwin-apps wrote (thread "[PATCH cygport] Add
repro-finish command"):
> ...
>> PS: I have a local script which checks SPDX Identifiers and
>> expressions. Any interest to add this to cygport and then check
>> LICENSE settings?
>
> Oh, yes please. That sounds like a good idea.
>
Attached.
The new script uses the SPDX webpages to create the license file. I
didn't find a usable single license list at https://github.com/spdx
The data/spdx-licenses file is not included in the patch. It could be
generated from the source dir with:
$ tools/spdx-check -f data/spdx-licenses -m
...
data/spdx-licenses: created
$ sha1sum data/spdx-licenses
80a19d6891d08bf34113464464ee12308374c792 *data/spdx-licenses
The changes to the meson files are guessed. I didn't test the meson
build yet.
--
Regards,
Christian
[-- Attachment #2: 0001-Add-check-of-SPDX-expression-provided-by-LICENSE-var.patch --]
[-- Type: text/plain, Size: 7345 bytes --]
From 61f75757fa8e9118207cc09cf4a621aac8a4da78 Mon Sep 17 00:00:00 2001
From: Christian Franke <christian.franke@t-online.de>
Date: Tue, 30 Apr 2024 19:28:01 +0200
Subject: [PATCH] Add check of SPDX expression provided by LICENSE variable
The new script 'tools/spdx-check' checks an SPDX license expression.
License identifiers are provided by the new file 'spdx-licenses'
which could be created by the script from the related SPDX webpages.
---
bin/cygport.in | 17 ++++
data/meson.build | 1 +
tools/meson.build | 1 +
tools/spdx-check | 198 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 217 insertions(+)
create mode 100644 tools/spdx-check
diff --git a/bin/cygport.in b/bin/cygport.in
index 15bd559e..3166beba 100755
--- a/bin/cygport.in
+++ b/bin/cygport.in
@@ -41,6 +41,7 @@ declare -r _cygport_version=@VERSION@;
declare -r _privdatadir=@pkgdatadir@;
declare -r _privclassdir=@cygclassdir@;
declare -r _privlibdir=@cygpartdir@;
+declare -r _privtoolsdir=@pkgdatadir@/tools;
declare -r _privgnuconfigdir=@gnuconfigdir@;
declare -r _privsysconfdir=@sysconfdir@;
@@ -489,6 +490,22 @@ do
fi
done
+if [ "${LICENSE+y}" = "y" ]
+then
+ if ! _out=$("${_privtoolsdir}/spdx-check" -f "${_privdatadir}/spdx-licenses" "${LICENSE}" 2>&1)
+ then
+ warning "LICENSE='${LICENSE}' is invalid:"
+ echo "${_out}"
+ elif [ "${_out:+y}" = "y" ]
+ then
+ warning "LICENSE='${LICENSE}' has warnings:"
+ echo "${_out}"
+ else
+ inform "LICENSE='${LICENSE}' is valid"
+ fi
+ unset _out
+fi
+
for restrict in ${RESTRICT//,/ }
do
declare _CYGPORT_RESTRICT_${restrict//-/_}_=1
diff --git a/data/meson.build b/data/meson.build
index 51c6a5fd..e83a90fe 100644
--- a/data/meson.build
+++ b/data/meson.build
@@ -2,6 +2,7 @@ datadocs = files('cygport.conf', 'mirrors')
install_data('mirrors',
'sample.cygport',
+ 'spdx-licenses',
install_dir: pkgdatadir)
install_data('gnuconfig/config.guess',
diff --git a/tools/meson.build b/tools/meson.build
index acd83926..96d8d19e 100644
--- a/tools/meson.build
+++ b/tools/meson.build
@@ -1,6 +1,7 @@
tools = files(
'deb2targz',
'pkgrip',
+ 'spdx-check',
'sysrootize'
)
diff --git a/tools/spdx-check b/tools/spdx-check
new file mode 100644
index 00000000..bffcaae0
--- /dev/null
+++ b/tools/spdx-check
@@ -0,0 +1,198 @@
+#! /bin/bash
+###############################################################################
+#
+# spdx-check - check SPDX license expression
+#
+# Copyright (C) 2024 Christian Franke
+#
+# SPDX-License-Identifier: BSD-3-Clause
+#
+################################################################################
+
+set -e -o pipefail
+myname=$0
+
+# SPDX license list web pages
+spdx_url_lic="https://spdx.org/licenses/index.html"
+spdx_url_exc="https://spdx.org/licenses/exceptions-index.html"
+
+# Default license file
+def_spdx_file="$(dirname "$myname")/spdx-licenses"
+
+usage()
+{
+ cat <<EOF
+Check SPDX license expression.
+
+Usage: $myname [-f FILE] [-mu] 'SPDX_EXPR'
+ $myname [-f FILE] -mu
+
+ -f read license identifiers from FILE
+ [default: $def_spdx_file]
+ -m create missing license file from SPDX webpages
+ -u always update the license file
+
+ SPDX_EXPR check this SPDX license expression
+EOF
+ exit 1
+}
+
+error()
+{
+ echo "Error:" "$@" >&2
+ exit 1
+}
+
+warning()
+{
+ echo "Warning:" "$@" >&2
+}
+
+check_spdx_id()
+{
+ local id=$1
+ local m m_id
+
+ if ! [ -f "$spdx_file" ]; then
+ warning "Missing '$spdx_file' - SPDX identifier '$1' not checked"
+ return 0
+ fi
+
+ # SPDX identifiers are case insensitive but the correct case is recommended
+ m=$(grep -Ei -m 1 "^!?&?${id//+/\\+}\$" "$spdx_file" 2>/dev/null) \
+ || error "Unknown SPDX identifier '$id'"
+
+ # TODO: Distinguish licenses and exceptions
+ m_id=${m#!}; m_id=${m_id#&}
+
+ [ "$m_id" = "$id" ] || warning "It is recommended to use '$m_id' instead of '$id'"
+ [ "$m" = "${m#!}" ] || warning "SPDX identifier '$m_id' is deprecated"
+}
+
+check_spdx_expr()
+{
+ local x=$1
+ local f s t
+
+ # Insert spaces around tokens to simplify parsing
+ x=" $x "; x=${x//(/ ( }; x=${x//)/ ) }
+
+ # Check tokens
+ f=false
+ for t in $x; do
+ f=true
+ case $t in
+ AND|OR|WITH|[\(\)])
+ ;;
+ [Aa][Nn][Dd]|[Oo][Rr]|[Ww][Ii][Tt][Hh])
+ error "Invalid token '$t' - use '${t@U}' instead" ;;
+ [0-9A-Za-z]*)
+ s=${t%+}; s=${s//[-.0-9A-Za-z]/}
+ [ -z "$s" ] || error "Invalid character(s) '$s' in '$t'" ;;
+ *)
+ error "Invalid token '$t'" ;;
+ esac
+ done
+ $f || error "Expression is empty"
+
+ # Check expression syntax heuristically using these replacements:
+ # - all operators -> '%'
+ # - all operands -> '@'
+ # - remove all spaces
+ # - '(@...%@)' -> '@' -- in a 'loop' to restart from the beginning
+ # - '@...%@' -> '' -- syntax error if nonempty
+ s=$(
+ sed -E \
+ -e 's/ (AND|OR|WITH) / % /g' \
+ -e 's/ [^ %()@]+ / @ /g' \
+ -e 's/ //g' \
+ -e ':loop' \
+ -e 's/\(@(%@)*\)/@/g' \
+ -e 't loop' \
+ -e 's/^@(%@)*$//' \
+ <<<"$x"
+ )
+ [ -z "$s" ] || error "Invalid syntax of SPDX expression"
+
+ # Check license identifiers
+ for t in $x; do
+ case $t in
+ AND|OR|WITH|[\(\)]) ;;
+ *) check_spdx_id "$t"
+ esac
+ done
+}
+
+# Extract identifiers from SPDX webpage and prepend flags
+html2id()
+{
+ sed -n \
+ -e '1,/<h2[^>]*>Deprecated/s/^.*<code property="spdx:licenseId">\([^>]*\)<.*$/\1/p' \
+ -e '/<h2[^>]*>Deprecated/,$s/^.*<code property="spdx:licenseId">\([^>]*\)<.*$/!\1/p' \
+ -e '1,/<h2[^>]*>Deprecated/s/^.*<code property="spdx:licenseExceptionId">\([^>]*\)<.*$/\&\1/p' \
+ -e '/<h2[^>]*>Deprecated/,$s/^.*<code property="spdx:licenseExceptionId">\([^>]*\)<.*$/!\&\1/p'
+}
+
+get_spdx_file()
+{
+ local f="$spdx_file.new"
+ local n1 n2
+
+ cat <<EOF >"$f"
+# List of SPDX identifiers
+#
+# Created by '${myname##*/}' from these web pages:
+# $spdx_url_lic
+# $spdx_url_exc
+#
+# Flags: '!' - deprecated, '&' - license exception
+#
+EOF
+
+ # Download and extract identifiers
+ wget -O - "$spdx_url_lic" | html2id >>"$f"
+ n1=$(wc -l <"$f")
+ echo "#" >>"$f"
+ wget -O - "$spdx_url_exc" | html2id >>"$f"
+ n2=$(wc -l <"$f")
+
+ # Check length to detect download problems
+ { [[ n1 -gt 500 ]] && [[ $((n2-n1)) -gt 50 ]]; } || error "$f: File too small"
+
+ if [ -f "$spdx_file" ]; then
+ # Keep old file if unchanged otherwise keep it as backup
+ if cmp -s "$spdx_file" "$f"; then
+ echo "$spdx_file: unchanged"
+ rm -f "$f"
+ else
+ mv -bf "$f" "$spdx_file"
+ echo "$spdx_file: updated"
+ fi
+ else
+ mv -f "$f" "$spdx_file"
+ echo "$spdx_file: created"
+ fi
+}
+
+# Parse options
+spdx_file=$def_spdx_file
+m_opt=false; u_opt=false
+
+while true; do case $1 in
+ -f) [ $# -gt 1 ] || usage; shift; spdx_file=$1 ;;
+ -m) m_opt=true ;;
+ -u) u_opt=true; m_opt=true ;;
+ -*) usage ;;
+ *) break ;;
+esac; shift; done
+{ [ $# = 0 ] && $m_opt; } || [ $# = 1 ] || usage
+
+# Create or update the license file if requested
+if $u_opt || { $m_opt && ! [ -f "$spdx_file" ]; }; then
+ get_spdx_file
+fi
+
+# Check license expression
+if [ $# = 1 ]; then
+ check_spdx_expr "$1"
+fi
--
2.43.0
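
The heuristic syntax check in check_spdx_expr() can be tried on its own.
The following is a minimal sketch, not part of the patch: the sed script is
copied from the patch, while the wrapper name spdx_syntax_ok and the sample
expressions are made up for illustration. It assumes bash and GNU sed.

```bash
#!/bin/bash
# Return 0 if the operator/operand structure of an SPDX expression looks
# valid. Operators become '%', operands become '@', parenthesized groups
# are collapsed repeatedly, and whatever is left over is a syntax error.
# (spdx_syntax_ok is a hypothetical name used only in this example.)
spdx_syntax_ok()
{
  local x=" $1 " s

  # Put spaces around parentheses so every token is space-delimited
  x=${x//(/ ( }; x=${x//)/ ) }

  s=$(
    sed -E \
      -e 's/ (AND|OR|WITH) / % /g' \
      -e 's/ [^ %()@]+ / @ /g' \
      -e 's/ //g' \
      -e ':loop' \
      -e 's/\(@(%@)*\)/@/g' \
      -e 't loop' \
      -e 's/^@(%@)*$//' \
      <<<"$x"
  )
  [ -z "$s" ]
}

# Try a few sample expressions
for e in \
  'MIT' \
  '(MIT OR Apache-2.0) AND GPL-2.0-or-later WITH Classpath-exception-2.0' \
  'MIT OR' \
  '(MIT' \
  'MIT GPL-2.0-only'
do
  if spdx_syntax_ok "$e"; then
    echo "valid:   $e"
  else
    echo "invalid: $e"
  fi
done
```

Note that this sketch only covers the structure of the expression. In the
patch, the preceding token loop rejects empty expressions and invalid
characters, and check_spdx_id() looks up each identifier afterwards.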
* Re: [PATCH cygport] Add check of SPDX expression provided by LICENSE variable
2024-04-30 17:45 [PATCH cygport] Add check of SPDX expression provided by LICENSE variable Christian Franke
@ 2024-04-30 18:10 ` Brian Inglis
2024-04-30 21:07 ` Christian Franke
0 siblings, 1 reply; 5+ messages in thread
From: Brian Inglis @ 2024-04-30 18:10 UTC (permalink / raw)
To: cygwin-apps
On 2024-04-30 11:45, Christian Franke via Cygwin-apps wrote:
> Jon Turney via Cygwin-apps wrote:
>>> PS: I have a local script which checks SPDX Identifiers and expressions. Any
>>> interest to add this to cygport and then check LICENSE settings?
>> Oh, yes please. That sounds like a good idea.
> Attached.
> The new script uses the SPDX webpages to create the license file. I didn't find
> a usable single license list at https://github.com/spdx
What about:
https://spdx.github.io/license-list-data/
and everything under:
https://github.com/spdx/license-list-data
> The data/spdx-licenses file is not included in the patch. It could be generated
> from the source dir with:
> $ tools/spdx-check -f data/spdx-licenses -m
> ...
> data/spdx-licenses: created
> $ sha1sum data/spdx-licenses
> 80a19d6891d08bf34113464464ee12308374c792 *data/spdx-licenses
> The changes to the meson files are guessed. I didn't test the meson build yet.
--
Take care. Thanks, Brian Inglis Calgary, Alberta, Canada
La perfection est atteinte Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut
-- Antoine de Saint-Exupéry
* Re: [PATCH cygport] Add check of SPDX expression provided by LICENSE variable
2024-04-30 18:10 ` Brian Inglis
@ 2024-04-30 21:07 ` Christian Franke
2024-04-30 22:12 ` Brian Inglis
0 siblings, 1 reply; 5+ messages in thread
From: Christian Franke @ 2024-04-30 21:07 UTC (permalink / raw)
To: cygwin-apps
Brian Inglis via Cygwin-apps wrote:
> On 2024-04-30 11:45, Christian Franke via Cygwin-apps wrote:
> ...
>> Attached.
>> The new script uses the SPDX webpages to create the license file. I
>> didn't find a usable single license list at https://github.com/spdx
>
> What about:
>
> https://spdx.github.io/license-list-data/
>
This is apparently a draft version of
https://spdx.org/licenses/index.html which is used by the script to
generate the local license file.
> and everything under:
>
> https://github.com/spdx/license-list-data
I didn't find a single file which lists the licenses there.
* Re: [PATCH cygport] Add check of SPDX expression provided by LICENSE variable
2024-04-30 21:07 ` Christian Franke
@ 2024-04-30 22:12 ` Brian Inglis
2024-05-01 10:56 ` Christian Franke
0 siblings, 1 reply; 5+ messages in thread
From: Brian Inglis @ 2024-04-30 22:12 UTC (permalink / raw)
To: cygwin-apps
On 2024-04-30 15:07, Christian Franke via Cygwin-apps wrote:
> Brian Inglis via Cygwin-apps wrote:
>> On 2024-04-30 11:45, Christian Franke via Cygwin-apps wrote:
>>> The new script uses the SPDX webpages to create the license file. I didn't
>>> find a usable single license list at https://github.com/spdx
As usual, it is easier if you clearly state the purpose of the file you want,
and its desired properties, like data content, format, etc.
>> What about:
>> https://spdx.github.io/license-list-data/
> This is apparently a draft version of https://spdx.org/licenses/index.html which
> is used by the script to generate the local license file.
Strip out the table entries and create what you want with a command or script.
>> and everything under:
>> https://github.com/spdx/license-list-data
> I didn't find a single file which lists the licenses there.
GH does not always make access easy, with its limited online displays and fixed
display orders, and searches return a lot of junk, without easy access to better
searching in context, but try:
https://github.com/spdx/license-list-data/blob/main/licenses.md
which also has xrefs to the text files; also there are:
https://github.com/spdx/license-list-data/blob/main/json/licenses.json
https://github.com/spdx/license-list-data/blob/main/json/exceptions.json
which can be easily processed using `jq`.
* Re: [PATCH cygport] Add check of SPDX expression provided by LICENSE variable
2024-04-30 22:12 ` Brian Inglis
@ 2024-05-01 10:56 ` Christian Franke
0 siblings, 0 replies; 5+ messages in thread
From: Christian Franke @ 2024-05-01 10:56 UTC (permalink / raw)
To: cygwin-apps
Brian Inglis via Cygwin-apps wrote:
> On 2024-04-30 15:07, Christian Franke via Cygwin-apps wrote:
>> Brian Inglis via Cygwin-apps wrote:
>>> On 2024-04-30 11:45, Christian Franke via Cygwin-apps wrote:
>>>> The new script uses the SPDX webpages to create the license file. I
>>>> didn't find a usable single license list at https://github.com/spdx
>
> As usual, it is easier if you clearly state the purpose of the file
> you want, and its desired properties, like data content, format, etc.
>
>>> What about:
>>> https://spdx.github.io/license-list-data/
>
>> This is apparently a draft version of
>> https://spdx.org/licenses/index.html which is used by the script to
>> generate the local license file.
>
> Strip out the table entries and create what you want with a command or
> script.
The spdx-check script from the patch optionally (-m, -u) downloads
https://spdx.org/licenses/index.html and creates the local spdx-licenses
file intended to be distributed with cygport. The file is grep'able and
reduced to the bare minimum for this use case.
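
To illustrate what "grep'able" means here, a minimal sketch of a lookup
against such a file follows. The sample file content, the temporary file and
the function name classify_spdx_id are assumptions made for this example; the
grep pattern and the flag handling ('!' for deprecated, '&' for license
exception) follow check_spdx_id() from the patch.

```bash
#!/bin/bash
# Hypothetical sample in the spdx-licenses file format:
# one identifier per line, '!' = deprecated, '&' = license exception.
spdx_file=$(mktemp)
trap 'rm -f "$spdx_file"' EXIT
cat >"$spdx_file" <<'EOF'
# List of SPDX identifiers (illustrative excerpt only)
MIT
GPL-2.0-only
!GPL-2.0
&Classpath-exception-2.0
EOF

# Print the canonical spelling of an identifier together with its kind and
# state, or return 1 if the identifier is not listed.
# (classify_spdx_id is a made-up name, not a function from the patch.)
classify_spdx_id()
{
  local id=$1 m kind=license state=current

  # Case-insensitive match, same pattern as check_spdx_id() in the patch
  m=$(grep -Ei -m 1 "^!?&?${id//+/\\+}\$" "$spdx_file") \
    || { echo "$id: unknown"; return 1; }

  # Strip the flags and remember what they said
  [ "$m" = "${m#!}" ] || state=deprecated
  m=${m#!}
  [ "$m" = "${m#&}" ] || kind=exception
  m=${m#&}

  echo "$m: $kind, $state"
}

classify_spdx_id mit
classify_spdx_id GPL-2.0
classify_spdx_id Classpath-exception-2.0
classify_spdx_id NoSuchLicense || true
```

Because each entry is a single line with at most two flag characters in
front, one anchored grep is enough to find the entry, and plain parameter
expansion is enough to read the flags.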
>
>>> and everything under:
>>> https://github.com/spdx/license-list-data
>
>> I didn't find a single file which lists the licenses there.
>
> GH does not always make access easy, ...
... including that github.com is still unreachable via IPv6 without
NAT64 (except for downloads from raw.githubusercontent.com) ...
> ... with its limited online displays and fixed display orders, and
> searches return a lot of junk, without easy access to better searching
> in context, but try:
>
> https://github.com/spdx/license-list-data/blob/main/licenses.md
>
> which also has xrefs to the text files; also there are:
>
> https://github.com/spdx/license-list-data/blob/main/json/licenses.json
>
> https://github.com/spdx/license-list-data/blob/main/json/exceptions.json
>
>
> which can be easily processed using `jq`.
>
Indeed, thanks. I obviously missed these files when I wrote the
spdx-check script some months ago.
The current file format used by the script could then be created with:
url="https://raw.githubusercontent.com/spdx/license-list-data/main/json"
wget -O - "$url/licenses.json" \
| jq -j '
.licenses[] | (
if .isDeprecatedLicenseId then "!" else "" end,
.licenseId,
"\n"
)'
wget -O - "$url/exceptions.json" \
| jq -j '
.exceptions[] | (
if .isDeprecatedLicenseId then "!&" else "&" end,
.licenseExceptionId,
"\n"
)'
This adds these license ids not yet mentioned at
https://spdx.org/licenses/index.html:
AMD-newlib, BSD-2-clause-first-lines, Catharon, HPND-UC-export-US,
MIT-Khronos-old, NCL, OAR, Sun-PPP-2000, pkgconf, threeparttable, xzoom
I could provide a new patch with an updated script if desired.