From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dcvr.yhbt.net (dcvr.yhbt.net [173.255.242.215])
	by sourceware.org (Postfix) with ESMTPS id 513743858D3C
	for ; Sun, 21 Aug 2022 20:53:39 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 513743858D3C
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 6E8591F54E;
	Sun, 21 Aug 2022 20:53:38 +0000 (UTC)
Date: Sun, 21 Aug 2022 20:53:38 +0000
From: Eric Wong
To: Mark Wielaard
Cc: meta@public-inbox.org, overseers@sourceware.org
Subject: Re: Using plus (+) in list name
Message-ID: <20220821205338.M316466@dcvr>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT,
	RCVD_IN_BARRACUDACENTRAL, SPF_HELO_NONE, SPF_PASS, TXREP,
	T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: overseers@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Overseers mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 21 Aug 2022 20:53:40 -0000

Mark Wielaard wrote:
> Hi,
>
> We are setting up a public-inbox instance for cygwin/gcc/sourceware
> lists at https://inbox.sourceware.org/ and it seems to work pretty
> nicely. Thanks. Except for lists which have a + in their name like
> libstdc++.
>
> I assume this needs some escaping somewhere, but I cannot figure out
> where. The .public-inbox/config snippet looks like:

I seem to remember '+' is OK as-is in the path component of HTTP
URLs, but is the escape for ' ' (SP) in query strings.
At least it's OK for a git-config section name:

> [publicinbox "libstdc++"]
>   address = libstdc++@gcc.gnu.org
>   url = https://inbox.sourceware.org/libstdc++
>   inboxdir = /home/inbox/lists/libstdc++
>   indexlevel = full
>   newsgroup = inbox.gcc.libstdc++
>   listid = libstdc++.gcc.gnu.org
>
> This seems to work fine for nntp and imap, but not https.

Interesting that NNTP and IMAP work (I wasn't expecting it :x).
I can't remember off the top of my head, but is '+' allowed by
the relevant NNTP and List-Id RFCs?

Anyways, good to see public-inbox getting more adoption :>

> It does work when replacing the ++ with pp in the list name and
> url. But that looks somewhat odd imho. And the name with ++ can be
> used with e.g. mailman:
> https://gcc.gnu.org/mailman/listinfo/libstdc++
>
> Is there some way to configure public-inbox-http to be able to use ++
> in list names and urls?
>
> We are using the EPEL public-inbox package public-inbox-1.7.0-2.el8.noarch

Totally untested, but perhaps changing $INBOX_RE in
PublicInbox/WWW.pm will work:

diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm
index b9b68382..77f463d3 100644
--- a/lib/PublicInbox/WWW.pm
+++ b/lib/PublicInbox/WWW.pm
@@ -23,7 +23,7 @@ use PublicInbox::WwwStatic qw(r path_info_raw);
 use PublicInbox::Eml;
 
 # TODO: consider a routing tree now that we have more endpoints:
-our $INBOX_RE = qr!\A/([\w\-][\w\.\-]*)!;
+our $INBOX_RE = qr!\A/([\w\-][\w\.\-\+]*)!;
 our $MID_RE = qr!([^/]+)!;
 our $END_RE = qr!(T/|t/|t\.mbox(?:\.gz)?|t\.atom|raw|)!;
 our $ATTACH_RE = qr!([0-9][0-9\.]*)-($PublicInbox::Hval::FN)!;
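
For anyone wanting to sanity-check the idea before patching, here is a rough sketch in Python rather than Perl. It mirrors the old and new $INBOX_RE character classes and shows the path-versus-query-string difference for '+' mentioned earlier. The names OLD_INBOX_RE, NEW_INBOX_RE and split_inbox are made up for this illustration, and re.ASCII is an assumption that \w should mean [A-Za-z0-9_] here; the sketch does not reproduce how PublicInbox::WWW actually dispatches requests.

```python
import re
from urllib.parse import unquote, unquote_plus

# Python approximations of the two Perl patterns in the diff above.
# Assumption: \w is limited to ASCII word characters (re.ASCII).
OLD_INBOX_RE = re.compile(r"\A/([\w\-][\w.\-]*)", re.ASCII)
NEW_INBOX_RE = re.compile(r"\A/([\w\-][\w.\-+]*)", re.ASCII)


def split_inbox(pattern, path):
    """Return (inbox_name, rest_of_path), or (None, path) if nothing matches.

    Hypothetical helper for this sketch only.
    """
    m = pattern.match(path)
    if m is None:
        return None, path
    return m.group(1), path[m.end():]


path = "/libstdc++/"

# The old class stops at the first '+', leaving "++/" unconsumed.
old_name, old_rest = split_inbox(OLD_INBOX_RE, path)

# With '+' in the class, the whole list name is captured.
new_name, new_rest = split_inbox(NEW_INBOX_RE, path)

# '+' is literal when percent-decoding a path component...
decoded_path = unquote(path)

# ...but form-style query-string decoding turns each '+' into a space.
decoded_query = unquote_plus("q=libstdc++")

print(old_name, repr(old_rest))
print(new_name, repr(new_rest))
print(repr(decoded_path), repr(decoded_query))
```

If the old pattern leaves "++/" behind as it does here, any route anchored right after the inbox name would have nothing valid to match, which would be consistent with HTTP failing while NNTP and IMAP work.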