From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <fweimer@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTPS id ACB243858D28
 for <elfutils-devel@sourceware.org>; Tue, 30 Nov 2021 16:50:06 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org ACB243858D28
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 us-mta-378-5SGMezeyPq6UNdqQ-m0LVA-1; Tue, 30 Nov 2021 11:50:03 -0500
X-MC-Unique: 5SGMezeyPq6UNdqQ-m0LVA-1
Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com
 [10.5.11.22])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0B3E910A9080;
 Tue, 30 Nov 2021 16:50:02 +0000 (UTC)
Received: from oldenburg.str.redhat.com (unknown [10.39.193.123])
 by smtp.corp.redhat.com (Postfix) with ESMTPS id 9B61410013D6;
 Tue, 30 Nov 2021 16:50:00 +0000 (UTC)
From: Florian Weimer <fweimer@redhat.com>
To: "Frank Ch. Eigler via Elfutils-devel" <elfutils-devel@sourceware.org>
Cc: Mark Wielaard <mark@klomp.org>,  "Frank Ch. Eigler" <fche@redhat.com>,
 Luca Boccassi <luca.boccassi@gmail.com>
Subject: Re: [PATCH v2] libebl: recognize FDO Packaging Metadata ELF note
References: <20211119003127.466778-1-luca.boccassi@gmail.com>
 <20211121194318.105654-1-luca.boccassi@gmail.com>
 <40a5de54f089f344697ece88e11eb41e526462ac.camel@gmail.com>
 <17e1d554c9a52598d2c7d27e7a40f17381285ba5.camel@klomp.org>
 <20211130162352.GC17988@redhat.com>
Date: Tue, 30 Nov 2021 17:49:58 +0100
In-Reply-To: <20211130162352.GC17988@redhat.com> (Frank Ch. Eigler via
 Elfutils-devel's message of "Tue, 30 Nov 2021 11:23:52 -0500")
Message-ID: <87czmhbnbd.fsf@oldenburg.str.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain
X-Spam-Status: No, score=-6.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW,
 RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE,
 TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: elfutils-devel@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Elfutils-devel mailing list <elfutils-devel.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/elfutils-devel>,
 <mailto:elfutils-devel-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/elfutils-devel/>
List-Help: <mailto:elfutils-devel-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/elfutils-devel>,
 <mailto:elfutils-devel-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Tue, 30 Nov 2021 16:50:08 -0000

* Frank Ch. Eigler via Elfutils-devel:

> Hi -
>
> On Tue, Nov 30, 2021 at 12:25:41PM +0100, Mark Wielaard wrote:
>> [...]
>> The spec does explain the requirements for JSON numbers, but doesn't
>> mention any for JSON strings or JSON objects. It would be good to also
>> explain how to make the strings and objects unambiguous. [...]
>> For Objects it should require that all names are unique. [...]
>> For Strings it should require that \uXXXX escaping isn't used [...]
>> 
>> That should get rid of various corner cases that different parsers are
>> known to get wrong. 
>
> Are such buggy parsers seen in the wild, now, after all this time with
> JSON?  It seems to me it's not really elfutils' or systemd's place to
> impose -additional- constraints on customary JSON.

JSON was designed for the Windows/Java UTF-16 world, so there is always
going to be a mismatch if you try to represent it in UTF-8 or any other
encoding that doesn't have surrogate pairs.
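
To make the mismatch concrete, here is a small sketch using Python's
standard json module (my choice for illustration only, nothing elfutils
or systemd actually uses).  A character outside the BMP has to be written
as two \uXXXX escapes, yet it is one code point and four UTF-8 bytes:

```python
import json

# U+1F600 lies outside the Basic Multilingual Plane, so the \uXXXX
# escape syntax has to spell it as a UTF-16 surrogate pair.
escaped = '"\\ud83d\\ude00"'
value = json.loads(escaped)

code_points = len(value)                            # Unicode code points
utf16_units = len(value.encode("utf-16-le")) // 2   # UTF-16 code units
utf8_bytes = len(value.encode("utf-8"))             # UTF-8 bytes

print(code_points, utf16_units, utf8_bytes)
```

The escaped form only makes sense in terms of UTF-16 code units, which is
why a UTF-8 consumer has to do extra work to interpret it.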

>> Especially \uXXXX escaping is really confusing when using the UTF-8
>> encoding (and it is totally unnecessary since UTF-8 can express any
>> valid UTF character already).
>
> Yes, and yet we have had the bidi situation recently where UTF-8 raw
> codes could visually confuse a human reader whereas escaped \uXXXX
> wouldn't.  If we forbid \uXXXX unilaterally, we literally become
> incompatible with JSON (RFC8259 7. String. "Any character may be
> escaped."), and for what?

RFC 8259 says this:

   However, the ABNF in this specification allows member names and
   string values to contain bit sequences that cannot encode Unicode
   characters; for example, "\uDEAD" (a single unpaired UTF-16
   surrogate).  Instances of this have been observed, for example, when
   a library truncates a UTF-16 string without checking whether the
   truncation split a surrogate pair.  The behavior of software that
   receives JSON texts containing such values is unpredictable; for
   example, implementations might return different values for the length
   of a string value or even suffer fatal runtime exceptions.

A UTF-8 environment has to enforce *some* additional constraints
compared to the official JSON syntax.
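
As a rough sketch of what such a constraint could look like (Python's
json module again, purely as an illustration; loads_utf8_safe is a
hypothetical helper name, not an existing API): the stock parser accepts
"\uDEAD", but the resulting string cannot be encoded as UTF-8, so a
UTF-8 consumer has to reject it explicitly.

```python
import json


def loads_utf8_safe(text):
    """Parse JSON, rejecting any string that cannot be encoded as UTF-8.

    Hypothetical helper for illustration; not part of any library.
    """

    def check(node):
        if isinstance(node, str):
            # Strict UTF-8 encoding raises UnicodeEncodeError on
            # unpaired surrogates such as U+DEAD.
            node.encode("utf-8")
        elif isinstance(node, dict):
            for name, member in node.items():
                check(name)
                check(member)
        elif isinstance(node, list):
            for item in node:
                check(item)

    value = json.loads(text)
    try:
        check(value)
    except UnicodeEncodeError as exc:
        raise ValueError("JSON string not representable in UTF-8") from exc
    return value


# The standard parser lets the unpaired surrogate through unchanged.
lone = json.loads('"\\uDEAD"')
```

With this check, loads_utf8_safe('"\\uDEAD"') raises ValueError, while a
properly paired escape such as '"\\ud83d\\ude00"' is still accepted.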

Thanks,
Florian