From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libabigail-return-411-listarch-libabigail=sourceware.org@sourceware.org>
Received: (qmail 118637 invoked by alias); 19 Jan 2016 09:31:06 -0000
Mailing-List: contact libabigail-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Post: <mailto:libabigail@sourceware.org>
List-Help: <mailto:libabigail-help@sourceware.org>
List-Subscribe: <mailto:libabigail-subscribe@sourceware.org>
Sender: libabigail-owner@sourceware.org
Received: (qmail 117495 invoked by uid 89); 19 Jan 2016 09:31:05 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Checked: by ClamAV 0.99 on sourceware.org
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=H*r:1001
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on sourceware.org
X-Spam-Level: 
X-Spam-User: qpsmtpd, 2 recipients
X-HELO: ms.seketeli.fr
From: Dodji Seketeli <dodji@seketeli.org>
To: "woodard at redhat dot com" <sourceware-bugzilla@sourceware.org>
Cc: libabigail@sourceware.org
Subject: Re: [Bug default/19434] invalid character in attribute value
Organization: Me, myself and I
References: <bug-19434-9487@http.sourceware.org/bugzilla/>
	<bug-19434-9487-YlubBNgqdX@http.sourceware.org/bugzilla/>
X-Operating-System: Red Hat Enterprise Linux Workstation 7.2
X-URL: http://www.seketeli.net/~dodji
Date: Fri, 01 Jan 2016 00:00:00 -0000
In-Reply-To: <bug-19434-9487-YlubBNgqdX@http.sourceware.org/bugzilla/>
	(woodard at redhat dot com's message of "Mon, 18 Jan 2016 19:08:51
	+0000")
Message-ID: <864me97w6k.fsf@seketeli.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-SW-Source: 2016-q1/txt/msg00040.txt.bz2


> Is dropping the name on the floor the best thing to do? Wouldn't it be
> better to encode the non-ascii parameter name into 7b clean ascii sort
> of like uuencode does.

For now, we don't use the parameter name anyway.  In change reports,
function parameters are referred to using their position.

Furthermore, we don't know the actual encoding of the characters, and
we are sure they are not ASCII (which is the case here).  So I don't
think that encoding each byte value would give us any usable
information.  It would be just as if we had garbage: we wouldn't be able
to show anything useful to the user anyway.  Hence my inclination to
drop the name altogether.
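
To make the idea concrete, here is a minimal sketch in Python.  It is
not the actual libabigail code (which is C++); the function name
sanitize_parameter_name is hypothetical, and it assumes the name
arrives as a raw byte string of unknown encoding.

```python
def sanitize_parameter_name(raw: bytes) -> str:
    """Keep the name only if every byte is 7-bit ASCII.

    Hypothetical helper: the encoding of `raw` is unknown, so any byte
    with the high bit set means we cannot interpret the name at all.
    """
    if all(b < 0x80 for b in raw):
        # Pure ASCII: safe to decode and keep.
        return raw.decode("ascii")
    # Non-ASCII bytes in an unknown encoding: drop the name altogether.
    return ""
```

Returning an empty name costs nothing here, since change reports refer
to parameters by position rather than by name.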

But if one day we know the actual encoding of the parameter names, then
we can decode them.  At that point we'll change the code again and avoid
dropping the name when it's not ASCII.  If it's, say, UTF-8, then we'll be
able to decode the byte stream, knowing that it's a UTF-8 stream.
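
As a rough sketch of what that later change could look like (again in
Python, not the real libabigail code; decode_parameter_name and its
encoding argument are assumptions made for illustration):

```python
from typing import Optional


def decode_parameter_name(raw: bytes, encoding: Optional[str] = None) -> str:
    """Decode a parameter name when its encoding is known.

    Hypothetical helper: `encoding` is a codec name such as "utf-8",
    or None when the encoding is unknown.
    """
    if encoding is not None:
        try:
            # Known encoding: the byte stream can be decoded properly.
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # The bytes do not match the claimed encoding: drop the name.
            return ""
    # Unknown encoding: keep pure-ASCII names, drop everything else.
    return raw.decode("ascii") if raw.isascii() else ""
```

With no encoding given, this behaves like the current approach; only a
known encoding lets a non-ASCII name survive.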

-- 
		Dodji