From: Dodji Seketeli
To: "woodard at redhat dot com"
Cc: libabigail@sourceware.org
Subject: Re: [Bug default/19434] invalid character in attribute value
Date: Fri, 01 Jan 2016 00:00:00 -0000
In-Reply-To: (woodard at redhat dot com's message of "Mon, 18 Jan 2016 19:08:51 +0000")
Message-ID: <864me97w6k.fsf@seketeli.org>

> Is dropping the name on the floor the best thing to do? Wouldn't it be
> better to encode the non-ascii parameter name into 7b clean ascii sort
> of like uuencode does.

For now, we don't use the parameter name anyway. In change reports, function parameters are referred to using their position.

Furthermore, since we don't know the actual encoding of the characters, and we are sure they are not ASCII (which is the case here), I don't think that trying to encode each byte value can provide us with any usable information. It's as if we had garbage: we won't be able to show anything useful to the user anyway. Hence my inclination to drop the name altogether.

But if one day we know the actual encoding of the parameter names, then we can decode them. At that point we'll change the code again and avoid dropping the name if it's not ASCII. If it's, say, UTF-8, then we'll be able to decode the byte stream, knowing that it's a UTF-8 stream.

--
Dodji
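
To make the policy described above concrete, here is a minimal sketch of "drop the name if it's not ASCII". The helper names is_pure_ascii and sanitize_parameter_name are hypothetical and only for illustration; they are not libabigail's actual functions, and the real change may well look different.

```cpp
#include <string>

// Hypothetical helper (not part of libabigail's API): returns true if
// every byte of NAME fits in 7-bit ASCII.
bool
is_pure_ascii(const std::string& name)
{
  for (unsigned char c : name)
    if (c > 0x7F)
      return false;
  return true;
}

// Hypothetical helper: keep NAME when it is pure ASCII; otherwise drop
// it by returning an empty string.  The assumption is that, without
// knowing the encoding, the raw bytes carry no information we could
// usefully show to the user.
std::string
sanitize_parameter_name(const std::string& name)
{
  return is_pure_ascii(name) ? name : std::string();
}
```

With this sketch, a name like "count" is kept as is, while a name containing bytes above 0x7F comes back empty. If the encoding of parameter names became known later, the second helper would be the place to decode the bytes instead of dropping them.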