From: Florian Weimer
To: GNU C Library
Subject: What to do about libidn?
Date: Tue, 08 Nov 2016 11:52:00 -0000
Message-ID: <44cead16-9db0-a4c0-82cd-1f6178260ed7@redhat.com>

For AI_IDN support in getaddrinfo, we currently bundle a really old copy
of libidn.  This has several problems:

1. We lack a couple of security fixes.

2. libidn, as an API, is very hard to use because it has complicated
preconditions for its input.  This may have been fixed in later upstream
versions.

3. The tables are fairly large.  On the other hand, we may need the
Unicode NFC tables for password hashing, too.

4. The IETF more or less replaced IDNA-2003 with a different and
slightly incompatible standard, IDNA-2008.  There is no version
negotiation, and some registries tried to implement it with a flag day
(each registry with a different date, of course).  libidn seems to be
IDNA-2003 only.

5. There is considerable variance among IDNA-2008 implementations.
IDNA-2008 is described in terms of a specific Unicode version (5.2).
The IANA tables were officially updated to Unicode 6.0 in RFC 6452.
I'm not sure if actual implementations (in browsers, for example) follow
these tables, because they probably want to use a newer Unicode version.

6. Distributions have their own system-wide copy of libidn (which is not
the one in glibc).  They do not use libidn2 (which seems to be required
for IDNA-2008 support).  This means that even if we update glibc, most
applications will not benefit.

7. On the glibc side, IDN only applies to getaddrinfo, is opt-in via
AI_IDN, and requires a non-ASCII locale.  Everything else sends
unencoded bytes over the wire via DNS.

What should we do to improve this situation?

I would really like to remove AI_IDN, but this is likely not an option.

Should we remove our internal copy and try to dlopen libidn2?  Maybe
falling back to libidn if libidn2 is unavailable?

Bundle libidn2?

Write our own implementation?

Thanks,
Florian
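
For reference, the opt-in behavior described in item 7 can be sketched in
C roughly as follows.  This is only a minimal sketch, assuming a glibc
system where <netdb.h> exposes AI_IDN under _GNU_SOURCE.  The helper
names init_idn_hints and resolve_idn are hypothetical and exist only for
illustration; they are not part of glibc.

```c
#define _GNU_SOURCE   /* AI_IDN is a GNU extension in <netdb.h>.  */
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: prepare hints that opt in to IDN conversion.
   Without AI_IDN, getaddrinfo passes the name through unencoded.  */
static void
init_idn_hints (struct addrinfo *hints)
{
  assert (hints != NULL);
  memset (hints, 0, sizeof (*hints));
  hints->ai_family = AF_UNSPEC;      /* Accept IPv4 and IPv6 results.  */
  hints->ai_socktype = SOCK_STREAM;
  hints->ai_flags = AI_IDN;          /* Request IDN encoding of the name.  */
}

/* Hypothetical helper: resolve NAME, given in the current locale's
   encoding, with IDN conversion enabled.  Returns the getaddrinfo
   error code (0 on success); the caller frees *RESULT with
   freeaddrinfo.  */
static int
resolve_idn (const char *name, struct addrinfo **result)
{
  struct addrinfo hints;
  init_idn_hints (&hints);
  return getaddrinfo (name, NULL, &hints, result);
}
```

A caller would first run setlocale (LC_ALL, "") so that a non-ASCII
locale is active, then pass a name in that locale's encoding to
resolve_idn and check the return value with gai_strerror.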