public inbox for libc-alpha@sourceware.org
* What to do about libidn?
@ 2016-11-08 11:52 Florian Weimer
  2016-11-08 15:27 ` Zack Weinberg
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Florian Weimer @ 2016-11-08 11:52 UTC (permalink / raw)
  To: GNU C Library

For AI_IDN support in getaddrinfo, we currently bundle a really old copy 
of libidn.

This has several problems:

1. We lack a couple of security fixes.

2. libidn, as an API, is very hard to use because it has complicated
preconditions for its input.  This may have been fixed in later upstream 
versions.

3. The tables are fairly large.  On the other hand, we may need the 
Unicode NFC tables for password hashing, too.

4. The IETF more or less replaced IDNA-2003 with a different and 
slightly incompatible standard, IDNA-2008.  There is no version 
negotiation, and some registries tried to implement it with a flag day 
(each registry with a different date, of course).  libidn seems to be 
IDNA-2003 only.

5. There is considerable variance among IDNA-2008 implementations.
IDNA-2008 is described in terms of a specific Unicode version (5.2).
The IANA tables were officially updated to Unicode 6.3 in RFC 6452.  I'm
not sure if actual implementations (in browsers, for example) follow
these tables because they probably want to use a newer Unicode version.

6. Distributions have their own system-wide copy of libidn (which is 
not the one in glibc).  They do not use libidn2 (which seems to be 
required for IDNA-2008 support).  This means that even if we update 
glibc, most applications will not benefit.

7. On the glibc side, IDN only applies to getaddrinfo, is opt-in via 
AI_IDN, and requires a non-ASCII locale.  Everything else sends 
unencoded bytes over the wire via DNS.
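
The incompatibility in item 4 can be made concrete with a small sketch. It is written in Python rather than glibc's C, and it rests on two assumptions: Python's built-in "idna" codec implements IDNA-2003, and raw Punycode encoding stands in for the IDNA-2008 result (a real IDNA-2008 implementation also validates the label, which this skips). The host name is an arbitrary example.

```python
# Sketch: the same input name yields different DNS query names under
# IDNA-2003 and IDNA-2008.  The host name is an arbitrary example.
name = "stra\u00dfe.de"  # "straße.de"

# Python's built-in "idna" codec implements IDNA-2003 (nameprep), which
# maps U+00DF (sharp s) to "ss" before encoding.
idna2003 = name.encode("idna")

# IDNA-2008 treats U+00DF as a valid code point, so the label is
# Punycode-encoded as-is.  Assumption: raw Punycode stands in for a real
# IDNA-2008 library here; none of the IDNA-2008 validation is performed.
label, tld = name.split(".")
idna2008 = b"xn--" + label.encode("punycode") + b"." + tld.encode("ascii")

print(idna2003)  # b'strasse.de'
print(idna2008)  # b'xn--strae-oqa.de'
```

The two results are different DNS names, so which host a user reaches depends on which standard the resolving library follows.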


What should we do to improve this situation?  I would really like to 
remove AI_IDN, but this is likely not an option.

Should we remove our internal copy and try to dlopen libidn2?  Maybe 
falling back to libidn if libidn2 is unavailable?  Bundle libidn2?  Write
our own implementation?

Thanks,
Florian


* Re: What to do about libidn?
  2016-11-08 11:52 What to do about libidn? Florian Weimer
@ 2016-11-08 15:27 ` Zack Weinberg
  2016-11-08 15:59   ` Florian Weimer
  2016-11-08 23:30 ` Joseph Myers
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Zack Weinberg @ 2016-11-08 15:27 UTC (permalink / raw)
  To: Florian Weimer, Simon Josefsson; +Cc: GNU C Library

On Tue, Nov 8, 2016 at 6:52 AM, Florian Weimer <fweimer@redhat.com> wrote:
> For AI_IDN support in getaddrinfo, we currently bundle a really old copy of
> libidn. This has several problems:
>
> 1. We lack a couple of security fixes.
>
> 2. libidn, as an API, is very hard to use because it has complicated
> preconditions for its input.  This may have been fixed in later upstream
> versions.
>
> 3. The tables are fairly large.  On the other hand, we may need the Unicode
> NFC tables for password hashing, too.
>
> 4. The IETF more or less replaced IDNA-2003 with a different and slightly
> incompatible standard, IDNA-2008.  There is no version negotiation, and some
> registries tried to implement it with a flag day (each registry with a
> different date, of course).  libidn seems to be IDNA-2003 only.
>
> 5. There is considerable variance among IDNA-2008 implementations. IDNA-2008
> is described in terms of a specific Unicode version (5.2). The IANA tables
> were officially updated to Unicode 6.3 in RFC 6452.  I'm not sure if actual
> implementations (in browsers, for example) follow these tables because they
> probably want to use a newer Unicode version.
>
> 6. Distributions have their own system-wide copy of libidn (which is not the
> one in glibc).  They do not use libidn2 (which seems to be required for
> IDNA-2008 support).  This means that even if we update glibc, most
> applications will not benefit.
>
> 7. On the glibc side, IDN only applies to getaddrinfo, is opt-in via AI_IDN,
> and requires a non-ASCII locale.  Everything else sends unencoded bytes over
> the wire via DNS.

[...]

I just saw something go by about security problems with blindly
applying IDNA-2008 without additional input validation, too. Can't
find it right now.  cc:ing the libidn(2) maintainer.

> What should we do to improve this situation?  I would really like to remove
> AI_IDN, but this is likely not an option.

I also rather like the idea of dropping AI_IDN.  As a data point,
https://searchcode.com/?q=AI_IDN shows only 39 hits out of "20 billion
lines of code from 7,000,000 projects" - and at least half of those
appear to be implementations and library wrappers.

zw


* Re: What to do about libidn?
  2016-11-08 15:27 ` Zack Weinberg
@ 2016-11-08 15:59   ` Florian Weimer
  2016-11-09  7:53     ` Petr Spacek
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Weimer @ 2016-11-08 15:59 UTC (permalink / raw)
  To: Zack Weinberg, Simon Josefsson; +Cc: GNU C Library

On 11/08/2016 04:27 PM, Zack Weinberg wrote:

> I just saw something go by about security problems with blindly
> applying IDNA-2008 without additional input validation, too. Can't
> find it right now.  cc:ing the libidn(2) maintainer.

The upgrade to IDNA-2008 changes name resolution for some domains 
because registries did not handle the transition in a seamless manner. 
It also enables new homograph attacks (but I tend to discount those as 
irrelevant).

Disabling IDNA does not have this problem anymore because I don't think 
there is a registry which allows registration of non-ASCII names (e.g.,
labels of the form \195\164\195\182\195\188 instead of xn--4ca0bs).
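
The two label forms mentioned above can be reproduced with a short Python sketch. It assumes Python's built-in "idna" codec (IDNA-2003) for the encoded form; the decimal escapes are simply the UTF-8 bytes of the label in the DNS presentation style used in the example.

```python
# Sketch: one label, once as raw UTF-8 bytes and once in its ASCII
# ("xn--") form.
label = "\u00e4\u00f6\u00fc"  # "äöü"

# Raw UTF-8 bytes, written as decimal escapes.
raw = label.encode("utf-8")
escaped = "".join("\\%d" % byte for byte in raw)

# ASCII form, as produced by Python's IDNA-2003 "idna" codec.
ace = label.encode("idna")

print(escaped)  # \195\164\195\182\195\188
print(ace)      # b'xn--4ca0bs'
```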

>> What should we do to improve this situation?  I would really like to remove
>> AI_IDN, but this is likely not an option.
>
> I also rather like the idea of dropping AI_IDN.  As a data point,
> https://searchcode.com/?q=AI_IDN shows only 39 hits out of "20 billion
> lines of code from 7,000,000 projects" - and at least half of those
> appear to be implementations and library wrappers.

There is traceroute …

If the consensus is that we want to get rid of AI_IDN, I'll happily
prepare a patch (and use it in Fedora).

Thanks,
Florian


* Re: What to do about libidn?
  2016-11-08 11:52 What to do about libidn? Florian Weimer
  2016-11-08 15:27 ` Zack Weinberg
@ 2016-11-08 23:30 ` Joseph Myers
  2016-11-09 12:02   ` Florian Weimer
                     ` (2 more replies)
  2016-11-11 19:41 ` Mike Frysinger
  2016-11-11 20:00 ` Carlos O'Donell
  3 siblings, 3 replies; 13+ messages in thread
From: Joseph Myers @ 2016-11-08 23:30 UTC (permalink / raw)
  To: Florian Weimer; +Cc: GNU C Library

On Tue, 8 Nov 2016, Florian Weimer wrote:

> This has several problems:

8. Updating libidn would be problematic for license reasons (it's 
non-FSF-assigned and upstream is now LGPLv3).

> Should we remove our internal copy and try to dlopen libidn2?  Maybe falling
> back to libidn if libidn2 is unavailable?  Bundle libidn2?  Write our own
> implementation?

Given that glibc's libidn add-on is not itself a public ABI or API, 
dlopening an external library would seem a reasonable way of implementing 
that getaddrinfo functionality.
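
A rough sketch of the "load an external library at run time, with a fallback" idea, in Python with ctypes rather than C with dlopen. This is not glibc code. It assumes the system libraries export idn2_lookup_u8 and idn2_free (libidn2) and idna_to_ascii_8z and idn_free (libidn); the helper name to_ascii and the lookup order are hypothetical choices for illustration.

```python
import ctypes
import ctypes.util

# (library name, conversion function, free function), tried in order.
CANDIDATES = [
    ("idn2", "idn2_lookup_u8", "idn2_free"),   # libidn2 first
    ("idn", "idna_to_ascii_8z", "idn_free"),   # then libidn
]


def to_ascii(name):
    """Return the ASCII form of name as bytes, or None if unavailable."""
    for libname, convert_name, free_name in CANDIDATES:
        path = ctypes.util.find_library(libname)
        if path is None:
            continue
        try:
            lib = ctypes.CDLL(path)
            convert = getattr(lib, convert_name)
        except (OSError, AttributeError):
            continue  # not loadable or symbol missing: try the next one
        # Both functions take (UTF-8 input, pointer to output, flags).
        convert.argtypes = [ctypes.c_char_p,
                            ctypes.POINTER(ctypes.c_void_p), ctypes.c_int]
        convert.restype = ctypes.c_int
        out = ctypes.c_void_p()
        if convert(name.encode("utf-8"), ctypes.byref(out), 0) != 0:
            return None  # conversion error: report failure to the caller
        result = ctypes.string_at(out.value)
        # Release the library-allocated buffer if the free symbol exists.
        release = getattr(lib, free_name, None)
        if release is not None:
            release.argtypes = [ctypes.c_void_p]
            release.restype = None
            release(out)
        return result
    return None


print(to_ascii("\u00e4\u00f6\u00fc"))
```

If neither library is installed, the function returns None, which corresponds to the case where the conversion behind AI_IDN would have to fail or be skipped.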

Suppose we remove libidn (with or without keeping the libidn functionality 
through dlopen of another library).  Then we have no in-tree uses of the 
add-ons mechanism.  Do we have any use for keeping that mechanism for 
out-of-tree add-ons, or should it be removed?

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: What to do about libidn?
  2016-11-08 15:59   ` Florian Weimer
@ 2016-11-09  7:53     ` Petr Spacek
  0 siblings, 0 replies; 13+ messages in thread
From: Petr Spacek @ 2016-11-09  7:53 UTC (permalink / raw)
  To: libc-alpha

On 8.11.2016 16:59, Florian Weimer wrote:
> On 11/08/2016 04:27 PM, Zack Weinberg wrote:
> 
>> I just saw something go by about security problems with blindly
>> applying IDNA-2008 without additional input validation, too. Can't
>> find it right now.  cc:ing the libidn(2) maintainer.
> 
> The upgrade to IDNA-2008 changes name resolution for some domains because
> registries did not handle the transition in a seamless manner. It also enables
> new homograph attacks (but I tend to discount those as irrelevant).
> 
> Disabling IDNA does not have this problem anymore because I don't think there
> is a registry which allows registration of non-ASCII names (e.g., labels of the
> form \195\164\195\182\195\188 instead of xn--4ca0bs).
> 
>>> What should we do to improve this situation?  I would really like to remove
>>> AI_IDN, but this is likely not an option.
>>
>> I also rather like the idea of dropping AI_IDN.  As a data point,
>> https://searchcode.com/?q=AI_IDN shows only 39 hits out of "20 billion
>> lines of code from 7,000,000 projects" - and at least half of those
>> appear to be implementations and library wrappers.
> 
> There is traceroute …
> 
> If the consensus is that we want to get rid of AI_IDN, I'll happily prepare
> a patch (and use it in Fedora).

Personally I would agree to removing AI_IDN. The more we remove the better: it
will be an incentive for applications to use something more modern than the DNS
resolution layer from libc, which is really ancient and lacks modern
functionality (DNSSEC validation and error reporting, for instance).

-- 
Petr Spacek  @  Red Hat


* Re: What to do about libidn?
  2016-11-08 23:30 ` Joseph Myers
@ 2016-11-09 12:02   ` Florian Weimer
  2016-11-09 16:03     ` Joseph Myers
  2016-11-11 19:53     ` Carlos O'Donell
  2016-11-10 15:32   ` Florian Weimer
  2016-11-11 19:49   ` Carlos O'Donell
  2 siblings, 2 replies; 13+ messages in thread
From: Florian Weimer @ 2016-11-09 12:02 UTC (permalink / raw)
  To: Joseph Myers; +Cc: GNU C Library

On 11/09/2016 12:30 AM, Joseph Myers wrote:
> On Tue, 8 Nov 2016, Florian Weimer wrote:
>
>> This has several problems:
>
> 8. Updating libidn would be problematic for license reasons (it's
> non-FSF-assigned and upstream is now LGPLv3).
>
>> Should we remove our internal copy and try to dlopen libidn2?  Maybe falling
> back to libidn if libidn2 is unavailable?  Bundle libidn2?  Write our own
>> implementation?
>
> Given that glibc's libidn add-on is not itself a public ABI or API,
> dlopening an external library would seem a reasonable way of implementing
> that getaddrinfo functionality.

Would you prefer us to do that, or to drop AI_IDN support completely?

> Suppose we remove libidn (with or without keeping the libidn functionality
> through dlopen of another library).  Then we have no in-tree uses of the
> add-ons mechanism.  Do we have any use for keeping that mechanism for
> out-of-tree add-ons, or should it be removed?

We have removed the rtkaio add-on from Fedora (and downstreams will 
inherit the removal).  Fedora now has dual libcrypt builds (with and 
without NSS), but this doesn't use the add-on mechanism at all.  So we 
do not need the add-on mechanism anymore.

In the broader picture, I think we should discourage out-of-tree ports 
and functionality as much as possible because if something is not part 
of regular builds because it's not in the official source tree, we might 
only learn about fundamental incompatibilities after a release or two, 
which would be annoying.  So I'd suggest we remove the add-on mechanism
eventually.

Thanks,
Florian


* Re: What to do about libidn?
  2016-11-09 12:02   ` Florian Weimer
@ 2016-11-09 16:03     ` Joseph Myers
  2016-11-11 19:53     ` Carlos O'Donell
  1 sibling, 0 replies; 13+ messages in thread
From: Joseph Myers @ 2016-11-09 16:03 UTC (permalink / raw)
  To: Florian Weimer; +Cc: GNU C Library

On Wed, 9 Nov 2016, Florian Weimer wrote:

> > Given that glibc's libidn add-on is not itself a public ABI or API,
> > dlopening an external library would seem a reasonable way of implementing
> > that getaddrinfo functionality.
> 
> Would you prefer us to do that, or to drop AI_IDN support completely?

I don't have a view on that, beyond that it's not normal for us to drop a 
supported ABI for existing binaries.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: What to do about libidn?
  2016-11-08 23:30 ` Joseph Myers
  2016-11-09 12:02   ` Florian Weimer
@ 2016-11-10 15:32   ` Florian Weimer
  2016-11-11 19:49   ` Carlos O'Donell
  2 siblings, 0 replies; 13+ messages in thread
From: Florian Weimer @ 2016-11-10 15:32 UTC (permalink / raw)
  To: Joseph Myers; +Cc: GNU C Library

On 11/09/2016 12:30 AM, Joseph Myers wrote:
> On Tue, 8 Nov 2016, Florian Weimer wrote:
>
>> This has several problems:
>
> 8. Updating libidn would be problematic for license reasons (it's
> non-FSF-assigned and upstream is now LGPLv3).

9. Unicode UTS #46 uses an IDNA2008 customization mechanism to get 
behavior which is closer to IDNA2003:

   <http://www.unicode.org/reports/tr46/>

This suggests that as of 2016, we are still supposed to use the IDNA2003 
compatibility mappings because the “transitional period” is not over
(otherwise, why include it in the document?).

Reportedly, Chrome still uses transitional processing (or something 
closer to IDNA2003 than to IDNA2008).  I've verified that the Fedora 
Chromium build does this.  Internet Explorer 11 and Edge on Windows 10 
also implement something more closely related to IDNA2003.

So libidn2 might not even be the right choice.

I wonder if we could define a protocol extension to offload these policy 
decisions to the recursive resolver.

But then, an application which resolves host names probably needs to
display them as well, and this needs slightly different logic because an
IDNA host name could successfully resolve over the Internet, but the
application is not supposed to show the IDNA name, only the
“xn--”-encoded name.  This suggests to me that the application needs
additional logic anyway (something we do not expose right now).  The
canonical name and the AI_CANONIDN option are of no help here because the
canonical name is often a generic CDN host name, which is not really
relevant here.
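
The display-side logic could look roughly like the following Python sketch. It is an illustration only: the helper names ace_form and unicode_form are hypothetical, the host name is an example, and Python's built-in "idna" codec (IDNA-2003) is assumed for the conversion.

```python
def ace_form(host):
    """Return the ASCII ("xn--") form of host, e.g. for display."""
    return host.encode("idna").decode("ascii")


def unicode_form(host):
    """Return the Unicode form of an ASCII-encoded host name."""
    return host.encode("ascii").decode("idna")


shown = ace_form("b\u00fccher.example")
print(shown)                # xn--bcher-kva.example
print(unicode_form(shown))  # bücher.example
```

The application derives the displayed name from the name it was asked to resolve, not from the canonical name returned by the resolver.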

Thanks,
Florian


* Re: What to do about libidn?
  2016-11-08 11:52 What to do about libidn? Florian Weimer
  2016-11-08 15:27 ` Zack Weinberg
  2016-11-08 23:30 ` Joseph Myers
@ 2016-11-11 19:41 ` Mike Frysinger
  2016-11-11 20:00 ` Carlos O'Donell
  3 siblings, 0 replies; 13+ messages in thread
From: Mike Frysinger @ 2016-11-11 19:41 UTC (permalink / raw)
  To: Florian Weimer; +Cc: GNU C Library


since AI_IDN is not in POSIX, i'm fine with dropping it.  providing
an old implementation is a disservice to the ecosystem.  better to
push people to use an up-to-date & supported version instead imo
(i.e. tell people to use libidn2).
-mike


* Re: What to do about libidn?
  2016-11-08 23:30 ` Joseph Myers
  2016-11-09 12:02   ` Florian Weimer
  2016-11-10 15:32   ` Florian Weimer
@ 2016-11-11 19:49   ` Carlos O'Donell
  2016-11-11 21:16     ` Joseph Myers
  2 siblings, 1 reply; 13+ messages in thread
From: Carlos O'Donell @ 2016-11-11 19:49 UTC (permalink / raw)
  To: Joseph Myers, Florian Weimer; +Cc: GNU C Library

On 11/08/2016 06:30 PM, Joseph Myers wrote:
> On Tue, 8 Nov 2016, Florian Weimer wrote:
> 
>> This has several problems:
> 
> 8. Updating libidn would be problematic for license reasons (it's 
> non-FSF-assigned and upstream is now LGPLv3).
> 
>> Should we remove our internal copy and try to dlopen libidn2?  Maybe falling
> back to libidn if libidn2 is unavailable?  Bundle libidn2?  Write our own
>> implementation?
> 
> Given that glibc's libidn add-on is not itself a public ABI or API, 
> dlopening an external library would seem a reasonable way of implementing 
> that getaddrinfo functionality.
> 
> Suppose we remove libidn (with or without keeping the libidn functionality 
> through dlopen of another library).  Then we have no in-tree uses of the 
> add-ons mechanism.  Do we have any use for keeping that mechanism for 
> out-of-tree add-ons, or should it be removed?
 
Is libdfp still an add-on?

It looks like in 2009 they converted to a stand-alone library.

I reviewed IEEE 754-2008 and found that we do require some DFP support
to fully implement the standard, which means if we did adopt libdfp code
we would do so directly and not as an add-on, so there would be no add-on
requirement there.

-- 
Cheers,
Carlos.


* Re: What to do about libidn?
  2016-11-09 12:02   ` Florian Weimer
  2016-11-09 16:03     ` Joseph Myers
@ 2016-11-11 19:53     ` Carlos O'Donell
  1 sibling, 0 replies; 13+ messages in thread
From: Carlos O'Donell @ 2016-11-11 19:53 UTC (permalink / raw)
  To: Florian Weimer, Joseph Myers; +Cc: GNU C Library

On 11/09/2016 07:02 AM, Florian Weimer wrote:
> In the broader picture, I think we should discourage out-of-tree
> ports and functionality as much as possible because if something is
> not part of regular builds because it's not in the official source
> tree, we might only learn about fundamental incompatibilities after a
> release or two, which would be annoying.  So I'd suggest we remove
> the add-on mechanism eventually.

Agreed.

I would also vote for out-of-tree functionality to be removed.

Users can simply have cloned trees with their own patches, and help
in the overall library maintenance or work with upstream to integrate
their changes.

-- 
Cheers,
Carlos.


* Re: What to do about libidn?
  2016-11-08 11:52 What to do about libidn? Florian Weimer
                   ` (2 preceding siblings ...)
  2016-11-11 19:41 ` Mike Frysinger
@ 2016-11-11 20:00 ` Carlos O'Donell
  3 siblings, 0 replies; 13+ messages in thread
From: Carlos O'Donell @ 2016-11-11 20:00 UTC (permalink / raw)
  To: Florian Weimer, GNU C Library

On 11/08/2016 06:52 AM, Florian Weimer wrote:
> 7. On the glibc side, IDN only applies to getaddrinfo, is opt-in via
> AI_IDN, and requires a non-ASCII locale.  Everything else sends
> unencoded bytes over the wire via DNS.
> 
> What should we do to improve this situation?  I would really like to
> remove AI_IDN, but this is likely not an option.
> 
> Should we remove our internal copy and try to dlopen libidn2?  Maybe
> falling back to libidn if libidn2 is unavailable?  Bundle libidn2?
> Write our own implementation?

I think that AI_IDN was layered at the wrong level.

Applications need much richer APIs to handle IDN than we give them.

I'm in favour of removing AI_IDN support for _new_ binaries, which means
versioning and stripping out AI_IDN support in a new version of getaddrinfo.

For existing binaries I think we must continue to provide the support
we have but freeze out the code so we can eventually remove it.

If there are security issues in using the existing code as-is then I think
a dlopen of libidn (the system library with fixes) is acceptable.

Transitioning to libidn2 doesn't seem like a feasible solution.

I think this lines up with Joseph's recommendations.

-- 
Cheers,
Carlos.


* Re: What to do about libidn?
  2016-11-11 19:49   ` Carlos O'Donell
@ 2016-11-11 21:16     ` Joseph Myers
  0 siblings, 0 replies; 13+ messages in thread
From: Joseph Myers @ 2016-11-11 21:16 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: Florian Weimer, GNU C Library

On Fri, 11 Nov 2016, Carlos O'Donell wrote:

> Is libdfp still an add-on?
> 
> It looks like in 2009 they converted to a stand-alone library.
> 
> I reviewed IEEE 754-2008 and found that we do require some DFP support
> to fully implement the standard, which means if we did adopt libdfp code
> we would do so directly and not as an add-on, so there would be no add-on
> requirement there.

A language may choose which IEEE formats to support.  The C bindings allow 
an implementation to support IEEE 754 for binary, decimal or both.

DFP support in glibc would look rather different from both stand-alone 
libdfp and add-on libdfp (for example, through integrating DFP support 
into <bits/mathcalls.h> rather than having separate headers with DFP 
declarations).

-- 
Joseph S. Myers
joseph@codesourcery.com
