public inbox for libc-alpha@sourceware.org
From: Florian Weimer <fweimer@redhat.com>
To: Joseph Myers <joseph@codesourcery.com>
Cc: GNU C Library <libc-alpha@sourceware.org>
Subject: Re: Evolution of ELF symbol management
Date: Mon, 21 Nov 2016 15:35:00 -0000
Message-ID: <774d610c-54fe-41fd-79e0-2b19aabbcc27@redhat.com>
In-Reply-To: <alpine.DEB.2.20.1610251523320.4454@digraph.polyomino.org.uk>

[-- Attachment #1: Type: text/plain, Size: 3287 bytes --]

On 10/25/2016 05:37 PM, Joseph Myers wrote:
> On Tue, 25 Oct 2016, Florian Weimer wrote:
>
>>> There are a few existing __libc_foo exports at public symbol versions.  Do
>>> all those satisfy the rule that where both foo and __libc_foo exist, the
>>> latest version of foo and the latest version of __libc_foo are aliases or
>>> otherwise have the same semantics?  (It would seem very confusing for old
>>> and new __libc_* symbols to follow different rules in that regard.)
>>
>> I found a few symbols which differ in the exported version.  The unprefixed
>> symbol has a regular version, and the prefixed one appears as GLIBC_PRIVATE.
>> These are:
>>
>> clntudp_bufcreate
>> fork
>> longjmp
>> pread
>> pwrite
>> secure_getenv
>> siglongjmp
>> system
>> vfork
>
> My concern is mainly about __libc_* symbols at public versions, not
> GLIBC_PRIVATE, since we can freely change the ABIs for __libc_* at
> GLIBC_PRIVATE if those are confusing.

I put together the attached Python script to check for collisions.  It 
reports anything that is not UNDEF or GLIBC_PRIVATE and where the 
__libc_-prefixed and non-prefixed symbols have different values.  It 
reports some mismatches in libasan, so I think it works.  It does not 
flag anything for glibc on i386 and x86_64 with current master.
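
To make the rule concrete, here is a minimal sketch of the same comparison run on made-up data instead of readelf output.  The symbol names, the EXAMPLE_1.0 version, the addresses, and the find_mismatches helper are all invented for illustration; none of them come from a real DSO or from the attached script.  A pair is reported only when both foo and __libc_foo are defined outside GLIBC_PRIVATE and their values differ.

```python
import collections

Symbol = collections.namedtuple("Symbol", "name version value")


def find_mismatches(symbols):
    """Return (plain, prefixed) pairs whose symbol values differ."""
    libc_prefix = {}
    no_prefix = {}
    for sym in symbols:
        # GLIBC_PRIVATE symbols are not part of the public ABI; skip them.
        if sym.version == "GLIBC_PRIVATE":
            continue
        if sym.name.startswith("__libc_"):
            libc_prefix[sym.name] = sym
        else:
            no_prefix[sym.name] = sym

    mismatches = []
    for sym in no_prefix.values():
        other = libc_prefix.get("__libc_" + sym.name)
        if other is not None and other.value != sym.value:
            mismatches.append((sym, other))
    return mismatches


# Hypothetical symbol table, for illustration only.
SAMPLE = [
    Symbol("foo", "EXAMPLE_1.0", 0x1000),
    Symbol("__libc_foo", "EXAMPLE_1.0", 0x1000),    # alias: same value
    Symbol("bar", "EXAMPLE_1.0", 0x2000),
    Symbol("__libc_bar", "EXAMPLE_1.0", 0x2040),    # different value
    Symbol("baz", "EXAMPLE_1.0", 0x3000),
    Symbol("__libc_baz", "GLIBC_PRIVATE", 0x3080),  # private: ignored
]

for plain, prefixed in find_mismatches(SAMPLE):
    print(plain.name, hex(plain.value), prefixed.name, hex(prefixed.value))
```

With this sample only the bar/__libc_bar pair is reported: foo and __libc_foo share a value, and __libc_baz is skipped because it is at GLIBC_PRIVATE.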

The script has a hard-coded path to elfutils readelf and needs Mark's
recent addition of the --symbols=SECTION argument.

So I think we are good on this front.

>> This is less relevant for functions in non-standard headers (which
>> applications would not include accidentally), but if we add something to
>> <stdio.h> (under _GNU_SOURCE) which is ripe for collisions, we need to somehow
>> make sure that a user-defined function of the same name does not end up
>> interposing the alias.
>
> Same name and type, that is; if the type is wrong and the header
> declaration is visible, a compile-time error will occur.

Good point.  We could add artificial transparent unions to arguments to 
make it harder to write a matching definition, even with current GCC 
versions.

> There is always the option of having the installed headers be generated
> files, so the source tree has .h.in files that contain some sort of
> annotations for use by a special glibc-specific preprocessor that does
> things the C preprocessor cannot - converting something that looks like a
> C macro call (say) into function declarations, __REDIRECT calls - and
> function-like macro definitions.  Of course then you need to get those
> headers generated at an early stage in the glibc build.

Interesting idea.

We could generate a different set of such headers for internal glibc use
if required.  This could allow us to compile more parts of glibc as
standard C sources, without mangling of public symbols, which would help 
with things like unit testing and fuzz testing.
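
As a rough illustration of the generated-header idea, the sketch below expands a made-up annotation from a .h.in fragment into a declaration.  The GLIBC_DECL annotation syntax, the expand function, and its internal switch are hypothetical and exist nowhere in glibc; only __REDIRECT is an existing macro from <sys/cdefs.h>.  The installed variant redirects the public name to a __libc_-prefixed alias, while the internal variant emits a plain declaration.

```python
import re

# Hypothetical annotation: GLIBC_DECL(return-type, name, (parameters))
RE_DECL = re.compile(
    r"^GLIBC_DECL\(\s*(?P<ret>[^,]+?)\s*,\s*(?P<name>\w+)\s*,\s*"
    r"(?P<params>\(.*\))\s*\)\s*$")


def expand(line, internal=False):
    """Expand one annotated line; pass other lines through unchanged."""
    m = RE_DECL.match(line)
    if m is None:
        return line
    ret, name, params = m.group("ret", "name", "params")
    if internal:
        # Internal headers: plain declaration, no symbol redirection.
        return "extern {} {} {};".format(ret, name, params)
    # Installed headers: bind the public name to the __libc_ alias.
    return "extern {} __REDIRECT ({}, {}, __libc_{});".format(
        ret, name, params, name)


# Made-up function name and prototype, for illustration only.
annotated = "GLIBC_DECL(int, frobnicate, (int fd, const char *name))"
print(expand(annotated))
print(expand(annotated, internal=True))
```

A real generator would also have to handle attributes, feature-test macro guards, and function-like macro definitions, which this sketch leaves out.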

>> I think this leads to the question whether we should prefer __ over __libc_
>> after all because as part of fixing the glibc-internal linknamespace issues,
>> we often added a __ symbol with a public version (but sometimes a
>
> We shouldn't have added them with public versions, just internally (and
> only at GLIBC_PRIVATE if needed by a separate library from the
> definition).

It's a bit too late for that, unfortunately.

Florian


[-- Attachment #2: glibc-libc-symbols-conflicts.py --]
[-- Type: text/x-python, Size: 2280 bytes --]

#!/usr/bin/python

import collections
import re
import subprocess
import sys

Version = collections.namedtuple("Version", "name default")
Symbol = collections.namedtuple("Symbol", "name version value")

RE_SYMBOL_LINE = re.compile(r"^\d+: ")
RE_SPACES = re.compile(r"\s+")

READELF = "/home/fweimer/src/ext/elfutils/e/src/readelf"

def get_symbols(path):
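    # universal_newlines=True makes stdout yield text lines, so the str
    # regular expressions below also work under Python 3.
    p = subprocess.Popen([READELF, "--symbols=.dynsym", "--", path],
                         stdout=subprocess.PIPE, universal_newlines=True)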
    for line in p.stdout.readlines():
        line = line.strip()
        if RE_SYMBOL_LINE.match(line):
            split_line = RE_SPACES.split(line)
            # Num(0) Value(1) Size(2) Type(3) Bind(4) Vis(5) Ndx(6) Name(7)
            if len(split_line) < 8:
                continue
            value = int(split_line[1], 16)
            binding = split_line[4]
            ndx = split_line[6]
            name = split_line[7]
            
            if ndx == 'UNDEF':
                continue
            if binding == 'LOCAL':
                continue
            default = False
            if "@@" in name:
                default = True
                name, version = name.split("@@")
            elif "@" in name:
                name, version = name.split("@")
            else:
                version = None
            if version is None:
                yield Symbol(name, None, value)
            else:
                yield Symbol(name, Version(version, default), value)
    if p.wait() != 0:
        raise IOError(
            "readelf failed with exit status {}".format(p.returncode))

def check_file(path):
    with open(path, "rb") as dso:
        # The file is opened in binary mode, so compare against bytes.
        if dso.read(4) != b"\177ELF":
            return
    
    libc_prefix = {}
    no_prefix = {}
    for sym in get_symbols(path):
        if sym.version and sym.version.name == "GLIBC_PRIVATE":
            continue
        if sym.name.startswith("__libc_"):
            libc_prefix[sym.name] = sym
        else:
            no_prefix[sym.name] = sym

    for np_sym in no_prefix.values():
        libc_sym = libc_prefix.get("__libc_" + np_sym.name, None)
        if libc_sym is None:
            continue
        if libc_sym.value != np_sym.value:
            # Written so that it prints the same text under Python 2 and 3.
            print("{} {} {}".format(path, np_sym, libc_sym))

for path in sys.argv[1:]:
    check_file(path)
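
For reference, the "@@"/"@" handling in get_symbols can be exercised on its own.  The helper name split_versioned_name and the sample names below are made up for illustration.  In the name column, "@@" marks the default version of a symbol and a single "@" marks a non-default (compatibility) version.

```python
def split_versioned_name(name):
    """Split NAME[@[@]VERSION] into (name, version, is_default)."""
    if "@@" in name:
        base, _, version = name.partition("@@")
        return base, version, True
    if "@" in name:
        base, _, version = name.partition("@")
        return base, version, False
    # Unversioned symbol.
    return name, None, False


# Hypothetical names, for illustration only.
for sample in ["foo@@EXAMPLE_2.0", "foo@EXAMPLE_1.0", "foo"]:
    print(sample, split_versioned_name(sample))
```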
