From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gnu-gabi-return-75-listarch-gnu-gabi=sourceware.org@sourceware.org>
Received: (qmail 111071 invoked by alias); 20 Jun 2016 14:18:50 -0000
Mailing-List: contact gnu-gabi-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gnu-gabi.sourceware.org>
List-Post: <mailto:gnu-gabi@sourceware.org>
List-Help: <mailto:gnu-gabi-help@sourceware.org>
List-Subscribe: <mailto:gnu-gabi-subscribe@sourceware.org>
Sender: gnu-gabi-owner@sourceware.org
Received: (qmail 111059 invoked by uid 89); 20 Jun 2016 14:18:49 -0000
Authentication-Results: sourceware.org; auth=none
Reply-To: hegdesmailbox@gmail.com
Subject: Re: GNU dlopen(3) differs from POSIX/IEEE
References: <25bc0c78-19ae-8974-b142-bb57f21cdb3d@gmail.com>
 <ca68d193-0a5d-1dc1-dc8c-bc59c8c27627@redhat.com>
 <763cd6f7-e33d-8d14-c0ba-f4e5797ddfa6@gmail.com>
 <42a86c64-a042-0c0d-9601-49729816c825@redhat.com>
 <8fead36d-c757-038a-3914-146ebeee8830@gmail.com>
 <f13c1d81-5f55-3d98-ee9c-7e39b8f28704@redhat.com>
To: Carlos O'Donell <carlos@redhat.com>, gnu-gabi@sourceware.org
From: Suprateeka R Hegde <hegdesmailbox@gmail.com>
Organization: HEGDESASPECT
Message-ID: <ae23d45a-7736-dc8f-5c45-9337bd8edaec@gmail.com>
Date: Fri, 01 Jan 2016 00:00:00 -0000
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To: <f13c1d81-5f55-3d98-ee9c-7e39b8f28704@redhat.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2016-q2/txt/msg00030.txt.bz2

On 19-Jun-2016 12:25 AM, Carlos O'Donell wrote:
> On 06/18/2016 04:01 AM, Suprateeka R Hegde wrote:
>>
>>
>> On 18-Jun-2016 11:02 AM, Carlos O'Donell wrote:
>>> On 06/18/2016 12:11 AM, Suprateeka R Hegde wrote:
>>>> All I am saying is, dlopen(3) with RTLD_GLOBAL also should bring in
>>>> foo at runtime to be compliant with POSIX.
>>>
>>> I disagree. Nothing in POSIX says that needs to be done. The
>>> key failure in your reasoning is that you have assumed lazy
>>> symbol resolution must happen at the point of the first function
>>> call.
>>
>> ld(1) on a GNU/Linux machine says:
>> ---
>> -z lazy
>>
>> When generating an executable or shared library, mark it to tell the
>> dynamic linker to defer function call resolution to the point when
>> the function is called (lazy binding)
>> ---
>
> Note that this man page is part of the linux man pages project and
> is not canonical documentation for the glibc project. Often the man
> pages documentation goes too far in describing the implementation
> and beyond what is guaranteed. We can work with Michael Kerrisk to
> get this changed quickly to read "defer function call resolution
> to an implementation-defined point in the future, possibly as late
> as the point when the function is called (lazy binding)."
>
>> This made me think that GNU implementation also matches with other
>> implementations -- that is lazy resolution happens at the time of the
>> first call.
>
> That is not an assumption that developers should be making.

Not as a developer. I usually read manpages as an end user. As a 
developer I can clearly see what's happening currently. And what's 
happening currently matches the description in the manpage too. They are 
in sync now -- that is, resolution at the time of the first function call.

>
>>> You have read "shall be made available for relocation" and
>>> then used implementation knowledge to decide that _today_ those
>>> relocations have a happens-after relationship with dlopen in your
>>> program. But because lazy symbol resolution is not an observable
>>> event for a well-defined program,
>>
>> Yes. I agree very much. But making some massive enterprise legacy
>> application to become "well-defined" now is beyond tool chain
>> writers.
>
> I agree that inevitably applications of a certain size end up having
> dependencies on implementation details that in turn make them costly
> to port to other operating systems.
>
> I care a lot about our users, and I don't want to see implementations
> constrained by standards text that might limit benefits to them in
> the future. So any suggestions you have I'm going to weigh against
> what I think a sensible user might expect, not a singular enterprise
> application.

I too agree very much on this. But we are not changing any defaults that 
affect sensible users. We are not standardizing the definition of lazy 
resolution. Read more below.

>
>>> If you were to _require_ lazy resolution to happen at the point
>>> of the function call, which is what you're assuming here, then
>>> it would prevent the above implementation from being conforming.
>>
>> Both are mutually exclusive. In my opinion, programs either want
>> immediate binding or lazy binding. Not an arbitrary mix of both.
>
> I disagree. Lazy binding provides significant performance boosts,
> but in a mixed lazy/now binding environment you can bind a fixed
> number of key security related symbols early

I meant that, as an observable event, they are exclusive. For 
optimizations or security, anything can be mixed. Any heuristic can be 
used to achieve the best results.

> to quickly determine
> if the application uses say "execve" and decide if access control,
> in a policy-less environment, needs to be disabled (execve disabled
> unless the application needs it).
>
> You argue that we should standardize on "bind now" which happens
> immediately at startup, and "lazy binding" which always happens
> at the time the function is called, ignoring any opportunistic
> binding that might happen if the dynamic loader happens to prove
> it knows what the binding result will be.

No. I am not at all suggesting that "binding" be standardized. As you 
said, we do need space for optimizations and improvements.

We can keep the existing semantics as is. We can add, say, "-z smart" 
(LD_BIND_SMART) or something like that to mean opportunistic binding 
later, when it gets in.

All I am proposing is to make the dlopen(3) RTLD_GLOBAL semantics 
match the POSIX/IEEE description.
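
A toy model may make the requested behavior concrete. The Python sketch 
below is not glibc code and does not claim to describe what any real 
dynamic loader does; the names GlobalScope, LazySlot and dlopen_global 
are invented for illustration. It models a PLT-like slot that resolves 
on its first call against whatever is in the global scope at that 
moment, so a symbol added by an RTLD_GLOBAL-style load before the first 
call is found:

```python
class GlobalScope:
    """Toy stand-in for the process-wide symbol search scope."""

    def __init__(self):
        self.objects = []  # symbol tables of loaded objects, in load order

    def dlopen_global(self, symbols):
        # Models dlopen(..., RTLD_GLOBAL): the object's symbols become
        # visible to every later lookup in the global scope.
        self.objects.append(dict(symbols))

    def lookup(self, name):
        # First definition in load order wins.
        for obj in self.objects:
            if name in obj:
                return obj[name]
        raise LookupError("undefined symbol: " + name)


class LazySlot:
    """Toy PLT slot: resolves on first call, then caches the target."""

    def __init__(self, scope, name):
        self.scope = scope
        self.name = name
        self.target = None

    def __call__(self, *args):
        if self.target is None:
            # Binding is deferred to the first call (assumed policy).
            self.target = self.scope.lookup(self.name)
        return self.target(*args)


scope = GlobalScope()
foo = LazySlot(scope, "foo")  # the executable references foo; no definition yet
scope.dlopen_global({"foo": lambda: "foo from libfoo"})  # load precedes first call
result = foo()  # resolution happens here and sees the newly loaded object
```

In this model, calling the slot before the load raises LookupError, 
which is the ordering dependence the discussion is about: the program 
works only if the load happens before the first call and the lookup 
consults the updated global scope.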


> No, if anything, I think we should be less proscriptive about
> lazy binding.
>
>>> However, because POSIX says nothing about when the lazy symbol
>>> resolution happens, or anything at all about it,
>>
>> It indeed says something:
>
> Only for dlopen...
>
>> ---
>> RTLD_LAZY
>>
>> Relocations shall be performed at an implementation-defined time,
>> ranging from the time of the dlopen() call until the first reference
>> to a given symbol occurs
>> ---
>
> ... and it says nothing really, like it should, leaving the choice
> up to the implementation. This text is specifically geared towards
> shared objects loaded via dlopen, not the symbols in the binary, for
> which the standard says nothing.
>
>> And then based on the ld(1) manpage, I thought GNU/Linux
>> implementation uses the time of first call.
>
> It does, but it doesn't use symbols brought into the global scope
> by dlopen for this resolution.
>
>> What is the harm if we go by the existing documentation and under the
>> option -z lazy or RTLD_LAZY, make lazy resolution happen at the point
>> of function call?
>
> You forbid a mixed binding environment, you forbid opportunistic binding,
> and force the binding to be truly as late as possible.

No. As I said, I do not want to standardize binding or forbid any 
optimizations.

I am saying, we can change RTLD_GLOBAL semantics and still have all that 
you said. By changing RTLD_GLOBAL semantics, we will not break any 
existing ABI. It's an addition.

And we can also have -z smart (or -z secure). And we can even make them 
default (in place of existing -z lazy). In that way we have everything.
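
As a rough illustration of how such a mixed mode could sit alongside 
lazy binding, here is a small Python sketch. It is purely a toy: 
bind_mixed and the symbol tables are hypothetical, "-z smart" is only a 
placeholder name, and nothing here reflects an existing linker or loader 
option. A chosen set of symbols is bound eagerly at load time and 
everything else is left pending for first-call resolution:

```python
def bind_mixed(scope, undefined, eager):
    """Split undefined symbols into eagerly bound and still-pending ones.

    scope     -- dict mapping symbol name to its definition (toy global scope)
    undefined -- symbol names referenced by the object being loaded
    eager     -- names that must be bound now (e.g. security-relevant ones)
    """
    bound = {}
    pending = []
    for name in undefined:
        if name in eager:
            # Eager symbols behave like "bind now": a miss is an error.
            if name not in scope:
                raise LookupError("undefined symbol: " + name)
            bound[name] = scope[name]
        else:
            # Everything else stays lazy and is resolved on first call.
            pending.append(name)
    return bound, pending


scope = {"execve": "libc:execve", "printf": "libc:printf"}
bound, pending = bind_mixed(scope, ["printf", "execve", "foo"], eager={"execve"})
```

Here only execve is bound up front, while printf and the not yet defined 
foo remain pending, so a later RTLD_GLOBAL load could still supply foo 
before its first call.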

>
>> And eventually change the semantics of RTLD_GLOBAL to match the
>> description mentioned in the POSIX spec -- ...relocation processing
>> of any other executable object file.
>
> I don't yet see the benefit in this except that you say some undisclosed
> enterprise applications need these semantics because other operating
> systems provided them.
>
> That is not a good reason to be overly prescriptive in the standard.

I think we have a very minor difference of opinion in the whole 
discussion. To reiterate, I am not proposing to restrict binding 
behaviors to either NOW or LAZY. We can add anything in between 
to optimize or secure. We can add them under an option as I said and 
make it the default too.

IMHO, (I was discussing with H.J too on the alternate code sequence 
proposal) lazy binding or writable-PLT cannot be totally removed from a 
platform. Tools like ltrace(1) will stop working. A couple of DSU 
solutions relying on writable-PLT/lazy_bind may stop working.

I think all of them should co-exist. One can always use the 
option of choice to achieve the desired results.

--
Supra