public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Siddhesh Poyarekar <siddhesh@gotplt.org>,
	libc-alpha@sourceware.org, Florian Weimer <fweimer@redhat.com>
Subject: Re: [PATCH v5] elf: Fix DFS sorting algorithm for LD_TRACE_LOADED_OBJECTS with missing libraries (BZ #28868)
Date: Tue, 26 Apr 2022 16:29:36 -0300	[thread overview]
Message-ID: <aee2bce7-8dd3-b1fc-980d-8f14d2f5b078@linaro.org> (raw)
In-Reply-To: <594f5e59-699b-627c-5858-68a4fb3ce51d@gotplt.org>



On 26/04/2022 15:43, Siddhesh Poyarekar wrote:
> The patch doesn't apply anymore (conflict in elf/Makefile), please rebase it and post.  More comments below:
> 
> On 04/03/2022 22:42, Adhemerval Zanella via Libc-alpha wrote:
>> The _dl_map_object_deps ignores l_faked objects (set if the underlying
>> file can't be opened by _dl_map_object):
> 
> Maybe a more precise description is that the underlying file is not opened by _dl_map_object and the linker is in trace mode?  In other cases where the underlying file can't be opened, _dl_map_object quits with an error.

Ack.

> 
>>
>> 490   for (nlist = 0, runp = known; runp; runp = runp->next)
>> 491     {
>> 492       if (__builtin_expect (trace_mode, 0) && runp->map->l_faked)
>> 493         /* This can happen when we trace the loading.  */
>> 494         --map->l_searchlist.r_nlist;
>> 495       else
>> 496         {
>> 497           if (runp->map == map)
>> 498             map_index = nlist;
>> 499           map->l_searchlist.r_list[nlist++] = runp->map;
>> 500         }
>> 501
>> 502       /* Now clear all the mark bits we set in the objects on the search list
>> 503          to avoid duplicates, so the next call starts fresh.  */
>> 504       runp->map->l_reserved = 0;
>> 505     }
>>
>> If there any missing libraries being processed, they will not be
>> considered on final nlist size passed on _dl_sort_maps later in the
>> function.  And it is then used on _dl_sort_maps_dfs on the stack
>> allocated working maps:
>>
>> 222   /* Array to hold RPO sorting results, before we copy back to  maps[].  */
>> 223   struct link_map *rpo[nmaps];
>> 224
>> 225   /* The 'head' position during each DFS iteration. Note that we start at
>> 226      one past the last element due to first-decrement-then-store (see the
>> 227      bottom of above dfs_traversal() routine).  */
>> 228   struct link_map **rpo_head = &rpo[nmaps];
>>
>> However while transversing the 'l_initfini' on dfs_traversal it will
>> still considere the l_faked maps and thus update rpo more times than the
>> allocated working 'rpo', overflowing the stack object.
>>
>> As suggested in bugzilla, one option would be to avoid sorting the maps
>> for trace mode.  However I think ignoring l_faked object does make
>> sense (there is one less constraint to call the sorting function), it
>> allows a slight less stack usage for trace, and it is slight simpler
>> solution.
>>
>> The tests does trigger the stack overflow, however I tried to make
>> it more generic to check different scenarios or missing objects.
>>
>> Checked on x86_64-linux-gnu.
>> ---
>> v5: Refactor tests make rules.
>> v4: Use --soname to build the libraries, fixed typos.
>> v3: Removed stamp files to avoid add them as linker depedencies and
>>      moved test evaluation to python script.
>> v2: Added comments and fixed the default tst-trace1 --library-path.
>> ---
>>   elf/Makefile            |  54 ++++++++++++++++++++
>>   elf/dl-deps.c           |   2 +
>>   elf/dl-sort-maps.c      |   4 +-
>>   elf/libtracemod1-1.c    |   1 +
>>   elf/libtracemod2-1.c    |   1 +
>>   elf/libtracemod3-1.c    |   1 +
>>   elf/libtracemod4-1.c    |   1 +
>>   elf/libtracemod5-1.c    |   1 +
>>   elf/libtracemod6.c      |   1 +
>>   elf/tst-trace1.exp      |   4 ++
>>   elf/tst-trace2.exp      |   6 +++
>>   elf/tst-trace3.exp      |   6 +++
>>   elf/tst-trace4.exp      |   6 +++
>>   elf/tst-trace5.exp      |   6 +++
>>   scripts/tst-ld-trace.py | 108 ++++++++++++++++++++++++++++++++++++++++
>>   15 files changed, 201 insertions(+), 1 deletion(-)
>>   create mode 100644 elf/libtracemod1-1.c
>>   create mode 100644 elf/libtracemod2-1.c
>>   create mode 100644 elf/libtracemod3-1.c
>>   create mode 100644 elf/libtracemod4-1.c
>>   create mode 100644 elf/libtracemod5-1.c
>>   create mode 100644 elf/libtracemod6.c
>>   create mode 100644 elf/tst-trace1.exp
>>   create mode 100644 elf/tst-trace2.exp
>>   create mode 100644 elf/tst-trace3.exp
>>   create mode 100644 elf/tst-trace4.exp
>>   create mode 100644 elf/tst-trace5.exp
>>   create mode 100755 scripts/tst-ld-trace.py
>>
>> diff --git a/elf/Makefile b/elf/Makefile
>> index c96924e9c2..f30f64991f 100644
>> --- a/elf/Makefile
>> +++ b/elf/Makefile
>> @@ -617,6 +617,11 @@ modules-names = \
>>     libmarkermod4-2 \
>>     libmarkermod4-3 \
>>     libmarkermod4-4 \
>> +  libtracemod1-1 \
>> +  libtracemod2-1 \
>> +  libtracemod3-1 \
>> +  libtracemod4-1 \
>> +  libtracemod5-1 \
>>     ltglobmod1 \
>>     ltglobmod2 \
>>     neededobj1 \
>> @@ -1079,6 +1084,11 @@ tests-special += \
>>     $(objpfx)tst-initorder2-cmp.out \
>>     $(objpfx)tst-unused-dep-cmp.out \
>>     $(objpfx)tst-unused-dep.out \
>> +  $(objpfx)tst-trace1.out \
>> +  $(objpfx)tst-trace2.out \
>> +  $(objpfx)tst-trace3.out \
>> +  $(objpfx)tst-trace4.out \
>> +  $(objpfx)tst-trace5.out \
>>     # tests-special
>>   endif
>>   @@ -2725,3 +2735,47 @@ $(objpfx)tst-p_align3: $(objpfx)tst-p_alignmod3.so
>>   $(objpfx)tst-p_align3.out: tst-p_align3.sh $(objpfx)tst-p_align3
>>       $(SHELL) $< $(common-objpfx) '$(test-program-prefix)'; \
>>       $(evaluate-test)
>> +
>> +LDFLAGS-libtracemod1-1.so += -Wl,-soname,libtracemod1.so
>> +LDFLAGS-libtracemod2-1.so += -Wl,-soname,libtracemod2.so
>> +LDFLAGS-libtracemod3-1.so += -Wl,-soname,libtracemod3.so
>> +LDFLAGS-libtracemod4-1.so += -Wl,-soname,libtracemod4.so
>> +LDFLAGS-libtracemod5-1.so += -Wl,-soname,libtracemod5.so
>> +
>> +$(objpfx)libtracemod1-1.so: $(objpfx)libtracemod2-1.so \
>> +                $(objpfx)libtracemod3-1.so
>> +$(objpfx)libtracemod2-1.so: $(objpfx)libtracemod4-1.so \
>> +                $(objpfx)libtracemod5-1.so
>> +
>> +define libtracemod-x
>> +$(objpfx)libtracemod$(1)/libtracemod$(1).so: $(objpfx)libtracemod$(1)-1.so
>> +    $$(make-target-directory)
>> +    cp $$< $$@
>> +endef
>> +libtracemod-suffixes = 1 2 3 4 5
>> +$(foreach i,$(libtracemod-suffixes), $(eval $(call libtracemod-x,$(i))))
>> +
>> +define tst-trace-skeleton
>> +$(objpfx)tst-trace$(1).out: $(objpfx)libtracemod1/libtracemod1.so \
>> +                $(objpfx)libtracemod2/libtracemod2.so \
>> +                $(objpfx)libtracemod3/libtracemod3.so \
>> +                $(objpfx)libtracemod4/libtracemod4.so \
>> +                $(objpfx)libtracemod5/libtracemod5.so \
>> +                $(..)scripts/tst-ld-trace.py \
>> +                tst-trace$(1).exp
>> +    ( $(PYTHON) $(..)scripts/tst-ld-trace.py \
>> +        "$(test-wrapper-env) $(elf-objpfx)$(rtld-installed-name) \
>> +        --library-path $(common-objpfx):$(strip $(2)) \
>> +        $(objpfx)libtracemod1/libtracemod1.so" tst-trace$(1).exp \
>> +    ) > $$@; $$(evaluate-test)
>> +endef
> 
> OK, but would it be possible to not spawn the script as a subprocess? i.e. use {} instead of () for the recipe.
> 

Ack.

>> +
>> +$(eval $(call tst-trace-skeleton,1,))
>> +$(eval $(call tst-trace-skeleton,2,\
>> +    $(objpfx)libtracemod2))
>> +$(eval $(call tst-trace-skeleton,3,\
>> +    $(objpfx)libtracemod2:$(objpfx)libtracemod3))
>> +$(eval $(call tst-trace-skeleton,4,\
>> +    $(objpfx)libtracemod2:$(objpfx)libtracemod3:$(objpfx)libtracemod4))
>> +$(eval $(call tst-trace-skeleton,5,\
>> +    $(objpfx)libtracemod2:$(objpfx)libtracemod3:$(objpfx)libtracemod4:$(objpfx)libtracemod5))
> 
> OK.
> 
>> diff --git a/elf/dl-deps.c b/elf/dl-deps.c
>> index a2fc278256..77ff2b3895 100644
>> --- a/elf/dl-deps.c
>> +++ b/elf/dl-deps.c
>> @@ -473,6 +473,8 @@ _dl_map_object_deps (struct link_map *map,
>>       for (nlist = 0, runp = known; runp; runp = runp->next)
>>       {
>> +      /* _dl_sort_maps ignores l_faked object, so it is safe to not considere
> 
> consider
> 

Ack.

>> +     them for nlist.  */
>>         if (__builtin_expect (trace_mode, 0) && runp->map->l_faked)
>>       /* This can happen when we trace the loading.  */
>>       --map->l_searchlist.r_nlist;
>> diff --git a/elf/dl-sort-maps.c b/elf/dl-sort-maps.c
>> index 9e9d53ec47..96638d7ed1 100644
>> --- a/elf/dl-sort-maps.c
>> +++ b/elf/dl-sort-maps.c
>> @@ -140,7 +140,9 @@ static void
>>   dfs_traversal (struct link_map ***rpo, struct link_map *map,
>>              bool *do_reldeps)
>>   {
>> -  if (map->l_visited)
>> +  /* _dl_map_object_deps ignores l_faked objects when calculating the
>> +     number of maps before calling _dl_sort_maps, ignore them as well.  */
>> +  if (map->l_visited || map->l_faked)
>>       return;
>>       map->l_visited = 1;
> 
> OK.
> 
>> diff --git a/elf/libtracemod1-1.c b/elf/libtracemod1-1.c
>> new file mode 100644
>> index 0000000000..7c89c9a5a4
>> --- /dev/null
>> +++ b/elf/libtracemod1-1.c
>> @@ -0,0 +1 @@
>> +/* Empty  */
>> diff --git a/elf/libtracemod2-1.c b/elf/libtracemod2-1.c
>> new file mode 100644
>> index 0000000000..7c89c9a5a4
>> --- /dev/null
>> +++ b/elf/libtracemod2-1.c
>> @@ -0,0 +1 @@
>> +/* Empty  */
>> diff --git a/elf/libtracemod3-1.c b/elf/libtracemod3-1.c
>> new file mode 100644
>> index 0000000000..7c89c9a5a4
>> --- /dev/null
>> +++ b/elf/libtracemod3-1.c
>> @@ -0,0 +1 @@
>> +/* Empty  */
>> diff --git a/elf/libtracemod4-1.c b/elf/libtracemod4-1.c
>> new file mode 100644
>> index 0000000000..7c89c9a5a4
>> --- /dev/null
>> +++ b/elf/libtracemod4-1.c
>> @@ -0,0 +1 @@
>> +/* Empty  */
>> diff --git a/elf/libtracemod5-1.c b/elf/libtracemod5-1.c
>> new file mode 100644
>> index 0000000000..7c89c9a5a4
>> --- /dev/null
>> +++ b/elf/libtracemod5-1.c
>> @@ -0,0 +1 @@
>> +/* Empty  */
>> diff --git a/elf/libtracemod6.c b/elf/libtracemod6.c
>> new file mode 100644
>> index 0000000000..7c89c9a5a4
>> --- /dev/null
>> +++ b/elf/libtracemod6.c
>> @@ -0,0 +1 @@
>> +/* Empty  */
> 
> Isn't this unused?
> 

It is, I will remove it.

>> diff --git a/elf/tst-trace1.exp b/elf/tst-trace1.exp
>> new file mode 100644
>> index 0000000000..4a6f5211a6
>> --- /dev/null
>> +++ b/elf/tst-trace1.exp
>> @@ -0,0 +1,4 @@
>> +ld 1
>> +libc 1
>> +libtracemod2.so 0
>> +libtracemod3.so 0
> 
> OK, since libtracemod1.so depends on libtracemod2.so and libtracemod3.so but they're not in path.
> 
>> diff --git a/elf/tst-trace2.exp b/elf/tst-trace2.exp
>> new file mode 100644
>> index 0000000000..e13506e2eb
>> --- /dev/null
>> +++ b/elf/tst-trace2.exp
>> @@ -0,0 +1,6 @@
>> +ld 1
>> +libc 1
>> +libtracemod2.so 1
>> +libtracemod3.so 0
>> +libtracemod4.so 0
>> +libtracemod5.so 0
> 
> OK, since libtracemod2.so is now in path and libtracemod4.so and libtracemod5.so get pulled in libtracemod2.so.
> 
>> diff --git a/elf/tst-trace3.exp b/elf/tst-trace3.exp
>> new file mode 100644
>> index 0000000000..e574549d12
>> --- /dev/null
>> +++ b/elf/tst-trace3.exp
>> @@ -0,0 +1,6 @@
>> +ld 1
>> +libc 1
>> +libtracemod2.so 1
>> +libtracemod3.so 1
>> +libtracemod4.so 0
>> +libtracemod5.so 0
> 
> OK, since libtracemod3 is also in path.
> 
>> diff --git a/elf/tst-trace4.exp b/elf/tst-trace4.exp
>> new file mode 100644
>> index 0000000000..31ca97b35b
>> --- /dev/null
>> +++ b/elf/tst-trace4.exp
>> @@ -0,0 +1,6 @@
>> +ld 1
>> +libc 1
>> +libtracemod2.so 1
>> +libtracemod3.so 1
>> +libtracemod4.so 1
>> +libtracemod5.so 0
> 
> OK, since libtracemod4 is also in path.
> 
>> diff --git a/elf/tst-trace5.exp b/elf/tst-trace5.exp
>> new file mode 100644
>> index 0000000000..5d7d953726
>> --- /dev/null
>> +++ b/elf/tst-trace5.exp
>> @@ -0,0 +1,6 @@
>> +ld 1
>> +libc 1
>> +libtracemod2.so 1
>> +libtracemod3.so 1
>> +libtracemod4.so 1
>> +libtracemod5.so 1
> 
> OK, since libtracemod5 is also in path.
> 
>> diff --git a/scripts/tst-ld-trace.py b/scripts/tst-ld-trace.py
>> new file mode 100755
>> index 0000000000..f5a4028003
>> --- /dev/null
>> +++ b/scripts/tst-ld-trace.py
>> @@ -0,0 +1,108 @@
>> +#!/usr/bin/python3
>> +# Dump the output of LD_TRACE_LOADED_OBJECTS in architecture neutral format.
>> +# Copyright (C) 2022 Free Software Foundation, Inc.
>> +# Copyright The GNU Toolchain Authors.
>> +# This file is part of the GNU C Library.
>> +#
>> +# The GNU C Library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# The GNU C Library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with the GNU C Library; if not, see
>> +# <https://www.gnu.org/licenses/>.
>> +
>> +import argparse
>> +import os
>> +import subprocess
>> +import sys
>> +
>> +try:
>> +    subprocess.run
>> +except:
>> +    class _CompletedProcess:
>> +        def __init__(self, args, returncode, stdout=None, stderr=None):
>> +            self.args = args
>> +            self.returncode = returncode
>> +            self.stdout = stdout
>> +            self.stderr = stderr
>> +
>> +    def _run(*popenargs, input=None, timeout=None, check=False, **kwargs):
>> +        assert(timeout is None)
>> +        with subprocess.Popen(*popenargs, **kwargs) as process:
>> +            try:
>> +                stdout, stderr = process.communicate(input)
>> +            except:
>> +                process.kill()
>> +                process.wait()
>> +                raise
>> +            returncode = process.poll()
>> +            if check and returncode:
>> +                raise subprocess.CalledProcessError(returncode, popenargs)
>> +        return _CompletedProcess(popenargs, returncode, stdout, stderr)
>> +
>> +    subprocess.run = _run
>> +
>> +def is_vdso(lib):
>> +    return lib.startswith('linux-gate') or lib.startswith('linux-vdso')
> 
> OK.
> 
>> +
>> +
>> +def parse_trace(cmd, fref):
>> +    new_env = os.environ.copy()
>> +    new_env['LD_TRACE_LOADED_OBJECTS'] = '1'
>> +    trace_out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True,
>> +                               universal_newlines=True, env=new_env).stdout
>> +    trace = []
>> +    for line in trace_out.splitlines():
>> +        line = line.strip()
>> +        if is_vdso(line):
>> +            continue
>> +        fields = line.split('=>' if '=>' in line else ' ')
>> +        lib = os.path.basename(fields[0].strip())
>> +        if lib.startswith('ld'):
>> +            lib = 'ld'
>> +        elif lib.startswith('libc'):
>> +            lib = 'libc'
>> +        found = 1 if fields[1].strip() != 'not found' else 0
>> +        trace += ['{} {}'.format(lib, found)]
>> +    trace = sorted(trace)
>> +
>> +    reference = sorted(line.replace('\n','') for line in fref.readlines())
>> +
>> +    ret = 0 if trace == reference else 1
>> +    if ret != 0:
>> +        for i in reference:
>> +            if i not in trace:
>> +                print("Only in {}: {}".format(fref.name, i))
>> +        for i in trace:
>> +            if i not in reference:
>> +                print("Only in trace: {}".format(i))
>> +
>> +    sys.exit(ret)
> 
> OK.
> 
>> +
>> +
>> +def get_parser():
>> +    parser = argparse.ArgumentParser(description=__doc__)
>> +    parser.add_argument('command',
>> +                        help='comand to run')
>> +    parser.add_argument('reference',
>> +                        help='reference file to compare')
>> +    return parser
>> +
> 
> OK.
> 
>> +
>> +def main(argv):
>> +    parser = get_parser()
>> +    opts = parser.parse_args(argv)
>> +    with open(opts.reference, 'r') as fref:
>> +        # Remove the initial 'env' command.
>> +        parse_trace(opts.command.split()[1:], fref)
>> +
>> +
>> +if __name__ == '__main__':
>> +    main(sys.argv[1:])
> 
> OK.

      reply	other threads:[~2022-04-26 19:29 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-04 17:12 Adhemerval Zanella
2022-04-26 18:43 ` Siddhesh Poyarekar
2022-04-26 19:29   ` Adhemerval Zanella [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aee2bce7-8dd3-b1fc-980d-8f14d2f5b078@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=siddhesh@gotplt.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).