From: Mark Wielaard <mark@klomp.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: elfutils-devel@sourceware.org,
Panu Matilainen <pmatilai@laiskiainen.org>
Subject: Re: [PATCH] elfclassify tool
Date: Sun, 11 Aug 2019 23:38:00 -0000 [thread overview]
Message-ID: <a2dd6b0becb9c886f03b66a067c0d25417e4ec16.camel@klomp.org> (raw)
In-Reply-To: <20190729142419.GB2881@wildebeest.org>
[-- Attachment #1: Type: text/plain, Size: 3401 bytes --]
On Mon, 2019-07-29 at 16:24 +0200, Mark Wielaard wrote:
> On Mon, Jul 29, 2019 at 10:43:56AM +0200, Florian Weimer wrote:
> > * Mark Wielaard:
> >
> > > + if (elf == NULL)
> > > + {
> > > + /* This likely means it just isn't an ELF file, probably not a
> > > + real issue, but warn if verbose reporting. */
> > > + if (verbose > 0)
> > > + fprintf (stderr, "warning: %s: %s\n", current_path, elf_errmsg (-1));
> > > + return false;
> > > + }
> >
> > Is it possible to distinguish the error from a memory allocation error?
> > It would be wrong to mis-classify a file just because the system is low
> > on memory.
>
> You are right this is not the proper way to report the issue.
> Normally, when just using elf_begin, a NULL return should be reported
> through elf_issue (which will set the issues flag).
>
> But, because I added -z, we are using either elf_begin or
> dwelf_elf_begin. dwelf_elf_begin will return NULL (instead of a an
> empty (ELF_K_NONE) Elf descriptor when there is an issue, or the
> (decompressed) file wasn't an ELF file.
>
> So we should split the error reporting. If we called elf_begin and get
> NULL we should call elf_issue to report the proper issue.
>
> If we called dwefl_elf_begin and we get NULL, I am not sure yet what
> the proper way is to detect whether it is a real issue, or "just" not
> a (decompressed) ELF file. I am afraid the current handling is the
> best we can do.
>
> Maybe we can fix dwelf_elf_begin to return an empty (ELF_K_NONE) Elf
> descriptor if there was no issue, but the (decompressed) file wasn't
> an ELF file.
Sorry this took so long. And this is indeed the last issue holding up
the release. But this is a tricky problem.
We made a mistake when we wrote the contract for dwelf_elf_begin by
saying it would never return ELF_K_NONE. That made it different from
elf_begin and made it impossible to distinguish between a real (file or
decompression) error and whether the file simply wasn't an ELF file and
also wasn't a compressed ELF file.
I think we should fix the contract. Technically it would be an API
break, but I think no user is really relying on the fact that the Elf
handle returned is never ELF_K_NONE. Users still need to distinguish
between ELF_K_ELF and ELF_K_AR (and theoretically any other ELF_K_type,
like COFF, which we currently don't support, but we do define it).
So that is what the attached patch does. I also audited all
decompression code to make sure it returns error codes consistently.
The decompression will either decompress successfully and return
DWFL_E_NOERROR, or if the file wasn't compressed (or an embedded image)
it will return DWFL_E_BADELF. In all other cases (file or decompression
error) it will set a a different DWFL_E error.
This "only" leaves the problem that we don't have a good way to
translate those errors into "real" libelf error codes. So for now we
just fake one if it wasn't an elf_errno value. I don't intent to try to
solve this error translation issue before the release (I don't know how
to do it yet).
What do you think about this change to dwelf_elf_begin?
The change would make it possible to detect real errors in the
elfclassify code, whether elf_begin or dwelf_elf_begin was used. So we
would not misclassify files (but return an error status of 2).
Thanks,
Mark
[-- Attachment #2: 0001-libdwelf-Make-dwelf_elf_begin-return-NULL-only-when-.patch --]
[-- Type: text/x-patch, Size: 7796 bytes --]
From 648837a9f1be7628e9ceee6818bf56c80b9d3fa1 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mark@klomp.org>
Date: Mon, 12 Aug 2019 00:43:22 +0200
Subject: [PATCH] libdwelf: Make dwelf_elf_begin return NULL only when there is
an error.
dwelf_elf_begin was slightly different from elf_begin in case the file
turned out to not be an ELF file. elf_begin would return an Elf handle
with ELF_K_NONE. But dwelf_elf_begin would return NULL. This made it
impossible to tell the difference between a file or decompression error
and a (decompressed) file not being an ELF file.
Since dwelf_elf_begin could still return different kinds of ELF files
(ELF_K_ELF or ELF_K_AR - and theoretically ELF_K_COFF) this was not
really useful anyway. So make it so that dwelf_elf_begin always returns
an Elf handle unless there was a real error reading or decompressing
the file. Otherwise return NULL to make clear there was a real error.
Make sure that the decompression function returns DWFL_E_BADELF only
when the file isn't compressed. In which case the Elf handle won't
be replaced and can be returned (as ELF_K_NONE).
Add a new version to dwelf_elf_begin so programs can rely on it
returning NULL only for real errors.
Signed-off-by: Mark Wielaard <mark@klomp.org>
---
libdw/ChangeLog | 4 ++++
libdw/libdw.map | 4 ++++
libdwelf/ChangeLog | 6 ++++++
libdwelf/dwelf_elf_begin.c | 12 +++++++-----
libdwelf/libdwelf.h | 9 ++++++---
libdwfl/ChangeLog | 8 ++++++++
libdwfl/gzip.c | 5 +++--
libdwfl/open.c | 10 +++++++---
8 files changed, 45 insertions(+), 13 deletions(-)
diff --git a/libdw/ChangeLog b/libdw/ChangeLog
index 6b779e77..bf1f4857 100644
--- a/libdw/ChangeLog
+++ b/libdw/ChangeLog
@@ -1,3 +1,7 @@
+2019-08-12 Mark Wielaard <mark@klomp.org>
+
+ * libdw.map (ELFUTILS_0.177): Add new version of dwelf_elf_begin.
+
2019-06-28 Mark Wielaard <mark@klomp.org>
* libdw.map (ELFUTILS_0.177): New section. Add
diff --git a/libdw/libdw.map b/libdw/libdw.map
index 2e1c0e9e..decac05c 100644
--- a/libdw/libdw.map
+++ b/libdw/libdw.map
@@ -365,4 +365,8 @@ ELFUTILS_0.175 {
ELFUTILS_0.177 {
global:
dwelf_elf_e_machine_string;
+ # Replaced ELFUTILS_0.175 versions. Both versions point to the
+ # same implementation, but users of the new symbol version can
+ # presume that NULL is only returned on error (otherwise ELF_K_NONE).
+ dwelf_elf_begin;
} ELFUTILS_0.175;
diff --git a/libdwelf/ChangeLog b/libdwelf/ChangeLog
index 29f9a509..5b48ed8f 100644
--- a/libdwelf/ChangeLog
+++ b/libdwelf/ChangeLog
@@ -1,3 +1,9 @@
+2019-08-12 Mark Wielaard <mark@klomp.org>
+
+ * libdwelf.h (dwelf_elf_begin): Update documentation.
+ * dwelf_elf_begin.c (dwelf_elf_begin): Don't suppress ELF_K_NONE.
+ Mark old and new version.
+
2019-06-28 Mark Wielaard <mark@klomp.org>
* Makefile.am (libdwelf_a_SOURCES): Add dwelf_elf_e_machine_string.c.
diff --git a/libdwelf/dwelf_elf_begin.c b/libdwelf/dwelf_elf_begin.c
index 79825338..c7d63a1c 100644
--- a/libdwelf/dwelf_elf_begin.c
+++ b/libdwelf/dwelf_elf_begin.c
@@ -41,13 +41,13 @@ dwelf_elf_begin (int fd)
{
Elf *elf = NULL;
Dwfl_Error e = __libdw_open_elf (fd, &elf);
- if (elf != NULL && elf_kind (elf) != ELF_K_NONE)
+ if (e == DWFL_E_NOERROR)
return elf;
- /* Elf wasn't usable. Make sure there is a proper elf error message. */
-
- if (elf != NULL)
- elf_end (elf);
+ /* Elf wasn't usable. Make sure there is a proper elf error
+ message. This is probably not the real error, because there is
+ no good way to propagate errnos or decompression errors, but
+ better than nothing. */
if (e != DWFL_E_LIBELF)
{
@@ -60,3 +60,5 @@ dwelf_elf_begin (int fd)
return NULL;
}
+OLD_VERSION (dwelf_elf_begin, ELFUTILS_0.175)
+NEW_VERSION (dwelf_elf_begin, ELFUTILS_0.177)
diff --git a/libdwelf/libdwelf.h b/libdwelf/libdwelf.h
index cb7ea091..dbb8f08c 100644
--- a/libdwelf/libdwelf.h
+++ b/libdwelf/libdwelf.h
@@ -128,9 +128,12 @@ extern void dwelf_strtab_free (Dwelf_Strtab *st)
/* Creates a read-only Elf handle from the given file handle. The
file may be compressed and/or contain a linux kernel image header,
in which case it is eagerly decompressed in full and the Elf handle
- is created as if created with elf_memory (). On error NULL is
- returned. The Elf handle should be closed with elf_end (). The
- file handle will not be closed. Does not return ELF_K_NONE handles. */
+ is created as if created with elf_memory (). On decompression or
+ file errors NULL is returned (and elf_errno will be set). If there
+ was no error, but the file is not an ELF file, then an ELF_K_NONE
+ Elf handle is returned (just like with elf_begin). The Elf handle
+ should be closed with elf_end (). The file handle will not be
+ closed. */
extern Elf *dwelf_elf_begin (int fd);
/* Returns a human readable string for the given ELF header e_machine
diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog
index 8cbe90c9..04a39637 100644
--- a/libdwfl/ChangeLog
+++ b/libdwfl/ChangeLog
@@ -1,3 +1,11 @@
+2019-08-12 Mark Wielaard <mark@klomp.org>
+
+ * gzip.c (open_stream): Return DWFL_E_ERRNO on bad file operation.
+ * open.c (libdw_open_elf): New argument bad_elf_ok. Check it and
+ return DWFL_E_NOERROR in case it is set and error was DWFL_E_BADELF.
+ (__libdw_open_file): Call libdw_open_elf with bad_elf_ok false.
+ (__libdw_open_elf): Call libdw_open_elf with bad_elf_ok true.
+
2019-08-05 Omar Sandoval <osandov@fb.com>
* dwfl_segment_report_module.c (dwfl_segment_report_module): Assign
diff --git a/libdwfl/gzip.c b/libdwfl/gzip.c
index c2c13baf..043d0b6e 100644
--- a/libdwfl/gzip.c
+++ b/libdwfl/gzip.c
@@ -139,14 +139,14 @@ open_stream (int fd, off_t start_offset, struct unzip_state *state)
{
int d = dup (fd);
if (unlikely (d < 0))
- return DWFL_E_BADELF;
+ return DWFL_E_ERRNO;
if (start_offset != 0)
{
off_t off = lseek (d, start_offset, SEEK_SET);
if (off != start_offset)
{
close (d);
- return DWFL_E_BADELF;
+ return DWFL_E_ERRNO;
}
}
state->zf = gzdopen (d, "r");
@@ -288,6 +288,7 @@ unzip (int fd, off_t start_offset,
if (result == DWFL_E_NOERROR && gzdirect (state.zf))
{
gzclose (state.zf);
+ /* Not a compressed stream after all. */
return fail (&state, DWFL_E_BADELF);
}
diff --git a/libdwfl/open.c b/libdwfl/open.c
index 74367359..35fc5283 100644
--- a/libdwfl/open.c
+++ b/libdwfl/open.c
@@ -118,7 +118,7 @@ what_kind (int fd, Elf **elfp, Elf_Kind *kind, bool *may_close_fd)
static Dwfl_Error
libdw_open_elf (int *fdp, Elf **elfp, bool close_on_fail, bool archive_ok,
- bool never_close_fd)
+ bool never_close_fd, bool bad_elf_ok)
{
bool may_close_fd = false;
@@ -164,6 +164,10 @@ libdw_open_elf (int *fdp, Elf **elfp, bool close_on_fail, bool archive_ok,
&& !(archive_ok && kind == ELF_K_AR))
error = DWFL_E_BADELF;
+ /* This basically means, we keep a ELF_K_NONE Elf handle and return it. */
+ if (bad_elf_ok && error == DWFL_E_BADELF)
+ error = DWFL_E_NOERROR;
+
if (error != DWFL_E_NOERROR)
{
elf_end (elf);
@@ -184,11 +188,11 @@ libdw_open_elf (int *fdp, Elf **elfp, bool close_on_fail, bool archive_ok,
Dwfl_Error internal_function
__libdw_open_file (int *fdp, Elf **elfp, bool close_on_fail, bool archive_ok)
{
- return libdw_open_elf (fdp, elfp, close_on_fail, archive_ok, false);
+ return libdw_open_elf (fdp, elfp, close_on_fail, archive_ok, false, false);
}
Dwfl_Error internal_function
__libdw_open_elf (int fd, Elf **elfp)
{
- return libdw_open_elf (&fd, elfp, false, true, true);
+ return libdw_open_elf (&fd, elfp, false, true, true, true);
}
--
2.18.1
next prev parent reply other threads:[~2019-08-11 23:38 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-12 15:38 Florian Weimer
2019-04-15 15:39 ` Mark Wielaard
2019-04-16 11:38 ` Florian Weimer
2019-04-18 11:17 ` Florian Weimer
2019-07-19 12:47 ` Mark Wielaard
2019-07-19 13:43 ` Dmitry V. Levin
2019-07-19 14:21 ` Mark
2019-07-19 18:35 ` Dmitry V. Levin
2019-07-19 21:00 ` Florian Weimer
2019-07-19 21:23 ` Dmitry V. Levin
2019-07-19 21:36 ` Mark Wielaard
2019-07-19 22:57 ` Dmitry V. Levin
2019-07-20 21:51 ` Mark Wielaard
2019-07-25 22:39 ` [PATCH] elfclassify: Add --library classification Mark Wielaard
2019-07-26 22:53 ` Dmitry V. Levin
2019-07-26 23:04 ` [PATCH] elfclassify tool Dmitry V. Levin
2019-07-27 11:54 ` Mark Wielaard
2019-07-20 21:40 ` Mark Wielaard
2019-07-22 15:55 ` Florian Weimer
2019-07-26 22:11 ` Mark Wielaard
2019-07-29 8:44 ` Florian Weimer
2019-07-29 14:24 ` Mark Wielaard
2019-08-11 23:38 ` Mark Wielaard [this message]
2019-08-12 8:14 ` Florian Weimer
2019-08-12 15:18 ` Mark Wielaard
2019-07-29 9:16 ` Florian Weimer
2019-07-29 14:34 ` Mark Wielaard
2019-07-29 14:38 ` Florian Weimer
2019-08-13 9:44 ` Mark Wielaard
2019-08-13 11:42 ` Mark Wielaard
2019-08-14 20:46 ` [PATCH] config/elfutils.spec.in: package eu-elfclassify Dmitry V. Levin
2019-08-15 7:33 ` Mark Wielaard
2019-07-29 9:22 ` [PATCH] elfclassify tool Florian Weimer
2019-07-29 14:40 ` Mark Wielaard
2019-07-29 14:42 ` Florian Weimer
2019-07-19 13:24 ` Mark Wielaard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a2dd6b0becb9c886f03b66a067c0d25417e4ec16.camel@klomp.org \
--to=mark@klomp.org \
--cc=elfutils-devel@sourceware.org \
--cc=fweimer@redhat.com \
--cc=pmatilai@laiskiainen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).