public inbox for binutils@sourceware.org
* RFC: Adding a fixed format output mode to readelf
@ 2021-02-26 17:03 Nick Clifton
  2021-02-26 17:48 ` Fangrui Song
  2021-02-26 18:51 ` Mike Frysinger
  0 siblings, 2 replies; 6+ messages in thread
From: Nick Clifton @ 2021-02-26 17:03 UTC (permalink / raw)
  To: Martin Liška; +Cc: binutils

Hi Martin,

   PR 27309 got me thinking about readelf's output, and whilst it is
   true that we do not make any guarantees about the format or contents
   of the output it appears that other packages are becoming reliant
   upon its currently established behaviour.

   So I was wondering whether it would be a good idea to introduce
   some kind of fixed format output option, and what exactly this
   format might look like.  Do you have any opinions on the matter ?

Cheers
   Nick




* Re: RFC: Adding a fixed format output mode to readelf
  2021-02-26 17:03 RFC: Adding a fixed format output mode to readelf Nick Clifton
@ 2021-02-26 17:48 ` Fangrui Song
  2021-02-26 18:51 ` Mike Frysinger
  1 sibling, 0 replies; 6+ messages in thread
From: Fangrui Song @ 2021-02-26 17:48 UTC (permalink / raw)
  To: Nick Clifton; +Cc: Martin Liška, binutils

Hi Nick,

On 2021-02-26, Nick Clifton via Binutils wrote:
>Hi Martin,
>
>  PR 27309 got me thinking about readelf's output, and whilst it is
>  true that we do not make any guarantees about the format or contents
>  of the output it appears that other packages are becoming reliant
>  upon its currently established behaviour.
>
>  So I was wondering whether it would be a good idea to introduce
>  some kind of fixed format output option, and what exactly this
>  format might look like.  Do you have any opinions on the matter ?
>
>Cheers
>  Nick

My feeling is that the --debug-dump option is not commonly used.  Yes,
it was unfortunate that the event happened to strace, but it is still
rare and part of software maintenance (probably no worse than
addressing a new compiler warning).

Options such as -r, -S, -s, --dyn-syms, -p, -x are used more widely.
We should care about their stability.
For --debug-dump specifically, I think a new format is not worth the
trouble.

(
Something related:
It turns out many folks use --wide regardless of the mode as the narrow
output can inadvertently omit some important trailing strings.
The wide output is the output scripts may depend on.
)
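
A minimal sketch of how a script might follow that advice, assuming a
readelf binary on PATH. The helper names are hypothetical, and forcing
LC_ALL=C is an extra precaution against translated headings rather than
something proposed in this thread.

```python
import os
import subprocess


def readelf_command(path, *options):
    """Build a readelf command line that always requests wide output."""
    return ["readelf", "--wide", *options, path]


def run_readelf(path, *options):
    """Run readelf in the C locale and return its text output."""
    env = dict(os.environ, LC_ALL="C")  # keep headings untranslated
    proc = subprocess.run(
        readelf_command(path, *options),
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return proc.stdout
```

For example, run_readelf("/bin/ls", "-S") would return the section
headers without truncated names, still in an unguaranteed text layout.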


* Re: RFC: Adding a fixed format output mode to readelf
  2021-02-26 17:03 RFC: Adding a fixed format output mode to readelf Nick Clifton
  2021-02-26 17:48 ` Fangrui Song
@ 2021-02-26 18:51 ` Mike Frysinger
  2021-02-27  1:24   ` Fangrui Song
  2021-03-01 13:03   ` Nick Alcock
  1 sibling, 2 replies; 6+ messages in thread
From: Mike Frysinger @ 2021-02-26 18:51 UTC (permalink / raw)
  To: Nick Clifton; +Cc: Martin Liška, binutils

On 26 Feb 2021 17:03, Nick Clifton via Binutils wrote:
>    PR 27309 got me thinking about readelf's output, and whilst it is
>    true that we do not make any guarantees about the format or contents
>    of the output it appears that other packages are becoming reliant
>    upon its currently established behaviour.
> 
>    So I was wondering whether it would be a good idea to introduce
>    some kind of fixed format output option, and what exactly this
>    format might look like.  Do you have any opinions on the matter ?

having an official machine readable output format would be nice.  it's def
true that a number of scripts out there have already grown dependencies on
various output modes of readelf for lack of alternatives.

i added a --format option to the scanelf utility many moons ago because
there was no other utility available that would provide stable output.
but that's heavily tailored towards scanelf's operating mode (kind of a
`find` for ELFs for distro packagers).

only two options generally come to mind in this space: output a stable
known format like JSON, or hoist it onto the user with a --format option
(like what git log does).

the --format mode only works well though when there's not repeated fields,
and the data is simple (e.g. just printable ASCII).  unfortunately ELFs
are the opposite of this :).  for example, if the user wanted to dump all
DT_NEEDED or DT_RPATH tags, how would those be shown in a way that is
reliable and parseable ?

JSON would handle both of these issues.  but i think in the low-level tool
space that we occupy, there's a general aversion towards things like JSON,
so i'm not confident how well it'd be adopted, which would kind of defeat
the purpose of doing this in the first place.  i think the biggest reason
for JSON dislike is that there is no standard tool for parsing the format,
so people have to use something higher level (like python) or something a
bit non-standard (like jq), or something terrible (like sed/awk + regex).
-mike
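
To make the repeated-field problem concrete, here is a rough sketch of
what scripts tend to do today: scrape the text of `readelf -d --wide`
and collect the repeated tags into lists, which map directly onto JSON
arrays. The sample text is illustrative, the regular expression is
exactly the kind of fragile parsing under discussion, and the JSON
layout is a made-up shape, not a proposed readelf format.

```python
import json
import re

# Illustrative excerpt in the style of `readelf -d --wide` output.
SAMPLE = """\
Dynamic section at offset 0x2df8 contains 4 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000f (RPATH)              Library rpath: [/opt/lib]
 0x000000000000000c (INIT)               0x1000
"""

# Matches e.g. "(NEEDED)   Shared library: [libc.so.6]".
ENTRY = re.compile(r"\((NEEDED|RPATH|RUNPATH)\)\s+.*?\[(.*)\]\s*$")


def dynamic_strings(text):
    """Group bracketed values by tag; repeated tags become lists."""
    result = {}
    for line in text.splitlines():
        match = ENTRY.search(line)
        if match:
            result.setdefault("DT_" + match.group(1), []).append(match.group(2))
    return result


if __name__ == "__main__":
    print(json.dumps(dynamic_strings(SAMPLE), indent=2))
```

A flat --format string has no obvious place for the second DT_NEEDED
entry, whereas the array form holds any number of them.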


* Re: RFC: Adding a fixed format output mode to readelf
  2021-02-26 18:51 ` Mike Frysinger
@ 2021-02-27  1:24   ` Fangrui Song
  2021-03-01 12:02     ` Martin Liška
  2021-03-01 13:03   ` Nick Alcock
  1 sibling, 1 reply; 6+ messages in thread
From: Fangrui Song @ 2021-02-27  1:24 UTC (permalink / raw)
  To: Nick Clifton, Martin Liška, binutils

On 2021-02-26, Mike Frysinger via Binutils wrote:
>On 26 Feb 2021 17:03, Nick Clifton via Binutils wrote:
>>    PR 27309 got me thinking about readelf's output, and whilst it is
>>    true that we do not make any guarantees about the format or contents
>>    of the output it appears that other packages are becoming reliant
>>    upon its currently established behaviour.
>>
>>    So I was wondering whether it would be a good idea to introduce
>>    some kind of fixed format output option, and what exactly this
>>    format might look like.  Do you have any opinions on the matter ?
>
>having an official machine readable output format would be nice.  it's def
>true that a number of scripts out there have already grown dependencies on
>various output modes of readelf for lack of alternatives.
>
>i added a --format option to the scanelf utility many moons ago because
>there was no other utility available that would provide stable output.
>but that's heavily tailored towards scanelf's operating mode (kind of a
>`find` for ELFs for distro packagers).
>
>only two options generally come to mind in this space: output a stable
>known format like JSON, or hoist it onto the user with a --format option
>(like what git log does).
>
>the --format mode only works well though when there's not repeated fields,
>and the data is simple (e.g. just printable ASCII).  unfortunately ELFs
>are the opposite of this :).  for example, if the user wanted to dump all
>DT_NEEDED or DT_RPATH tags, how would those be shown in a way that is
>reliable and parseable ?
>
>JSON would handle both of these issues.  but i think in the low-level tool
>space that we occupy, there's a general aversion towards things like JSON,
>so i'm not confident how well it'd be adopted, which would kind of defeat
>the purpose of doing this in the first place.  i think the biggest reason
>for JSON dislike is that there is no standard tool for parsing the format,
>so people have to use something higher level (like python) or something a
>bit non-standard (like jq), or something terrible (like sed/awk + regex).
>-mike

Speaking of JSON, perhaps folks may have an opinion on addr2line's JSON
output (if there is a desire to support that).

In LLVM land someone is trying https://reviews.llvm.org/D96883


* Re: RFC: Adding a fixed format output mode to readelf
  2021-02-27  1:24   ` Fangrui Song
@ 2021-03-01 12:02     ` Martin Liška
  0 siblings, 0 replies; 6+ messages in thread
From: Martin Liška @ 2021-03-01 12:02 UTC (permalink / raw)
  To: Fangrui Song, Nick Clifton, binutils

On 2/27/21 2:24 AM, Fangrui Song wrote:
> On 2021-02-26, Mike Frysinger via Binutils wrote:
>> On 26 Feb 2021 17:03, Nick Clifton via Binutils wrote:
>>>    PR 27309 got me thinking about readelf's output, and whilst it is
>>>    true that we do not make any guarantees about the format or contents
>>>    of the output it appears that other packages are becoming reliant
>>>    upon its currently established behaviour.
>>>
>>>    So I was wondering whether it would be a good idea to introduce
>>>    some kind of fixed format output option, and what exactly this
>>>    format might look like.  Do you have any opinions on the matter ?
>>
>> having an official machine readable output format would be nice.  it's def
>> true that a number of scripts out there have already grown dependencies on
>> various output modes of readelf for lack of alternatives.
>>
>> i added a --format option to the scanelf utility many moons ago because
>> there was no other utility available that would provide stable output.
>> but that's heavily tailored towards scanelf's operating mode (kind of a
>> `find` for ELFs for distro packagers).
>>
>> only two options generally come to mind in this space: output a stable
>> known format like JSON, or hoist it onto the user with a --format option
>> (like what git log does).
>>
>> the --format mode only works well though when there's not repeated fields,
>> and the data is simple (e.g. just printable ASCII).  unfortunately ELFs
>> are the opposite of this :).  for example, if the user wanted to dump all
>> DT_NEEDED or DT_RPATH tags, how would those be shown in a way that is
>> reliable and parseable ?
>>
>> JSON would handle both of these issues.  but i think in the low-level tool
>> space that we occupy, there's a general aversion towards things like JSON,
>> so i'm not confident how well it'd be adopted, which would kind of defeat
>> the purpose of doing this in the first place.  i think the biggest reason
>> for JSON dislike is that there is no standard tool for parsing the format,
>> so people have to use something higher level (like python) or something a
>> bit non-standard (like jq), or something terrible (like sed/awk + regex).
>> -mike
> 
> Speaking of JSON, perhaps folks may have an opinion on addr2line's JSON
> output (if there is a desire to support that).
> 
> On LLVM land someone is trying https://reviews.llvm.org/D96883

I do support the JSON format. I dealt with a similar situation in GCC: the gcov
tool supports a human-readable format and an intermediate format, and over time
we moved the intermediate format to JSON. It is an extensible format that is well
suited to output processing. Parsing the standard output is awkward and breaks easily.

Martin
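
As an illustration of consuming gcov's JSON intermediate format, here is
a small sketch. It assumes a gzip-compressed JSON report with a
top-level "files" array whose entries carry "file" and "lines", each
line having "line_number" and "count"; treat those field names as an
assumption to check against the gcov manual for the GCC version in use.

```python
import gzip
import json


def line_counts(report_bytes):
    """Map (source file, line number) to execution count.

    report_bytes is the raw content of a gzip-compressed gcov JSON
    report (field names assumed, see the note above).
    """
    report = json.loads(gzip.decompress(report_bytes))
    counts = {}
    for source in report.get("files", []):
        for line in source.get("lines", []):
            counts[(source["file"], line["line_number"])] = line["count"]
    return counts
```

New keys can be added to such a report without breaking this consumer,
which is the extensibility argument made above.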


* Re: RFC: Adding a fixed format output mode to readelf
  2021-02-26 18:51 ` Mike Frysinger
  2021-02-27  1:24   ` Fangrui Song
@ 2021-03-01 13:03   ` Nick Alcock
  1 sibling, 0 replies; 6+ messages in thread
From: Nick Alcock @ 2021-03-01 13:03 UTC (permalink / raw)
  To: Nick Clifton; +Cc: Martin Liška, binutils

On 26 Feb 2021, Mike Frysinger via Binutils told this:

> On 26 Feb 2021 17:03, Nick Clifton via Binutils wrote:
>>    PR 27309 got me thinking about readelf's output, and whilst it is
>>    true that we do not make any guarantees about the format or contents
>>    of the output it appears that other packages are becoming reliant
>>    upon its currently established behaviour.
>> 
>>    So I was wondering whether it would be a good idea to introduce
>>    some kind of fixed format output option, and what exactly this
>>    format might look like.  Do you have any opinions on the matter ?
>
> having an official machine readable output format would be nice.  it's def
> true that a number of scripts out there have already grown dependencies on
> various output modes of readelf for lack of alternatives.

The JSON output mode in iproute2's tools is really useful for simple
automated realtime response stuff, fwiw.
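
For a flavour of what that looks like in practice, a short sketch that
consumes output in the style of `ip -json addr show`. The sample is
hand-written and trimmed, and the function name is hypothetical; only
the "ifname", "addr_info" and "local" keys are relied upon.

```python
import json

# Hand-written, trimmed sample in the style of `ip -json addr show`.
SAMPLE = """
[{"ifindex": 1, "ifname": "lo",
  "addr_info": [{"family": "inet", "local": "127.0.0.1", "prefixlen": 8},
                {"family": "inet6", "local": "::1", "prefixlen": 128}]}]
"""


def addresses(text):
    """Return a mapping of interface name to its list of local addresses."""
    return {
        link["ifname"]: [addr["local"] for addr in link.get("addr_info", [])]
        for link in json.loads(text)
    }
```

Repeated addresses per interface need no special handling, which is the
same property the DT_NEEDED case would want from readelf.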

> the --format mode only works well though when there's not repeated fields,
> and the data is simple (e.g. just printable ASCII).  unfortunately ELFs
> are the opposite of this :).  for example, if the user wanted to dump all
> DT_NEEDED or DT_RPATH tags, how would those be shown in a way that is
> reliable and parseable ?
>
> JSON would handle both of these issues.  but i think in the low-level tool
> space that we occupy, there's a general aversion towards things like JSON,

I don't know why. The JSON format spec is a horror, but that's only if
you have to parse all its quirks. Simply generating valid JSON (when you
can decide not to emit any of the badly-specified quirks) is easy
enough.
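
A sketch of how small a conservative emitter can be, written in Python
for brevity (a C version for binutils would follow the same logic). It
assumes the producer restricts itself to pure-ASCII output and escapes
everything else, so none of the awkward corners of the spec are
exercised. The function name is hypothetical.

```python
def json_string(text):
    """Encode text as a JSON string literal using only printable ASCII."""
    out = ['"']
    for ch in text:
        cp = ord(ch)
        if ch in ('"', "\\"):
            out.append("\\" + ch)
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)
        elif cp > 0xFFFF:
            # Outside the BMP: emit a UTF-16 surrogate pair.
            cp -= 0x10000
            out.append("\\u%04x\\u%04x" % (0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)))
        else:
            # Control characters and other non-ASCII: plain \uXXXX escape.
            out.append("\\u%04x" % cp)
    out.append('"')
    return "".join(out)
```

Any conforming parser should read the result back unchanged, since only
quotes, backslashes and \uXXXX escapes are ever produced.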

(FWIW, though it is only a very minor part of things, I would be happy
to add a JSON output mode to ctf_dump(), which both objdump and readelf
use.)

> so i'm not confident how well it'd be adopted, which would kind of defeat
> the purpose of doing this in the first place.  i think the biggest reason
> for JSON dislike is that there is no standard tool for parsing the format,
> so people have to use something higher level (like python) or something a
> bit non-standard (like jq), or something terrible (like sed/awk + regex).

I find it surprising that nobody thinks jq *is* a standard tool by now.
(My only complaint about jq is that its syntax for doing complicated
stuff is *also* incredibly confusing...)

Another interesting thing for exploring JSON and related formats is
nushell. It's completely bizarre if you think of it as a shell, but if
you think of it as a structured data explorer... it's still not as good
as a hurd translator from/to JSON would be, but it's better than any
alternative I can remember seeing.


end of thread, other threads:[~2021-03-01 13:03 UTC | newest]

Thread overview: 6+ messages
2021-02-26 17:03 RFC: Adding a fixed format output mode to readelf Nick Clifton
2021-02-26 17:48 ` Fangrui Song
2021-02-26 18:51 ` Mike Frysinger
2021-02-27  1:24   ` Fangrui Song
2021-03-01 12:02     ` Martin Liška
2021-03-01 13:03   ` Nick Alcock
