public inbox for binutils@sourceware.org
From: Ulf Samuelsson <binutils@emagii.com>
To: Fangrui Song <i@maskray.me>
Cc: Nick Clifton <nickc@redhat.com>, binutils@sourceware.org
Subject: Re: [RFC v0 0/1] Add support for CRC64 generation in linker
Date: Mon, 6 Mar 2023 11:00:33 +0100
Message-ID: <1d074a88-856c-bf0c-524f-6eaa8c268cfa@emagii.com>
In-Reply-To: <DS7PR12MB57654E11983392D5DCBF1D85CBB69@DS7PR12MB5765.namprd12.prod.outlook.com>


On 2023-03-06 08:50, Fangrui Song wrote:
> On Fri, Feb 17, 2023 at 4:03 AM Ulf Samuelsson <binutils@emagii.com> wrote:
>>
>> On 2023-02-17 at 12:11, Nick Clifton wrote:
>>> Hi Ulf,
>>>
>>>>> Hi Ulf, can you state why a built-in support of ld is needed? If you
>>>>> want to embed a checksum, you can use Output Section Data to reserve a
>>>>> few bytes in the output, then use a post-link tool to compute the
>>>>> checksum and rewrite the reserved bytes.
>>>> In my experience, the post link tools usually work on the binary data
>>>> and not the ELF file.
>>> The objcopy program can do most of this for you though.  For example:
>>>
>>>    % objcopy --dump-section .text=contents-of-text a.out
>>>    % crc32 contents-of-text > crc32-of-text
>>>    % objcopy --add-section .crc32=crc32-of-text a.out
>>>    % readelf -x.crc32 a.out
>>>    Hex dump of section '.crc32':
>>>      0x00000000 32323064 37636339 0a                220d7cc9.
>>>
>>> In this example the crc32 is stored as ascii text, but I am sure that
>>> you can find a version of the crc32 program that generates binary output.
>> The crc32 generates a 32-bit CRC. Modern microcontrollers require a
>> 64-bit CRC.
>>
>> The second problem is: where is the .crc32 section and its contents?
>> The program needs to access the contents, but it is already linked.
>> The typical use is a header in front of the program, and the header
>> is part of the ".text" area.
>> Can you explain how this would work?
>>
>>>
>>>> Another thing is that the post-link tools I have seen are typically
>>>> poorly maintained.
>>> ...and so you want to move that maintainership burden onto us, yes ?
>> The problem with the post-link tools is that they are hard-wired to work on
>> special use cases.
>> Examples of problems:
>>
>> * CRC is fixed to be at a certain address
>>
>> * CRC table is fixed to be at a certain address.
>>
>> * Works on binaries and not on ELF files
>>
>> * You have to have one postprocessor for each file format.
> Hi Ulf, I think a natural question from other binutils contributors
> is: why is the CRC-64 feature so special that it deserves several
> keywords dedicated for it in the linker script language.
> You can place placeholder content into the CRC-64 section (say, it is
> .crc64), compute its value with a post-link program, then update the
> content with
> objcopy --update-section .crc64=.... exe

If you look at the latest patchset (v11),
you will find that there is no "CRC-64" keyword.
It is replaced by the "DIGEST" keyword which takes a string parameter
describing a "known" algorithm, or the POLY keyword
which allows you to specify your own algorithm.

If someone wants to support additional algorithms like the SHA series
it is easily extended.

> If objcopy --update-section somehow doesn't achieve your goal, it may
> be worth a feature request or bug, since the operation is generic and
> useful for a large number of users, not just your CRC-64 customers.
The problems with supporting things in an external application
is that you need an application for every conceivable object file format.

The ielftools that I have used for this is 11,000-12,000 lines of
non-trivial code.

I do not know just how many file formats are supported, but to me it
appears to be at least a dozen.

There is no standard for such a tool, so you will find that many companies
need to maintain their own tool for this very purpose.

The next problem is that the "crc32" application is really only good for
very small microcontrollers with a few kB of flash. In order to support the
128 kB to several MB of flash available in modern microcontrollers you need
a 64-bit CRC.

There is no standard application that generates CRC-64.
That means someone needs to write an application that allows standard
polynomials as well as custom polynomials.
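
To give a feel for what such an application has to compute, here is a
minimal sketch of a bitwise CRC-64 with a selectable polynomial. It is an
illustration only and is not taken from the patchset or from libcrc. The
function name and parameters are hypothetical, and it assumes the
non-reflected (MSB-first) convention with the ECMA-182 polynomial as the
default.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
ECMA_182_POLY = 0x42F0E1EBA9EA3693  # CRC-64/ECMA-182 generator polynomial


def crc64(data: bytes, poly: int = ECMA_182_POLY,
          init: int = 0, xorout: int = 0) -> int:
    """Bitwise, MSB-first (non-reflected) CRC-64 over `data`.

    `poly`, `init` and `xorout` are illustrative parameters; a custom
    algorithm is selected by passing different values.
    """
    crc = init & MASK64
    for byte in data:
        # Fold the next input byte into the top 8 bits of the register.
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ poly) & MASK64
            else:
                crc = (crc << 1) & MASK64
    return crc ^ xorout


if __name__ == "__main__":
    print(hex(crc64(b"123456789")))              # standard polynomial
    print(hex(crc64(b"123456789", poly=0x1B)))   # custom polynomial
```

Reflected variants (such as the one used by xz) would additionally need
bit reversal of input and output, which this sketch leaves out.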

Since the problem is difficult, many resort to just updating the binary,
which means that the code is not easily debuggable.
You cannot load the ELF file into the debugger and run, because it does
not have the checksum.

This patchset really allows people worldwide to get rid of millions
of lines of code that are no longer needed.

=====

The alternative I am proposing is straightforward code.

The core code, doing the CRC calculation, is well known.
The libcrc has not changed in 7 years.

>> None of these problems affect the linker since it is agnostic on the
>> file format
>> as long as there is a ".text" section.
>>
>> The CRC calculation has been stable on www.libcrc.com for 7 years.
>> There is no reason for the CRC calculation to change.
>> The only change I can see is a different polynomial, but that is already
>> supported.
>>
>> The biggest problem is of course that it slows down the debugging
>> because you cannot download from an ELF file - it lacks the CRC.
> I am unsure how a linker script extension is more convenient than
> using the existing functionality plus a CRC64 calculator and objcopy.
> The linker script extension appears to add a lot of code to the linker
> script language, which is already quite challenging to maintain.

It supports all object file formats and all architectures immediately.
It replaces millions of lines of code.

Your solution requires that the process knows which section contains the
CRC.
If someone moves the CRC to a different section, your toolchain is broken.

The crc32 application computes the CRC of the whole .text section, so it
cannot be used: you need to calculate the checksum of only part of the
selected section.

You also need to insert the checksum at a user-selectable part of the
section.

On top of that, not having to postprocess the object code simplifies the
flow.

So objcopy and crc32 do not meet the requirements as of today.
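
The requirements above (checksum over only part of a region, stored at a
user-selected position) can be sketched as follows. This is a hypothetical
post-link helper working on a flat binary image, not code from the patchset.
The function names, the big-endian storage order and the use of the ECMA-182
polynomial are all assumptions made for the example.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF
POLY = 0x42F0E1EBA9EA3693  # CRC-64/ECMA-182, assumed for illustration


def crc64_ecma(data: bytes) -> int:
    """Bitwise MSB-first CRC-64 with init 0 and no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            crc = ((crc << 1) ^ POLY if crc & (1 << 63) else crc << 1) & MASK64
    return crc


def patch_checksum(image: bytes, start: int, end: int, crc_offset: int) -> bytes:
    """Return a copy of `image` with the CRC of image[start:end] stored
    as 8 big-endian bytes at `crc_offset` (hypothetical layout)."""
    if not 0 <= start <= end <= len(image):
        raise ValueError("checksummed range lies outside the image")
    if crc_offset < 0 or crc_offset + 8 > len(image):
        raise ValueError("checksum slot lies outside the image")
    if crc_offset < end and start < crc_offset + 8:
        raise ValueError("checksum slot overlaps the checksummed range")
    patched = bytearray(image)
    patched[crc_offset:crc_offset + 8] = struct.pack(">Q", crc64_ecma(image[start:end]))
    return bytes(patched)


if __name__ == "__main__":
    # 8-byte placeholder header followed by the payload to be checked.
    image = bytes(8) + b"123456789"
    print(patch_checksum(image, start=8, end=len(image), crc_offset=0).hex())
```

Note that the caller has to supply `start`, `end` and `crc_offset` by hand;
this is exactly the information an external tool does not get automatically.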

KEYWORDS.

Right now it adds the following keywords "DIGEST", "TABLE", "POLY".

In addition the patch has two more features.

* Debugging feature adding "DEBUG", "ON", "OFF"

* timestamp feature adding "TIMESTAMP"
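
For readers unfamiliar with why a CRC table is worth placing in the image:
table-driven CRC code processes one byte per step using a 256-entry lookup
table derived from the polynomial. The sketch below assumes that this is the
kind of table meant here; it is not taken from the patchset, and the names
and the ECMA-182 polynomial are chosen only for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
POLY = 0x42F0E1EBA9EA3693  # CRC-64/ECMA-182, assumed for illustration


def make_crc64_table(poly: int = POLY) -> list:
    """Build the 256-entry table for an MSB-first CRC-64."""
    table = []
    for index in range(256):
        crc = index << 56
        for _ in range(8):
            crc = ((crc << 1) ^ poly if crc & (1 << 63) else crc << 1) & MASK64
        table.append(crc)
    return table


def crc64_table_driven(data: bytes, table: list, init: int = 0) -> int:
    """Byte-at-a-time CRC-64 using a precomputed table."""
    crc = init & MASK64
    for byte in data:
        # The top byte of the register, combined with the input byte,
        # selects the table entry to fold in.
        crc = ((crc << 8) & MASK64) ^ table[(crc >> 56) ^ byte]
    return crc


if __name__ == "__main__":
    table = make_crc64_table()
    print(hex(crc64_table_driven(b"123456789", table)))
```

The table occupies 256 * 8 = 2048 bytes, which is why its placement matters
on a small target.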

> If you distribute such object files with CRC64, I don't think it harms
> debuggability.

There is no CRC64 application, and even if there were, it is a fragile solution,
because there is no automatic information on where to put the checksum
and what area to calculate the checksum on.

Adding checksum calculations to the linker is converting a very
difficult problem into a really simple one.

If you were managing a team and had to decide where to put the CRC
calculation, the linker is the obvious place to look.

Best Regards
Ulf Samuelsson

>
>>>
>>>> Adding a post-link step seems like a kludge if the linker can provide
>>>> the CRC inside the ELF file.
>>> But it also keeps things simple.  No new code in the linker = no new
>>> bugs in the linker.  Solving a problem using existing tools = no need
>>> for new versions of the linker when the already existing versions will
>>> work just fine.
>> The problem is that the existing versions *barely* work.
>> Every company has to write their own solution.
>> It does not support source level debugging.
>>
>> Supporting it in the linker makes for a much cleaner solution.
>>
>>> Cheers
>>>    Nick
>>>
>> Best Regards
>>
>> Ulf Samuelsson
>>
>>

