From: Ulf Samuelsson <binutils@emagii.com>
To: Nick Clifton <nickc@redhat.com>, binutils@sourceware.org
Subject: Re: [PATCH v8 0/11 Add support for CRC64 generation in linker
Date: Thu, 2 Mar 2023 23:02:30 +0100 [thread overview]
Message-ID: <d125743e-21f2-e113-3b12-919774383a2f@emagii.com> (raw)
In-Reply-To: <2b4f460e-3796-10b0-7fc4-a0635acdea18@redhat.com>
On 2023-03-02 17:43, Nick Clifton wrote:
> Hi Ulf,
>
> The patch set is looking good so far.
>
> I have been running some experiments on misusing the DIGEST directive
> and found a few problems. For example, if I want to have two digests
> in my binary:
>
> % ld foo.o -T test.t
> ld:test.t:19: warning: Only the first CRC polynome is used
> ld:crc.t:0: warning: updating CRC failed
> CRC generation aborted
>
> which would be fine, except for the fact that the link succeeds:
>
> % echo $?
> 0
>
> (Hint: You can use %X in your message string when calling einfo() to set
> an error exit code...)
>
> Even better would be to allow an number of digests. I still do not
> know
> why you want to have a limit of 1.
I only see the DIGEST feature as useful for an embedded system executing
from flash.
It would not be used for applications running linux - until linux starts
to CRC check
the applications before they run.
Embedded Linux, typically compresses the kernel and encapsulates the the
compressed data in a header (U-Boot mkimage) is an example,
but that, I think is outside the scope of the linker.
In an embedded system you use the digest in two ways.
1. The bootloader will start by verifying its own contents.
At reset, the ".data" and ".bss" sections of the bootloader are
uninitialized.
They will be initialized by the bootloader, from the ".text" section.
The CRC check will be done before initialization so you cannot
use this to check the ".data" or ".bss".
2. The bootloader will verify the real application before it is started.
The application has a header containing the digest in a location
known to the bootloader.
The check should be of the *entire flash contents* except the
digest itself.
There is nothing else to check so having additional checks is
meaningless.
The bootloader cannot check the ".data" or ".bss" of the application
because they are initialized by the application, and the bootloader
will check
before the application is started.
The bootloader will again check the entire flash contents, leaving
nothing else to check.
If the application needs to do a selfcheck of the flash while running,
it will use the same digest as the bootloader and check the same area.
There is no need for a new digest for this.
If the application needs to calculate CRC on other things, like
communications,
using another CRC algorithm, it can add the CRC routines and table
in the
application code. Doing it in the linker just adds complexity.
That is why I suspect it does not make sense to support anything
but ".text"
and it does not make sense to have more than one digest.
=================================================
I can see that it would be good to have a built in function inside
the gcc compiler
that would generate crc routines.
I.E: Something like this:
#pragma digest "CRC64-ECMA" "crc_ecma" using "crc_tab"
#pragma digest_table "CRC64-ECMA" "crc_tab"
extern uint64_t crc_ecma(char *s, uint32_t numbytes);
=================================================
The digest table is strictly not neccessary, but it is convenient and
removes the risk
that the user uses a different table than the table used to generate the
digest in the linker.
=============================================
If you try to use two digests in error, then you have two cases.
1. You add a second digest which is not used by the application.
Then the link completes with a warning.
Really not worrysome.
2. You add a second digest which is used by the application.
DIGEST "digest1" "CRC64-ECMA"
DIGEST "digest2" "CRC64-ISO"
The application needs to refer to the digest and declares.
extern uint64_t digest1;
extern uint64_t digest2;
uint64_t mycrc = digest2;
The linker will not find "digest2" and will report that as an error;
and then the warning message will be detected as the user looks at
the log.
===============================================
>
> I do not like using the DIGEST SECTION directive to specify the section
> for the digest. This should be implied by the presence of the DIGEST
> directive itself. For example:
>
> .data : { DIGEST ... }
I did not manage to figure out which section I am in
when the digest command is executed, so I tool the easy way out.
As I said above, this is not going to happen in the real world
as you cannot use this to check the ".data" section in an proper setup.
I'd like to remove the function and only support ".text".
>
> ought to create a digest in the .data section. Furthermore it seems
> that
> the DIGEST SECTION directive does not work reliably:
>
> % cat test.t
> _start = 0x000000;
> SECTIONS
> {
> . = 0x1000 + SIZEOF_HEADERS;
>
> .foo ALIGN (CONSTANT (COMMONPAGESIZE)) :
> {
> foo_start = .;
> *(.foo)
> foo_end = .;
> }
>
> .data ALIGN (CONSTANT (COMMONPAGESIZE)) :
> {
> DIGEST SECTION ".foo"
> DIGEST "BE.BE.BE" POLY (64,a,b,c,d,e,f) (foo_start , foo_end)
> *(.data)
> }
>
> /DISCARD/ : { *(*) }
> }
>
> % ld -T test.t foo.o
> ld:test.t:0: warning: updating CRC failed
> CRC generation aborted
>
>
The digest, start and end needs to be in the same section.
You put the start and end in ".foo" and the digest in ".data".
I do a check that the digest and the start/end are in the selected
section, but I do not understand why there is no error message.
===============================================
> Another issue - if the digest is unrecognised then the error message
> is not very helpful. For example I tried:
>
> DIGEST "0.BE" "TCRC32" (___CRC32_START___ , text_end)
>
> and received the error message:
>
> ld: Unknown DIGEST: no error
>
Will see if I can fix that.
> (There is no line number in the error message, nor does it repeat of the
> unrecognized name).
>
> Incidentally if a valid digest name is used in the above example a
> symbol
> with the name "0" is generated, which is not really allowed as symbol
> names should not start with a numeric value. But that is probably
> being
> too pedantic.
If the user tries to access the digest, he needs to declare a variable with
the name "0"
I.E:
uint64_t 0;
Good luck with that :-)
If I read the manual, I find this:
####################
3.10.3 Symbol Names
Unless quoted, symbol names start with a letter, underscore, or period
and may include
letters, digits, underscores, periods, and hyphens. Unquoted symbol
names must not conflict
with any keywords. You can specify a symbol which contains odd
characters or has the
same name as a keyword by surrounding the symbol name in double quotes:
"SECTION" = 9;
"with a space" = "also with a space" + 10;
Since symbols can contain many non-alphabetic characters, it is safest
to delimit symbols
with spaces. For example, ‘A-B’ is one symbol, whereas ‘A - B’ is an
expression involving
subtraction.
####################
so "0" is actually a legal symbol in the linker.
Since '.' is also a legal character, I will change the endian character
to '#'.
I.E:
DIGEST "CRC#BE" "CRC64-ECMA"
===============================================
> Oh - and here is a good one. If the start symbol is beyond the end
> symbol
> the code triggers a segmentation fault:
>
> % cat test.t
> [...]
> .text ALIGN (CONSTANT (COMMONPAGESIZE)) :
> {
> DIGEST "BE.BE.BE" POLY (64,a,b,c,d,e,f) (text_end , text_start)
> text_start = .;
> *(.text)
> text_end = .;
> }
> [...]
>
> % ld foo.o -T test.t
> Segmentation fault (core dumped)
OK, I verify the sanity of those variables, and that is a simple check
which I will add.
>
>
> I am not sure if the next one matters or not, but just in case: The
> checksum and crc table can be generated on unaligned boundaries if
> the code does not happen to be aligned. For example:
>
> .text ALIGN (CONSTANT (COMMONPAGESIZE)) :
> {
> BYTE(0);
> DIGEST "BE.BE.BE" POLY (64,a,b,c,d,e,f) (text_start , text_end)
> text_start = .;
> *(.text)
> text_end = .;
> }
Since there are architectures which supports byte aligned instructions
it is up to the user to specify alignment in the linker file.
( I was raised with the Series 32000 which supported instruction lengths
between 2..23 bytes )
If they need the alignment but fail in the linker file,
then this will result in a trap when testing.
================================================
> If the new directives are used outside of an output section then they
> do not generate very helpful error messages. For example:
>
> % cat test.t
> _start = 0x000000;
> SECTIONS
> {
> DEBUG ON
> /DISCARD/ : { *(*) }
> }
>
> % ld foo.o -T test.t
> ld:test.t:5: syntax error
>
> (Note the incorrect line number for the error). This is probably a
> limitation
> of the grammar supported by bison however, so I would not be
> surprised if there
> is no easy way to fix it.
That is part of the higher level parsing, so there is probably nothing I
can do about that.
The line number problem is also high level parsing, so I think that
should be reported as a bug.
I suspect you get the same problem if you replace the DEBUG with a QUAD
directive in the same place.
================================================
>
> On a related note - I think that there is a problem with the checksum
> generation. Specifically using the ".BE" suffix to indicate that a
> big-endian checksum should be computed. The problem is that some
> targets
> support both big-endian and little-endian output, selectable via a
> command line option, and so a static choice in a linker script is
> inappropriate. May I suggest an alternative ? Support both ".LE" and
> ".BE" suffixes, and if neither is specified then use whichever endian
> format has been selected by the command line.
>
I can have a look at that.
It will be small endian if nothing is specified in the label nor in the
command line.
> Cheers
> Nick
>
next prev parent reply other threads:[~2023-03-02 22:02 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-01 17:43 binutils
2023-03-01 17:43 ` [PATCH v8 01/11] DIGEST: LICENSING binutils
2023-03-01 17:43 ` [PATCH v8 02/11] DIGEST: NEWS binutils
2023-03-01 17:43 ` [PATCH v8 03/11] DIGEST: Documentation binutils
2023-03-01 17:43 ` [PATCH v8 04/11] DIGEST: testsuite binutils
2023-03-01 17:43 ` [PATCH v8 05/11] DIGEST: ldlex.l binutils
2023-03-01 17:43 ` [PATCH v8 06/11] DIGEST: ldgram.y binutils
2023-03-01 17:43 ` [PATCH v8 07/11] DIGEST: ldmain.c binutils
2023-03-01 17:43 ` [PATCH v8 08/11] DIGEST: ldlang.*: add timestamp binutils
2023-03-01 17:43 ` [PATCH v8 09/11] DIGEST: calculation binutils
2023-03-01 17:43 ` [PATCH v8 10/11] DIGEST: Makefile.* binutils
2023-03-01 17:43 ` [PATCH v8 11/11] Build ldint binutils
2023-03-02 16:43 ` [PATCH v8 0/11 Add support for CRC64 generation in linker Nick Clifton
2023-03-02 22:02 ` Ulf Samuelsson [this message]
2023-03-03 7:28 ` Ulf Samuelsson
2023-03-03 9:47 ` Nick Clifton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d125743e-21f2-e113-3b12-919774383a2f@emagii.com \
--to=binutils@emagii.com \
--cc=binutils@sourceware.org \
--cc=nickc@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).