From: Christophe Lyon <christophe.lyon@linaro.org>
To: Daniel Engel <libgcc@danielengel.com>
Cc: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>,
gcc Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v3] libgcc: Thumb-1 Floating-Point Library for Cortex M0
Date: Mon, 25 Jan 2021 18:48:54 +0100 [thread overview]
Message-ID: <CAKdteOYptCFvha9X_aOZzDptRkmdEyuiGtpC-an3+sRX+UbYig@mail.gmail.com> (raw)
In-Reply-To: <CAKdteObxRH5GbpfpWOU3DBL=HyFpL9Yc6xznWF4nHPfiaeqSGQ@mail.gmail.com>
On Fri, 22 Jan 2021 at 19:28, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
>
> On Thu, 21 Jan 2021 at 21:35, Daniel Engel <libgcc@danielengel.com> wrote:
> >
> > Hi Christophe,
> >
> > On Thu, Jan 21, 2021, at 2:29 AM, Christophe Lyon wrote:
> > > On Sat, 16 Jan 2021 at 17:13, Daniel Engel <libgcc@danielengel.com> wrote:
> > > >
> > > > Hi Christophe,
> > > >
> > > > On Fri, Jan 15, 2021, at 4:30 AM, Christophe Lyon wrote:
> > > > > On Fri, 15 Jan 2021 at 12:39, Daniel Engel <libgcc@danielengel.com> wrote:
> > > > > >
> > > > > > Hi Christophe,
> > > > > >
> > > > > > On Mon, Jan 11, 2021, at 8:39 AM, Christophe Lyon wrote:
> > > > > > > On Mon, 11 Jan 2021 at 17:18, Daniel Engel <libgcc@danielengel.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Jan 11, 2021, at 8:07 AM, Christophe Lyon wrote:
> > > > > > > > > On Sat, 9 Jan 2021 at 14:09, Christophe Lyon <christophe.lyon@linaro.org> wrote:
> > > > > > > > > >
> > > > > > > > > > On Sat, 9 Jan 2021 at 13:27, Daniel Engel <libgcc@danielengel.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jan 7, 2021, at 4:56 AM, Richard Earnshaw wrote:
> > > > > > > > > > > > On 07/01/2021 00:59, Daniel Engel wrote:
> > > > > > > > > > > > > --snip--
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Jan 6, 2021, at 9:05 AM, Richard Earnshaw wrote:
> > > > > > > > > > > > > --snip--
> > > > > > > > > > > > >
> > > > > > > > > > > > >> - finally, your popcount implementations have data in the code segment.
> > > > > > > > > > > > >> That's going to cause problems when we have compilation options such as
> > > > > > > > > > > > >> -mpure-code.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am just following the precedent of existing lib1funcs (e.g. __clz2si).
> > > > > > > > > > > > > If this matters, you'll need to point in the right direction for the
> > > > > > > > > > > > > fix. I'm not sure it does matter, since these functions are PIC anyway.
> > > > > > > > > > > >
> > > > > > > > > > > > That might be a bug in the clz implementations - Christophe: Any thoughts?
> > > > > > > > > > >
> > > > > > > > > > > __clzsi2() has test coverage in "gcc.c-torture/execute/builtin-bitops-1.c"
> > > > > > > > > > Thanks, I'll have a closer look at why I didn't see problems.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > So, that's because the code goes to the .text section (as opposed to
> > > > > > > > > .text.noread)
> > > > > > > > > and does not have the PURECODE flag. The compiler takes care of this
> > > > > > > > > when generating code with -mpure-code.
> > > > > > > > > And the simulator does not complain because it only checks loads from
> > > > > > > > > the segment with the PURECODE flag set.
> > > > > > > > >
> > > > > > > > This is far out of my depth, but can something like:
> > > > > > > >
> > > > > > > > ifeq (,$(findstring __symbian__,$(shell $(gcc_compile_bare) -dM -E - </dev/null)))
> > > > > > > >
> > > > > > > > be adapted to:
> > > > > > > >
> > > > > > > > a) detect the state of the -mpure-code switch, and
> > > > > > > > b) pass that flag to the preprocessor?
> > > > > > > >
> > > > > > > > If so, I can probably fix both the target section and the data usage.
> > > > > > > > Just have to add a few instructions to finish unrolling the loop.
> > > > > > >
> > > > > > > I must confess I never checked libgcc's Makefile deeply before,
> > > > > > > but it looks like you can probably detect whether -mpure-code is
> > > > > > > part of $CFLAGS.
> > > > > > >
> > > > > > > However, it might be better to write pure-code-safe code
> > > > > > > unconditionally because the toolchain will probably not
> > > > > > > be rebuilt with -mpure-code as discussed before.
> > > > > > > Or that could mean adding a -mpure-code multilib....
> > > > > >
> > > > > > I have learned a few things since the last update. I think I know how
> > > > > > to get -mpure-code out of CFLAGS and into a macro. However, I have hit
> > > > > > something of a wall with testing. I can't seem to compile any flavor of
> > > > > > libgcc with CFLAGS_FOR_TARGET="-mpure-code".
> > > > > >
> > > > > > 1. Configuring --with-multilib-list=rmprofile results in build failure:
> > > > > >
> > > > > > checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/libgcc':
> > > > > > configure: error: cannot compute suffix of object files: cannot compile
> > > > > > See `config.log' for more details
> > > > > >
> > > > > > cc1: error: -mpure-code only supports non-pic code on M-profile targets
> > > > > >
> > > > >
> > > > > Yes, I did hit that wall too :-)
> > > > >
> > > > > Hence what we discussed earlier: the toolchain is not rebuilt with -mpure-code.
> > > > >
> > > > > Note that there are problems in newlib too, but users of -mpure-code seem
> > > > > to be able to work around that (eg. using their own startup code and no stdlib)
> > > >
> > > > Is there a current side project to solve the makefile problems?
> > > None that I know of.
> > >
> > >
> > > > I think I'm back to my original question: If libgcc can't be built
> > > > with -mpure-code, and users bypass it completely with -nostdlib, then
> > > > why this conversation about pure-code compatibility of __clzsi2() etc?
> > > I think Richard noticed this pre-existing problem as part of the review
> > > of your patches. I don't think I meant fixing this is a prerequisite.
> > > But maybe I misunderstood :-)
> >
> > I might have misunderstood too then. It was certainly a pre-existing
> > problem, but I took the comments to mean that I had to own it as part of
> > touching those functions.
> >
> > > > > > 2. Attempting to filter the multib list results in configuration error.
> > > > > > This might have been misguided, but it was something I tried:
> > > > > >
> > > > > > Error: --with-multilib-list=armv6s-m not supported.
> > > > > >
> > > > > > Error: --with-multilib-list=mthumb/march=armv6s-m/mfloat-abi=soft not supported
> > > > >
> > > > > I think only 2 values are supported: aprofile and rmprofile.
> > > >
> > > > It looks like this might require a custom t-* multilib in gcc/config/arm.
> > > Or we could add -mpure-code to the rmprofile list.
> >
> > I have no strong opinions here. Are you proposing that the "m" versions
> > of libgcc be built with -mpure-code enabled by default, or are you
> > proposing a parallel set of multilibs? -mpure-code by default would
> > have costs in both size and speed.
> I'm not sure how large the penalty is for thumb-2 cores?
> Maybe it's acceptable to build thumb-2 with -mpure-code by default,
> but probably not for cortex-m0.
>
> > > > > > 3. Attempting to configure a single architecture results in a build error.
> > > > > >
> > > > > > --with-mode=thumb --with-arch=armv6s-m --with-float=soft
> > > > > >
> > > > > > checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/arm/autofp/v5te/fpu/libgcc':
> > > > > > configure: error: cannot compute suffix of object files: cannot compile
> > > > > > See `config.log' for more details
> > > > > >
> > > > > > conftest.c:9:10: fatal error: ac_nonexistent.h: No such file or directory
> > > > > > 9 | #include <ac_nonexistent.h>
> > > > > > | ^~~~~~~~~~~~~~~~~~
> > > > > I never saw that error message, but I never build using --with-arch.
> > > > > I do use --with-cpu though.
> > > > >
> > > > > > This has me wondering whether pure-code in libgcc is a real issue ...
> > > > > > If there's a way to build libgcc with -mpure-code, please enlighten me.
> > > > > I haven't done so yet. Maybe building the toolchain --with-cpu=cortex-m0
> > > > > works?
> > > >
> > > > No luck with that. Same error message as before:
> > > >
> > > > 4. --with-mode=thumb --with-arch=armv6s-m --with-float=soft --with-cpu=cortex-m0
> > > >
> > > > Switch "--with-arch" may not be used with switch "--with-cpu"
> > > >
> > > > 5. Then: --with-mode=thumb --with-float=soft --with-cpu=cortex-m0
> > > >
> > > > checking for suffix of object files... configure: error: in `/home/mirdan/gcc-obj/arm-none-eabi/arm/autofp/v5te/fpu/libgcc':
> > > > configure: error: cannot compute suffix of object files: cannot compile
> > > > See `config.log' for more details
> > > >
> > > > cc1: error: -mpure-code only supports non-pic code on M-profile targets
> > > Yes that's because default multilibs include targets incompatible with
> > > -mpure-code
> > >
> > > > 6. Finally! --with-float=soft --with-cpu=cortex-m0 --disable-multilib
> > > >
> > > > Once you know this, and read the docs sideways, the previous errors are
> > > > all probably "works as designed". But, I can still grumble.
> > > Yes, I think it's "as designed". I faced the "incompatible multilibs" issue
> > > too some time ago. Hence testing is not easy.
> > >
> > > > With libgcc compiled with -mpure-code, I can confirm that
> > > > 'builtin-bitops-1.c' (the test for __clzsi2) passed with libgcc as-is.
> > > >
> > > > I then added the SHF_ARM_PURECODE flag to the libgcc assembly functions
> > > > and re-ran the test. Still passed. I then added -mpure-code to
> > > > RUNTESTFLAGS and re-ran the test. Still passed. readelf confirmed that
> > > > the test program is compiling as expected [1]:
> > > >
> > > > [ 2] .text PROGBITS 0000800c 00800c 003314 00 AXy 0 0 4
> > > > Key to Flags:
> > > > W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
> > > > L (link order), O (extra OS processing required), G (group), T (TLS),
> > > > C (compressed), x (unknown), o (OS specific), E (exclude),
> > > > y (purecode), p (processor specific)
> > > >
> > > > It was only when I started inserting pure-code test directives into
> > > > 'builtin-bitops-1.c' that 'make check' began to report errors.
> > > >
> > > > /* { dg-do compile } */
> > > > ...
> > > > /* { dg-options "-mpure-code -mfp16-format=ieee" } */
> > > > /* { dg-final { scan-assembler-not "\\.(float|l\\?double|\d?byte|short|int|long|quad|word)\\s+\[^.\]" } } */
> > > >
> > > > However, for reasons [2] [3] [4] [5], this wasn't actually useful. It's
> > > > sufficient to say that there are many reasons that non-pure-code
> > > > compatible functions exist in libgcc.
> > >
> > > I filed a PR to better track this discussion:
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98779
> >
> > Possibly worth noting that my patch series addresses __clzsi2, __ctzsi2,
> > __aeabi_ldivmod, __aeabi_uldivmod, and most of __gnu_float2h_internal as
> > long as CFLAGS_FOR_TARGET contains -mpure-code.
> >
> Great.
>
> > >
> > > >
> > > > Although I'm not sure how useful this will be in light of the previous
> > > > findings, I did take the opportunity with a working compile process to
> > > > modify the relevant assembly functions for -mpure-code compatibility.
> > > > I can manually disassemble the library and verify correct compilation.
> > > > I can manually run a non-pure-code builtin-bitops-1 with a pure-code
> > > > library to verify correct execution. But, I don't think your standard
> > > > regression suite will be able to exercise the new paths.
> > > Thanks for doing that. With the SHF_ARM_PURECODE flag you
> > > added to clz, I think my simulator would catch problems.
> >
> > See my more detailed comments following. Unless you're also using a
> > custom linker script, I would have expected any simulator capable of
> > catching errors to have already caught them. Putting the
> > SHF_ARM_PURECODE flag on clz actually seems rather cosmetic.
> >
>
> Yes, I have a custom linker script with
> .text.noread :
> {
> INPUT_SECTION_FLAGS (SHF_ARM_PURECODE) *(.text*)
> } > purecode_memory
>
> > > > The patch is below; you can consider this as 34/33 in the series.
> > > >
> > > > Regards,
> > > > Daniel
> > > >
> > > > [1] It's pretty clear that the section flags in libgcc have never really
> > > > mattered. When the linker strings all of the used objects together,
> > > > the original sections disappear into a single output object. The
> > > > compiler controls those flags regardless of what libgcc does.)
> > > Not sure what you mean? The linker creates two segments, one
> > > with and one without the SHF_ARM_PURECODE flag.
> >
> > When libgcc is compiled "normally", individual objects in libgcc.a are
> > compiled _without_ SHF_ARM_PURECODE. Note line 4 below, with flags
> > "AX" only (no "y").
> >
> > `readelf -S arm-none-eabi/thumb/v6-m/nofp/libgcc/libgcc.a`
> >
> > File: arm-none-eabi/thumb/v6-m/nofp/libgcc/libgcc.a(_clzsi2.o)
> > There are 19 section headers, starting at offset 0x43c:
> >
> > Section Headers:
> > [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
> > [ 0] NULL 00000000 000000 000000 00 0 0 0
> > [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 2
> > [ 2] .data PROGBITS 00000000 000034 000000 00 WA 0 0 1
> > [ 3] .bss NOBITS 00000000 000034 000000 00 WA 0 0 1
> > ==> [ 4] .text.sorted[...] PROGBITS 00000000 000034 000034 00 AX 0 0 4
> > ...
> > Key to Flags:
> > W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
> > L (link order), O (extra OS processing required), G (group), T (TLS),
> > C (compressed), x (unknown), o (OS specific), E (exclude),
> > y (purecode), p (processor specific)
> >
> > When this "normal" libgcc is linked into an -mpure-code program
> > (e.g. with RUNTESTFLAGS), the linker script flattens all of the
> > sections together into a single output. The relevant portion of the
> > linker script:
> >
> > .text :
> > {
> > *(.text.unlikely .text.*_unlikely .text.unlikely.*)
> > *(.text.exit .text.exit.*)
> > *(.text.startup .text.startup.*)
> > *(.text.hot .text.hot.*)
> > *(SORT(.text.sorted.*)) // _clzsi2.o matches here
> > *(.text .stub .text.* .gnu.linkonce.t.*) // main.o matches here
> > /* .gnu.warning sections are handled specially by elf.em. */
> > *(.gnu.warning)
> > *(.glue_7t) *(.glue_7) *(.vfp11_veneer) *(.v4_bx)
> > }
> >
> > I can't pretend to know how the linker merges conflicting flags from the
> > various input sections, but the final binary has the attributes
> > "AXy" as expected from the top level compile (line 2):
> >
> > `readelf -Sl builtin-bitops-1.exe`
> >
> > There are 22 section headers, starting at offset 0x10934:
> >
> > Section Headers:
> > [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
> > [ 0] NULL 00000000 000000 000000 00 0 0 0
> > [ 1] .init PROGBITS 00008000 008000 00000c 00 AX 0 0 4
> > ==> [ 2] .text PROGBITS 0000800c 00800c 00455c 00 AXy 0 0 4
> > [ 3] .fini PROGBITS 0000c568 00c568 00000c 00 AX 0 0 4
> > [ 4] .rodata PROGBITS 0000c574 00c574 000050 00 A 0 0 4
> > [ 5] .ARM.exidx ARM_EXIDX 0000c5c4 00c5c4 000008 00 AL 2 0 4
> > [ 6] .eh_frame PROGBITS 0000c5cc 00c5cc 000124 00 A 0 0 4
> > [ 7] .init_array INIT_ARRAY 0001c6f0 00c6f0 000004 04 WA 0 0 4
> > [ 8] .fini_array FINI_ARRAY 0001c6f4 00c6f4 000004 04 WA 0 0 4
> > [ 9] .data PROGBITS 0001c6f8 00c6f8 000a30 00 WA 0 0 8
> > [10] .bss NOBITS 0001d128 00d128 000114 00 WA 0 0 4
> > ...
> > Key to Flags:
> > W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
> > L (link order), O (extra OS processing required), G (group), T (TLS),
> > C (compressed), x (unknown), o (OS specific), E (exclude),
> > y (purecode), p (processor specific)
> >
> > There are 3 program headers, starting at offset 52
> >
> > Program Headers:
> > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> > EXIDX 0x00c5c4 0x0000c5c4 0x0000c5c4 0x00008 0x00008 R 0x4
> > LOAD 0x000000 0x00000000 0x00000000 0x0c6f0 0x0c6f0 R E 0x10000
> > LOAD 0x00c6f0 0x0001c6f0 0x0001c6f0 0x00a38 0x00b4c RW 0x10000
> >
> > Section to Segment mapping:
> > Segment Sections...
> > 00 .ARM.exidx
> > ==> 01 .init .text .fini .rodata .ARM.exidx .eh_frame
> > 02 .init_array .fini_array .data .bss
> >
> > A segment contains complete sections, not the other way around. As you
> > can see above, the first LOAD segment contains the entire ".text" plus
> > some other sections. Thus, SHF_ARM_PURECODE flag really appears
> > to apply to all of "text", even though the bits linked in from libgcc
> > weren't built or flagged this way.
> >
> > >
> > > > [2] The existing pure-code tests are compile-only and cover just the
> > > > disassembled 'main.o'. There is no test of a complete executable
> > > > and there is no execution/simulation.
> > > That's something I did manually: run the full gcc testsuite, forcing -mpure-code
> > > in RUNTESTFLAGS. This way, all execution tests are compiled with -mpure-code,
> > > and this is how I found several bugs.
> > >
> > > > [3] While other parts of binutils may understand SHF_ARM_PURECODE, I
> > > > don't think the simulator checks section flags or throws exceptions.
> > > Indeed, I know of no public simulator that honors this flag. I do have
> > > one though.
> > >
> > > > [4] builtin-bitops-1 modified this way will always fail due to the array
> > > > data definitions (longs, longlongs, etc). GCC can't translate those
> > > > to instructions. While the ".data" section would presumably be
> > > > readable, scan-assembler-not doesn't know the difference.
> > > Sure, adding such scan-assembler-not is not suitable for any existing testcase.
> > > That's why it is only in place for testcases dedicated to -mpure-code.
> > >
> > > > [5] Even if the simulator were modified to throw exceptions, this will
> > > > continue to fail because _mainCRTStartup uses a literal pool.
> > > Yes, in general, and that's why I mentioned problems with newlib
> > > earlier in this thread.
> > >
> > > However, the simulator I use only throws an exception for code executed with
> > > SHF_ARM_PURECODE. Code in the "regular" code segment is not checked.
> > > So this does not catch errors in hand-written assembly code using regular .text,
> > > but it enables to run larger validations (such as the whole GCC testsuite)
> > > without having to fix all of newlib.
> > > Not perfect, as it left the issues in libgcc we are discussing, but it
> > > helped me fix
> > > several bugs in -mpure-code.
> >
> > Yet again I suspect that you have a custom linker script, or there's
> > some other major difference. Using the public releases of binutils,
> > newlib, etc, my experiences just aren't lining up with yours.
>
> Yep, the linker script makes the difference.
>
> >
> > > Thanks,
> > >
> > > Christophe
> > >
> > > >
> > > > > Thanks,
> > > > >
> > > > > Christophe
> > > > >
> > > > > --snip--
> >
> > If the test server farm is free at some point, would you mind running
> > another set of regression tests on my v5 patch series?
>
> Sure. Given the number of sub-patches, can you send it to me as a
> single patch file
> (git format) that I can directly apply to GCC trunk?
> My mailer does not want to help with saving each patch as a proper
> patch file :-(
>
The validation results came back clean (no regression found).
Thanks
Christophe
> Thanks
>
> Christophe
>
> >
> > Regards,
> > Daniel
next prev parent reply other threads:[~2021-01-25 17:49 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-12 23:04 [PATCH] " Daniel Engel
2020-11-26 9:14 ` Christophe Lyon
2020-12-02 3:32 ` Daniel Engel
2020-12-16 17:15 ` Christophe Lyon
2021-01-06 11:20 ` [PATCH v3] " Daniel Engel
2021-01-06 17:05 ` Richard Earnshaw
2021-01-07 0:59 ` Daniel Engel
2021-01-07 12:56 ` Richard Earnshaw
2021-01-07 13:27 ` Christophe Lyon
2021-01-07 16:44 ` Richard Earnshaw
2021-01-09 12:28 ` Daniel Engel
2021-01-09 13:09 ` Christophe Lyon
2021-01-09 18:04 ` Daniel Engel
2021-01-11 14:49 ` Richard Earnshaw
2021-01-09 18:48 ` Daniel Engel
2021-01-11 16:07 ` Christophe Lyon
2021-01-11 16:18 ` Daniel Engel
2021-01-11 16:39 ` Christophe Lyon
2021-01-15 11:40 ` Daniel Engel
2021-01-15 12:30 ` Christophe Lyon
2021-01-16 16:14 ` Daniel Engel
2021-01-21 10:29 ` Christophe Lyon
2021-01-21 20:35 ` Daniel Engel
2021-01-22 18:28 ` Christophe Lyon
2021-01-25 17:48 ` Christophe Lyon [this message]
2021-01-25 23:36 ` Daniel Engel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAKdteOYptCFvha9X_aOZzDptRkmdEyuiGtpC-an3+sRX+UbYig@mail.gmail.com \
--to=christophe.lyon@linaro.org \
--cc=Richard.Earnshaw@foss.arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=libgcc@danielengel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).