In GCC 10.2, -O2 optimization enables more than docs suggest

public inbox for gcc-help@gcc.gnu.org

* In GCC 10.2, -O2 optimization enables more than docs suggest
@ 2021-01-16 20:48 Brent Roman
  2021-01-20 15:56 ` Richard Earnshaw
  0 siblings, 1 reply; 10+ messages in thread
From: Brent Roman @ 2021-01-16 20:48 UTC
  To: gcc-help

My very old, highly modified Matz Ruby 1.8.7 interpreter stopped working 
when Debian switched from GCC-9 to GCC-10.

The Ruby interpreter binary output from GCC-10 segfaults.

However, if I reduce optimization from -O2 to -O1, the resulting 
binaries work fine.

I'd like to find which specific -O2 optimization is causing the failure 
when run through gcc-10.

But, when I specified -O1 followed by options explicitly enabling all 
the specific optimizations that are supposed to be enabled by -O2,
the resulting binary works fine.  Conversely, when I specify -O2, 
followed by explicit options to *disable* all those options, the 
resulting binary fails.

Does anyone know what -O2 enables aside from the options documented on:

https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Optimize-Options.html#Optimize-Options

Here's an example of a gcc invocation with -O2 followed by disabling all 
the -O2 specific optimizations:

gcc -O2 -g -Wclobbered -fno-stack-protector -fno-align-functions 
-fno-align-jumps  -fno-align-labels  -fno-align-loops -fno-caller-saves  
-fno-code-hoisting  -fno-crossjumping -fno-cse-follow-jumps  
-fno-cse-skip-blocks -fno-delete-null-pointer-checks  -fno-devirtualize 
-fno-devirtualize-speculatively  -fno-expensive-optimizations 
-fno-finite-loops  -fno-gcse  -fno-gcse-lm -fno-hoist-adjacent-loads  
-fno-inline-functions -fno-inline-small-functions  
-fno-indirect-inlining -fno-ipa-bit-cp  -fno-ipa-cp  -fno-ipa-icf  
-fno-ipa-ra -fno-ipa-sra  -fno-ipa-vrp 
-fno-isolate-erroneous-paths-dereference  -fno-lra-remat 
-fno-optimize-sibling-calls  -fno-optimize-strlen -fno-partial-inlining  
-fno-peephole2 -fno-reorder-blocks-and-partition  -fno-reorder-functions 
-fno-rerun-cse-after-loop   -fno-schedule-insns -fno-schedule-insns2  
-fno-sched-interblock  -fno-sched-spec -fno-store-merging  
-fno-strict-aliasing  -fno-thread-jumps -fno-tree-builtin-call-dce  
-fno-tree-pre -fno-tree-switch-conversion  -fno-tree-tail-merge 
-fno-tree-vrp      -DRUBY_EXPORT -D_GNU_SOURCE=1  -I. -I.    -c main.c


Thanks!


-- 
  Brent Roman                                   MBARI
  Software Engineer               Tel: (831) 775-1808
  mailto:brent@mbari.org  http://www.mbari.org/~brent


* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-16 20:48 In GCC 10.2, -O2 optimization enables more than docs suggest Brent Roman
@ 2021-01-20 15:56 ` Richard Earnshaw
  2021-01-20 16:53   ` mark_at_yahoo
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Earnshaw @ 2021-01-20 15:56 UTC
  To: Brent Roman, gcc-help

On 16/01/2021 20:48, Brent Roman wrote:
> My very old, highly modified Matz Ruby 1.87 interpreter stopped working
> when Debian switched from GCC-9 to GCC-10
> 
> The Ruby interpreter binary output from GCC-10 segfaults.
> 
> However, if I reduce optimization from -O2 to -O1, the resulting
> binaries work fine.
> 
> I'd like to find which specific -O2 optimization is causing the failure
> when run through gcc-10.
> 
> But, when I specified -O1 followed by options explicitly enabling all
> the specific optimizations that are supposed to be enabled by -O2,
> the resulting binary works fine.  Conversely, when I specify -O2,
> followed by explicit options to *disable* all those options, the
> resulting binary fails.
> 
> Does anyone know what -O2 enables aside from the options documented on:
> 
> https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Optimize-Options.html#Optimize-Options
> 
> 
> Here's an example of a gcc invocation with -O2 followed by disabling all
> the -O2 specific optimizations:
> 
> gcc -O2 -g -Wclobbered -fno-stack-protector -fno-align-functions
> -fno-align-jumps  -fno-align-labels  -fno-align-loops -fno-caller-saves 
> -fno-code-hoisting  -fno-crossjumping -fno-cse-follow-jumps 
> -fno-cse-skip-blocks -fno-delete-null-pointer-checks  -fno-devirtualize
> -fno-devirtualize-speculatively  -fno-expensive-optimizations
> -fno-finite-loops  -fno-gcse  -fno-gcse-lm -fno-hoist-adjacent-loads 
> -fno-inline-functions -fno-inline-small-functions 
> -fno-indirect-inlining -fno-ipa-bit-cp  -fno-ipa-cp  -fno-ipa-icf 
> -fno-ipa-ra -fno-ipa-sra  -fno-ipa-vrp
> -fno-isolate-erroneous-paths-dereference  -fno-lra-remat
> -fno-optimize-sibling-calls  -fno-optimize-strlen -fno-partial-inlining 
> -fno-peephole2 -fno-reorder-blocks-and-partition  -fno-reorder-functions
> -fno-rerun-cse-after-loop   -fno-schedule-insns -fno-schedule-insns2 
> -fno-sched-interblock  -fno-sched-spec -fno-store-merging 
> -fno-strict-aliasing  -fno-thread-jumps -fno-tree-builtin-call-dce 
> -fno-tree-pre -fno-tree-switch-conversion  -fno-tree-tail-merge
> -fno-tree-vrp      -DRUBY_EXPORT -D_GNU_SOURCE=1  -I. -I.    -c main.c
> 
> 
> Thanks!
> 
> 

Sorry, it's not as simple as that.  There are places in the compiler
where the optimization level (O1, O2, O3) is just tested with something like

  if (optimize >= level)

for some level.
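
A toy model may make this concrete. The sketch below is not GCC source: the names opt_state, apply_level, flag_gated_runs and level_gated_runs are hypothetical and invented here for illustration. It assumes one transformation that is controlled by a -f style flag and another that is keyed directly on the numeric level, as in the test quoted above.

```c
#include <stdbool.h>

/* Hypothetical model of a compiler's option state: one global level
   plus one individually switchable flag. Not GCC's real data structures. */
struct opt_state {
    int  optimize;        /* 0, 1, 2 or 3, as set by -O<n> */
    bool flag_some_pass;  /* stands in for a single -f<pass> option */
};

/* Apply -O<n>: set the level and the flag's default for that level.
   In this model the flag defaults to "on" from level 2 upwards. */
static struct opt_state apply_level(int level)
{
    struct opt_state s = { level, level >= 2 };
    return s;
}

/* A transformation gated by its flag: the user can switch it on or off. */
static bool flag_gated_runs(const struct opt_state *s)
{
    return s->flag_some_pass;
}

/* A transformation gated directly by the level: no flag reaches it. */
static bool level_gated_runs(const struct opt_state *s)
{
    return s->optimize >= 2;
}
```

In this model, "level 1 plus the flag" runs only the flag-gated transformation, while "level 2 minus the flag" still runs the level-gated one. That is consistent with the behaviour reported at the start of the thread, although the model says nothing about which part of GCC is actually responsible.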

R.

* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-20 15:56 ` Richard Earnshaw
@ 2021-01-20 16:53   ` mark_at_yahoo
  2021-01-20 18:50     ` Jonathan Wakely
  2021-01-20 21:17     ` David Brown
  0 siblings, 2 replies; 10+ messages in thread
From: mark_at_yahoo @ 2021-01-20 16:53 UTC
  To: gcc-help

On 1/20/21 7:56 AM, Richard Earnshaw via Gcc-help wrote:
> On 16/01/2021 20:48, Brent Roman wrote:
>> Here's an example of a gcc invocation with -O2 followed by disabling all
>> the -O2 specific optimizations:
>> ...
> 
> Sorry, it's not as simple as that.  There are places in the compiler
> where the optimization level (O1, O2, O3) is just tested with something like
> 
>    if (optimize >= level)
> 
> for some level.
> 
> R.
> 

Just chiming in with an opinion here. I've had the same problem and came 
to the same conclusion ("-f" options do not fully replace/override "-O") 
although I didn't know the compiler source was that explicit about it 
(thanks for the info).

I realize this is very unlikely to change but find the situation 
unfortunate. My use-case is with the GNU Arm Embedded Toolchain port of 
GCC and my https://github.com/thanks4opensource/regbits development 
system. The latter creates C++ header files with literally thousands of 
constexpr objects of which only a handful are used in a typical program. 
If compiled O1 or above, the linker only allocates storage for the 
objects that are used. At O0 it allocates all of them which makes the 
resulting binary far too large to fit in a typical embedded processor's 
memory space. But O0 is very useful for assembly-level debugging in GDB 
(often required in embedded development) because the generated code is 
much simpler and easier to correlate with the original C++ source.

I've only had limited success coming up with a set of -f options to add 
to O0 to eliminate the unused objects but retain the un-optimized binary 
code. The above explains why, but it would be nice if the -O options 
really were just a set of -f ones and users could customize to their 
needs. Without implementing my specific "-O0.5" option. ;)

* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-20 16:53   ` mark_at_yahoo
@ 2021-01-20 18:50     ` Jonathan Wakely
  2021-01-20 19:06       ` mark_at_yahoo
  2021-01-20 21:17     ` David Brown
  1 sibling, 1 reply; 10+ messages in thread
From: Jonathan Wakely @ 2021-01-20 18:50 UTC
  To: mark_at_yahoo; +Cc: gcc-help

On Wed, 20 Jan 2021, 16:54 mark_at_yahoo via Gcc-help, <gcc-help@gcc.gnu.org>
wrote:

> On 1/20/21 7:56 AM, Richard Earnshaw via Gcc-help wrote:
> > On 16/01/2021 20:48, Brent Roman wrote:
> >> Here's an example of a gcc invocation with -O2 followed by disabling all
> >> the -O2 specific optimizations:
> >> ...
> >
> > Sorry, it's not as simple as that.  There are places in the compiler
> > where the optimization level (O1, O2, O3) is just tested with something like
> >
> >    if (optimize >= level)
> >
> > for some level.
> >
> > R.
> >
>
> Just chiming in with an opinion here. I've had the same problem and came
> to the same conclusion ("-f" options do not fully replace/override "-O")
> although I didn't know the compiler source was that explicit about it
> (thanks for the info).
>
> I realize this is very unlikely to change but find the situation
> unfortunate. My use-case is with the GNU Arm Embedded Toolchain port of
> GCC and my https://github.com/thanks4opensource/regbits development
> system. The latter creates C++ header files with literally thousands of
> constexpr objects of which only a handful are used in a typical program.
> If compiled O1 or above, the linker only allocates storage for the
> objects that are used. At O0 it allocates all of them which makes the
> resulting binary far too large to fit in a typical embedded processor's
> memory space. But O0 is very useful for assembly-level debugging in GDB
> (often required in embedded development) because the generated code is
> much simpler and easier to correlate with the original C++ source.
>
> I've only had limited success coming up with a set of -f options to add
> to O0 to eliminate the unused objects but retain the un-optimized binary
> code.


That's a different situation though. With -O0 **NO** optimization is done,
at all. Any -f options for optimization passes are ignored entirely.

The difference between -O1 and -O2 is the set of -f flags that differ, and
some specific checks for >= -O2. But the difference between -O0 and -O1 is
the difference between zero and non-zero. Completely off, or on.
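
The on/off distinction is visible from source code as well. GCC's preprocessor documentation states that __OPTIMIZE__ is defined in all optimizing compilations. The sketch below relies on that; the function names built_with_optimization and build_mode are hypothetical. It only reports whether the optimizer was enabled at all and says nothing about which passes ran.

```c
/* __OPTIMIZE__ is predefined by GCC in every optimizing compilation
   and left undefined at -O0. */
static int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: a short description suitable for a log line. */
static const char *build_mode(void)
{
    return built_with_optimization()
        ? "optimizing build (any level other than -O0)"
        : "non-optimizing build (-O0)";
}
```

Printing build_mode() once at start-up is a cheap way to confirm which kind of binary is actually being debugged.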

> The above explains why, but it would be nice if the -O options
> really were just a set of -f ones and users could customize to their
> needs. Without implementing my specific "-O0.5" option. ;)
>

* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-20 18:50     ` Jonathan Wakely
@ 2021-01-20 19:06       ` mark_at_yahoo
  2021-01-20 19:44         ` Jonathan Wakely
  0 siblings, 1 reply; 10+ messages in thread
From: mark_at_yahoo @ 2021-01-20 19:06 UTC
  To: Jonathan Wakely; +Cc: gcc-help

On 1/20/21 10:50 AM, Jonathan Wakely wrote:

> That's a different situation though. With -O0 **NO** optimization is done,
> at all. Any -f options for optimization passes are ignored entirely.
> 
> The difference between -O1 and -O2 is the set of -f flags that differ, and
> some specific checks for >= -O2. But the difference between -O0 and -O1 is
> the difference between zero and non-zero. Completely off, or on.
> 

That's interesting, Jonathan. You know far better than I do, and I 
haven't tried it in over a year (on an older version of GCC-ARM), but I 
swore the following did do something:

     OPTIMIZE_FLAG := -O0 \
		     -fbranch-count-reg \
		     -fcombine-stack-adjustments \
		     -fcompare-elim \
		     -fcprop-registers \
		     -fdefer-pop \
		     -fforward-propagate \
		     -fguess-branch-probability \
		     -fif-conversion \
		     -fif-conversion2 \
		     -finline \
		     -finline-functions-called-once \
		     -fipa-profile \
		     -fipa-pure-const \
		     -fipa-reference \
		     -fmerge-constants \
		     -fmove-loop-invariants \
		     -fomit-frame-pointer \
		     -freorder-blocks \
		     -fsched-pressure \
		     -fsection-anchors \
		     -fshrink-wrap \
		     -fsplit-wide-types \
		     -fssa-phiopt \
		     -ftoplevel-reorder \
		     -ftree-bit-ccp \
		     -ftree-builtin-call-dce \
		     -ftree-ccp \
		     -ftree-ch \
		     -ftree-coalesce-vars \
		     -ftree-copy-prop \
		     -ftree-dce \
		     -ftree-dominator-opts \
		     -ftree-dse \
		     -ftree-fre \
		     -ftree-pta \
		     -ftree-sink \
		     -ftree-slsr \
		     -ftree-sra \
		     -ftree-ter \
		     -fvar-tracking \
		     -fvar-tracking-assignments

Maybe because you said "optimization passes"? The ones above (or some of 
them) are for other passes?

-- 
MARK
markrubn@yahoo.com

* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-20 19:06       ` mark_at_yahoo
@ 2021-01-20 19:44         ` Jonathan Wakely
  2021-01-20 20:45           ` mark_at_yahoo
  0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wakely @ 2021-01-20 19:44 UTC
  To: mark_at_yahoo; +Cc: gcc-help

On Wed, 20 Jan 2021 at 19:06, mark_at_yahoo <markrubn@yahoo.com> wrote:
>
> On 1/20/21 10:50 AM, Jonathan Wakely wrote:
>
> > That's a different situation though. With -O0 **NO** optimization is done,
> > at all. Any -f options for optimization passes are ignored entirely.
> >
> > The difference between -O1 and -O2 is the set of -f flags that differ, and
> > some specific checks for >= -O2. But the difference between -O0 and -O1 is
> > the difference between zero and non-zero. Completely off, or on.
> >
>
> That's interesting, Jonathan. You know far better than I do, and I
> haven't tried it in over a year (on an older version of GCC-ARM), but I
> swore the following did do something:

It might have done something, but it didn't optimize anything (and so
most of those options were ignored).

https://gcc.gnu.org/wiki/FAQ#optimization-options

* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-20 19:44         ` Jonathan Wakely
@ 2021-01-20 20:45           ` mark_at_yahoo
  0 siblings, 0 replies; 10+ messages in thread
From: mark_at_yahoo @ 2021-01-20 20:45 UTC
  To: Jonathan Wakely; +Cc: gcc-help

On 1/20/21 11:44 AM, Jonathan Wakely wrote:

> It might have done something, but it didn't optimize anything (and so
> most of those options were ignored).

Apologies for beating a dead horse, and I'll bow out after this, but 
when I said "did something" I meant that it solved my problem of wanting 
an un-optimized binary as per O0 but without instantiation and memory 
allocation of unused constexpr objects as per O1 and above. Whether 
that's an optimization or not seems a matter of semantics.

> https://gcc.gnu.org/wiki/FAQ#optimization-options

Yes, --help=optimizers with -O1 and adding various permutations of the 
ones it output is what I did. You've strongly implied that this 
shouldn't work (assuming the constexpr elimination is done in an 
optimization pass) so at some point I'll revisit the issue with a 
current GCC version and report if I find something interesting.

There were some issues with my "O0 plus -f options" so I eventually 
learned to "bite the bullet" and debug optimized/obfuscated (and that's 
a good thing) O1 assembly output and didn't work on it further.

-- 
MARK
markrubn@yahoo.com

* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-20 16:53   ` mark_at_yahoo
  2021-01-20 18:50     ` Jonathan Wakely
@ 2021-01-20 21:17     ` David Brown
  2021-01-21  3:53       ` mark_at_yahoo
  1 sibling, 1 reply; 10+ messages in thread
From: David Brown @ 2021-01-20 21:17 UTC
  To: mark_at_yahoo, gcc-help

On 20/01/2021 17:53, mark_at_yahoo via Gcc-help wrote:
> On 1/20/21 7:56 AM, Richard Earnshaw via Gcc-help wrote:
>> On 16/01/2021 20:48, Brent Roman wrote:
>>> Here's an example of a gcc invocation with -O2 followed by disabling all
>>> the -O2 specific optimizations:
>>> ...
>>
>> Sorry, it's not as simple as that.  There are places in the compiler
>> where the optimization level (O1, O2, O3) is just tested with
>> something like
>>
>>    if (optimize >= level)
>>
>> for some level.
>>
>> R.
>>
> 
> Just chiming in with an opinion here. I've had the same problem and came
> to the same conclusion ("-f" options do not fully replace/override "-O")
> although I didn't know the compiler source was that explicit about it
> (thanks for the info).
> 
> I realize this is very unlikely to change but find the situation
> unfortunate. My use-case is with the GNU Arm Embedded Toolchain port of
> GCC and my https://github.com/thanks4opensource/regbits development
> system. The latter creates C++ header files with literally thousands of
> constexpr objects of which only a handful are used in a typical program.
> If compiled O1 or above, the linker only allocates storage for the
> objects that are used. At O0 it allocates all of them which makes the
> resulting binary far too large to fit in a typical embedded processor's
> memory space. But O0 is very useful for assembly-level debugging in GDB
> (often required in embedded development) because the generated code is
> much simpler and easier to correlate with the original C++ source.
> 

I also work on embedded systems (usually with the ARM gcc toolchain
these days, but at times I use many others).

I /never/ use -O0, precisely because I find it absolutely terrible for
assembly level debugging.  You can't see the wood for the trees, as all
local variables are on the stack, and even the simplest of C expressions
ends up with large numbers of assembly instructions.  In my experience -
and this is obviously very subjective - using -O1 gives far more
readable assembly code while avoiding the kinds of code re-arrangement
and re-ordering of -O2 that makes assembly-level debugging difficult.
(-Og is an alternative for modern gcc versions, which can give most of
the speed of -O2 but is a little easier for debugging).

Another major benefit of -O1 is that it enables much more code analysis,
which in turn enables much better static checking - I am a big fan of
warning flags and having the compiler tell me of likely problems before
I get as far as testing and debugging.

(Your project here looks very interesting - I'm going to have a good
look at it when I get the chance.  I won't be able to use it directly,
as a pure GPL license basically makes it unusable for anything but
learning or hobby use, but as it matches ideas I have had myself I am
interested in how it works.)

> I've only had limited success coming up with a set of -f options to add
> to O0 to eliminate the unused objects but retain the un-optimized binary
> code. The above explains why, but it would be nice if the -O options
> really were just a set of -f ones and users could customize to their
> needs. Without implementing my specific "-O0.5" option. ;)
> 

* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-20 21:17     ` David Brown
@ 2021-01-21  3:53       ` mark_at_yahoo
  2021-01-21  9:02         ` David Brown
  0 siblings, 1 reply; 10+ messages in thread
From: mark_at_yahoo @ 2021-01-21  3:53 UTC
  To: David Brown, gcc-help

On 1/20/21 1:17 PM, David Brown wrote:

> I /never/ use -O0, precisely because I find it absolutely terrible for
> assembly level debugging.  You can't see the wood for the trees, as all
> local variables are on the stack, and even the simplest of C expressions
> ends up with large numbers of assembly instructions.  In my experience -
> and this is obviously very subjective - using -O1 gives far more
> readable assembly code while avoiding the kinds of code re-arrangement
> and re-ordering of -O2 that makes assembly-level debugging difficult.
> (-Og is an alternative for modern gcc versions, which can give most of
> the speed of -O2 but is a little easier for debugging).

Interesting. My recollection is that -O0, regardless of variables being 
on the stack, was more "linear": Each C statement was followed 
more-or-less by the assembly code required to implement it, then the 
next C statement and so on. In particular, variables in -O1 can get 
tucked away into registers and "disappear" for long stretches of 
assembly before popping up again, and there is the spaghetti-code 
jumping that comes from common code block elimination. All of which is 
of course good optimization, but it makes things hard to follow.

But I'll have to revisit the issue again.

> Another major benefit of -O1 is that it enables much more code analysis,
> which in turn enables much better static checking - I am a big fan of
> warning flags and having the compiler tell me of likely problems before
> I get as far as testing and debugging.

Me, too ("big fan").

> (Your project here looks very interesting - I'm going to have a good
> look at it when I get the chance.  I won't be able to use it directly,
> as a pure GPL license basically makes it unusable for anything but
> learning or hobby use, but as it matches ideas I have had myself I am
> interested in how it works.)

Thanks. Yes, it's basically a simple idea, and I found out recently that 
others have attempted something similar (which I wish I'd known when I 
started doing it myself). This is now very off-topic for this list, but 
I'd like to get your input, including the GPL vs LGPL issue (ironic 
given that this is a GNU mailing list). Maybe open an issue at the 
Github repository and we can discuss it there?

* Re: In GCC 10.2, -O2 optimization enables more than docs suggest
  2021-01-21  3:53       ` mark_at_yahoo
@ 2021-01-21  9:02         ` David Brown
  0 siblings, 0 replies; 10+ messages in thread
From: David Brown @ 2021-01-21  9:02 UTC
  To: mark_at_yahoo, gcc-help

On 21/01/2021 04:53, mark_at_yahoo wrote:
> On 1/20/21 1:17 PM, David Brown wrote:
> 
>> I /never/ use -O0, precisely because I find it absolutely terrible for
>> assembly level debugging.  You can't see the wood for the trees, as all
>> local variables are on the stack, and even the simplest of C expressions
>> ends up with large numbers of assembly instructions.  In my experience -
>> and this is obviously very subjective - using -O1 gives far more
>> readable assembly code while avoiding the kinds of code re-arrangement
>> and re-ordering of -O2 that makes assembly-level debugging difficult.
>> (-Og is an alternative for modern gcc versions, which can give most of
>> the speed of -O2 but is a little easier for debugging).
> 
> Interesting. My recollection is that -O0, regardless of variables being 
> on the stack, was more "linear": Each C statement was followed 
> more-or-less by the assembly code required to implement it, then the 
> next C statement and so on.

Yes, that is true.  But I find (and again this is my own experience, 
which may not match others - it may also depend on the way you write 
your code) that the amount of stack manipulation code makes it 
impossible to see what is really happening.  For each "arithmetic" 
assembly line doing an add, you've got two or more loads and stores to 
the stack.  And when the "real" assembly is a load or a store, you can't 
see it amongst all the other stack loads and stores.

It is even worse if you use C++.  If you've got a few template classes 
and overloaded operators, -O0 might lead you down a path of a dozen 
nested function calls, each with their stack frames, when with -O1 you 
have just the one assembly instruction that is actually relevant.

With -O1, the assembly for an "add one" function:

	int inc(int x) { return x + 1; }

is:

inc(int):
         adds    r0, r0, #1
         bx      lr

That's simple, easy to understand, easy to follow, easy to step through 
at the assembly level.

With -O0, you get:

inc(int):
         push    {r7}
         sub     sp, sp, #12
         add     r7, sp, #0
         str     r0, [r7, #4]
         ldr     r3, [r7, #4]
         adds    r3, r3, #1
         mov     r0, r3
         adds    r7, r7, #12
         mov     sp, r7
         ldr     r7, [sp], #4
         bx      lr

I can't answer for anyone else, but I know which version /I/ would 
rather try to follow.

> In particular, variables in -O1 can get 
> tucked away into registers and "disappear" for long stretches of 
> assembly before popping up again, and the spaghetti-code jumping for 
> common code block elimination. Which is of course all good optimization, 
> but makes things hard to follow.

With -O1, you get very little "spaghetti-jumping" - generated code 
follows linear paths that match the source to a large extent.  -O2 is a 
different matter - that's when a lot more of the code re-ordering and 
re-arranging comes in.  Yes, -O1 puts data in registers - IMHO that is a 
/good/ thing for debugging because it makes the code simpler and 
clearer.  And yes, it makes some code and variables "disappear".  That 
can be a good thing (clearing away unnecessary detail), or a bad thing.

Sometimes I'll temporarily add a "volatile" qualifier to a variable, or 
a "noinline" attribute to a function in order to make debugging easier. 
That's part of the process.  I actually do almost all of my 
compilation with -O2 as the starting point (and various fine-tuning 
flags) - and I do my debugging with that build.  I strongly dislike the 
idea of different debug/release builds, and do not make a 
differentiation.  If I need a lower optimisation to help trace an 
awkward problem, I'll add a '#pragma GCC optimize ("O1")' line to the 
relevant code.
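
A minimal sketch of those three debugging aids in GCC's documented spelling follows. The function names scale and suspect are hypothetical and stand in for whatever code is under investigation. Note that the GCC manual describes per-function optimization overrides as intended for debugging rather than production use.

```c
/* Keep this function out of line so it keeps its own call and can
   take a breakpoint even in an optimized build. */
__attribute__((noinline))
static int scale(int x)
{
    /* volatile forces every access to go through memory, so the value
       stays visible to a debugger instead of living only in a register. */
    volatile int factor = 3;
    return x * factor;
}

/* Lower the optimization level for the functions between push and pop,
   then restore whatever was given on the command line. */
#pragma GCC push_options
#pragma GCC optimize ("O1")
static int suspect(int x)
{
    return scale(x) + 1;
}
#pragma GCC pop_options
```

Moving the push/pop pair around one function at a time is also a way to narrow down which function misbehaves at -O2, without needing to know which individual optimization is involved.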

> 
> But I'll have to revisit the issue again.
> 
> 
>> Another major benefit of -O1 is that it enables much more code analysis,
>> which in turn enables much better static checking - I am a big fan of
>> warning flags and having the compiler tell me of likely problems before
>> I get as far as testing and debugging.
> 
> Me, too ("big fan").

Well, remember that "-O0" greatly limits these warnings.

IMHO, gcc should introduce a new warning that is enabled by default on 
-O0, giving the message "You are using the world's most powerful 
compiler, but you are crippling it with your choice of flags."  There 
should also be a warning if "-Wall" is not enabled.

(OK, perhaps that would be going a little too far...)

> 
> 
>> (Your project here looks very interesting - I'm going to have a good
>> look at it when I get the chance.  I won't be able to use it directly,
>> as a pure GPL license basically makes it unusable for anything but
>> learning or hobby use, but as it matches ideas I have had myself I am
>> interested in how it works.)
> 
> Thanks. Yes, it's basically a simple idea, and I found out recently that 
> others have attempted something similar (which I wish I'd known when I 
> started doing it myself). This is now very off-topic for this list, but 
> I'd like to get your input, including the GPL vs LGPL issue (ironic 
> given that this is a GNU mailing list). Maybe open an issue at the 
> Github repository and we can discuss it there?

I'll look through the project first, and then get back to you - either 
through Github or your email address.  In the meantime, you could look 
at the licencing for FreeRTOS to see if its "GPL with exception" licence 
suits you.  (gcc also has a kind of "GPL with exception" licence, 
otherwise it could not be used for anything other than GPL'ed code.)

mvh.,

David
