public inbox for gcc@gcc.gnu.org
* Does GCC record optimization information into binary file?
@ 2026-01-28 15:41 Qing Zhao
  2026-01-28 22:53 ` Siddhesh Poyarekar
  2026-01-29  8:49 ` Richard Biener
  0 siblings, 2 replies; 22+ messages in thread
From: Qing Zhao @ 2026-01-28 15:41 UTC (permalink / raw)
  To: gcc

Hi, 

Does GCC provide any option to record optimization information, such as inlining, loop transformations,
profiling consistency, etc., into specific sections of the binary?

thanks.

Qing

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-28 15:41 Does GCC record optimization information into binary file? Qing Zhao
@ 2026-01-28 22:53 ` Siddhesh Poyarekar
  2026-01-29 16:58   ` David Malcolm
  2026-01-29  8:49 ` Richard Biener
  1 sibling, 1 reply; 22+ messages in thread
From: Siddhesh Poyarekar @ 2026-01-28 22:53 UTC (permalink / raw)
  To: Qing Zhao, gcc; +Cc: David Malcolm

On 2026-01-28 10:41, Qing Zhao via Gcc wrote:
> Does GCC provide any option to record optimization information, such as inlining, loop transformation,
>   profiling consistency, etc into specific sections of binary code?

I may be misremembering this, but I think David had some ideas about 
doing something like this in SARIF.

Sid

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-28 15:41 Does GCC record optimization information into binary file? Qing Zhao
  2026-01-28 22:53 ` Siddhesh Poyarekar
@ 2026-01-29  8:49 ` Richard Biener
  2026-01-29  9:00   ` Jakub Jelinek
  1 sibling, 1 reply; 22+ messages in thread
From: Richard Biener @ 2026-01-29  8:49 UTC (permalink / raw)
  To: Qing Zhao; +Cc: gcc

On Wed, Jan 28, 2026 at 4:43 PM Qing Zhao via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hi,
>
> Does GCC provide any option to record optimization information, such as inlining, loop transformation,
>  profiling consistency, etc into specific sections of binary code?

No.

>
> thanks.
>
> Qing

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29  8:49 ` Richard Biener
@ 2026-01-29  9:00   ` Jakub Jelinek
  2026-01-29 10:30     ` Florian Weimer
  2026-01-29 16:08     ` Qing Zhao
  0 siblings, 2 replies; 22+ messages in thread
From: Jakub Jelinek @ 2026-01-29  9:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: Qing Zhao, gcc

On Thu, Jan 29, 2026 at 09:49:40AM +0100, Richard Biener via Gcc wrote:
> On Wed, Jan 28, 2026 at 4:43 PM Qing Zhao via Gcc <gcc@gcc.gnu.org> wrote:
> > Does GCC provide any option to record optimization information, such as inlining, loop transformation,
> >  profiling consistency, etc into specific sections of binary code?
> 
> No.

Well, inlining is recorded in debug info.  And I don't see what the benefit
would be of encoding the rest: without the source, information like that is not
really useful, and with the source around you can always recompile and look at
compiler dumps or opt info.  Debug info also provides details on how a particular
TU has been compiled (with what options etc.).

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29  9:00   ` Jakub Jelinek
@ 2026-01-29 10:30     ` Florian Weimer
  2026-01-29 15:57       ` Qing Zhao
  2026-01-29 16:08     ` Qing Zhao
  1 sibling, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2026-01-29 10:30 UTC (permalink / raw)
  To: Jakub Jelinek via Gcc; +Cc: Richard Biener, Jakub Jelinek, Qing Zhao

* Jakub Jelinek via Gcc:

> On Thu, Jan 29, 2026 at 09:49:40AM +0100, Richard Biener via Gcc wrote:
>> On Wed, Jan 28, 2026 at 4:43 PM Qing Zhao via Gcc <gcc@gcc.gnu.org> wrote:
>> > Does GCC provide any option to record optimization information, such as inlining, loop transformation,
>> >  profiling consistency, etc into specific sections of binary code?
>> 
>> No.
>
> Well, inlining is recorded in debug info.  And I don't see what benefit
> would be to encode the rest, without source information like that is not
> really useful and with source around you can always recompile and look for
> compiler dumps or opt info, debug info provides details how a particular
> TU has been compiled (with what options etc.).

It's hard to recompile in a generic fashion because build systems are so
varied.  Depending on what people are trying to achieve, that's more of
a theoretical option.

(Capturing source code so that it becomes available for recompilation
would be interesting for many other applications, too.)

Thanks,
Florian


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 10:30     ` Florian Weimer
@ 2026-01-29 15:57       ` Qing Zhao
  2026-01-29 16:44         ` Richard Biener
  2026-02-02 15:16         ` Florian Weimer
  0 siblings, 2 replies; 22+ messages in thread
From: Qing Zhao @ 2026-01-29 15:57 UTC (permalink / raw)
  To: Florian Weimer, richard.guenther, jakub Jelinek; +Cc: Jakub Jelinek via Gcc

Our internal customer is asking for this feature. 

They currently use icc; with icc, they can get the inline list, the prof_used list, etc. from the binary.

The old Studio compiler also provided such a feature: there was a compiler commentary section in the binary that recorded
many of the important compiler transformations along with the line number information.

If the application has a performance issue, such information is very useful, especially for large applications.

Qing




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29  9:00   ` Jakub Jelinek
  2026-01-29 10:30     ` Florian Weimer
@ 2026-01-29 16:08     ` Qing Zhao
  2026-01-29 16:39       ` Jakub Jelinek
  1 sibling, 1 reply; 22+ messages in thread
From: Qing Zhao @ 2026-01-29 16:08 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc



> On Jan 29, 2026, at 04:00, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Thu, Jan 29, 2026 at 09:49:40AM +0100, Richard Biener via Gcc wrote:
>> On Wed, Jan 28, 2026 at 4:43 PM Qing Zhao via Gcc <gcc@gcc.gnu.org> wrote:
>>> Does GCC provide any option to record optimization information, such as inlining, loop transformation,
>>> profiling consistency, etc into specific sections of binary code?
>> 
>> No.
> 
> Well, inlining is recorded in debug info.

Okay, that’s good to know. So users can examine the debug sections in the binary to get the inlining information?

How about the profiling information, especially which routines were optimized using the profile? Is such info recorded in the binary?

>  And I don't see what benefit
> would be to encode the rest, without source information like that is not
> really useful and with source around you can always recompile and look for
> compiler dumps or opt info, debug info provides details how a particular
> TU has been compiled (with what options etc.).

Our internal customer is especially interested in recording in the binary whether profiling information was used by the optimizers.

Qing

> 
> Jakub
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 16:08     ` Qing Zhao
@ 2026-01-29 16:39       ` Jakub Jelinek
  2026-01-29 17:21         ` Qing Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2026-01-29 16:39 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Richard Biener, gcc

On Thu, Jan 29, 2026 at 04:08:46PM +0000, Qing Zhao wrote:
> > Well, inlining is recorded in debug info.
> 
> Okay, that’s good to know. So, user can examine the debug section in binary to get the inlining information?

There is information on which routines have been inlined into what: just
run readelf -wi and look for DW_TAG_inlined_subroutine tags, with
DW_AT_abstract_origin pointing to what has been inlined at that point.
There are no details why the inliner chose to inline something and not
something else.
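
For illustration, here is a minimal Python sketch of that lookup. It
assumes the usual textual layout of readelf -wi output (a DIE header line
ending in "(DW_TAG_...)" followed by indented "DW_AT_... : value" lines);
the function names and regular expressions are made up for this example
and are not part of binutils.

```python
import re
import subprocess
import sys
from collections import Counter

# Assumed layout of `readelf -wi` text: a DIE header line ends with
# "(DW_TAG_xxx)", followed by indented "DW_AT_xxx : value" lines.
TAG_RE = re.compile(r"\((DW_TAG_\w+)\)")
ORIGIN_RE = re.compile(r"DW_AT_abstract_origin\s*:\s*<(0x[0-9a-fA-F]+)>")


def count_inlined(dump_text):
    """Count inlined-subroutine DIEs, keyed by abstract-origin offset."""
    counts = Counter()
    in_inlined = False
    for line in dump_text.splitlines():
        tag = TAG_RE.search(line)
        if tag:
            # A new DIE starts here; remember whether it is an inlined call.
            in_inlined = tag.group(1) == "DW_TAG_inlined_subroutine"
            continue
        if in_inlined:
            origin = ORIGIN_RE.search(line)
            if origin:
                counts[origin.group(1)] += 1
                in_inlined = False
    return counts


def dump_debug_info(path):
    """Run `readelf -wi` on PATH and return its text output."""
    result = subprocess.run(["readelf", "-wi", path], check=True,
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    dump = dump_debug_info(sys.argv[1])
    for offset, n in count_inlined(dump).most_common():
        print(f"{offset}: inlined {n} time(s)")
```

The offsets it prints refer to the abstract DW_TAG_subprogram entries in
the same dump, so the callee name still has to be looked up there. As
noted above, none of this says why the inliner made a given decision.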

> How about the profiling information, especially which routines are optimized by using the profiling info? Is such info recorded in binary?

That is very fuzzy, we don't really track what exact optimizations we've
done based on profile info and what for other reasons.
You can look again at DW_AT_producer in debug info, if a function is in
a TU compiled with -fprofile-use, then presumably it has been optimized
using the profiling info.  Though, that just mostly means when reading the
*.gcda files it initialized edge probabilities and bb counts from the
observed counts rather than from heuristics.  And optimizations then just
use those probabilities/counts to decide on optimizations.  Sure,
-fprofile-use doesn't do just that, but still.  You can find out if
probabilities/counts were *.gcda based or heuristics based, but not whether
anything actually used that info in some way and for what.  Especially not
whether the probabilities/counts were different enough from heuristic based
ones to result in different optimization decisions.  You'd need to compile
twice, once without -fprofile-use, and compare.

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 15:57       ` Qing Zhao
@ 2026-01-29 16:44         ` Richard Biener
  2026-01-29 17:00           ` Qing Zhao
  2026-02-02 15:16         ` Florian Weimer
  1 sibling, 1 reply; 22+ messages in thread
From: Richard Biener @ 2026-01-29 16:44 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Florian Weimer, jakub Jelinek, Jakub Jelinek via Gcc



> On 29.01.2026, at 16:57, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> Our internal customer is asking for this feature.
> 
> They used icc currently, with icc, then can get inline list, prof_used list, etc. from the binary.
> 
> The old studio compiler also provided such feature, there was compiler commentary section in the binary to record
> many of the important compiler transformations along with the line number information.
> 
> If there is any performance issue with the application, such information will be very useful. Especially for large applications.

The best bet is probably -fopt-info which I think can do alternate outputs, eventually JSON.  That’s not in the binary itself, of course, but you could place it there with some linker script.
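
As a rough sketch of getting such a report into the binary after the
build: instead of a linker script, the example below uses objcopy's
--add-section, which is a different technique from the one suggested
above. The section name ".gcc.optinfo" and the helper names are
hypothetical; GCC defines no such section.

```python
import subprocess


def objcopy_embed_command(binary, report, section=".gcc.optinfo"):
    """Build an objcopy command that copies REPORT into a new section.

    The default section name is a made-up example, not something GCC defines.
    """
    return [
        "objcopy",
        "--add-section", f"{section}={report}",
        # Mark the section as not loaded so it is not mapped at run time.
        "--set-section-flags", f"{section}=noload,readonly",
        binary,
    ]


def embed_report(binary, report, section=".gcc.optinfo"):
    """Run objcopy in place on BINARY; raises CalledProcessError on failure."""
    subprocess.run(objcopy_embed_command(binary, report, section), check=True)
```

The report itself could come from something like
-fopt-info-inline-optimized-missed=inline.txt, and readelf -p with the
section name should print the embedded text back.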

Richard 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-28 22:53 ` Siddhesh Poyarekar
@ 2026-01-29 16:58   ` David Malcolm
  2026-01-29 17:11     ` David Malcolm
  2026-01-30 16:27     ` Qing Zhao
  0 siblings, 2 replies; 22+ messages in thread
From: David Malcolm @ 2026-01-29 16:58 UTC (permalink / raw)
  To: Siddhesh Poyarekar, Qing Zhao, gcc

On Wed, 2026-01-28 at 17:53 -0500, Siddhesh Poyarekar wrote:
> On 2026-01-28 10:41, Qing Zhao via Gcc wrote:
> > Does GCC provide any option to record optimization information,
> > such as inlining, loop transformation,
> >   profiling consistency, etc into specific sections of binary code?
> 
> I may be misremembering this, but I think David had some ideas about 
> doing something like this in SARIF.
> 

Several thoughts here:

(a) I've written a prototype that embeds SARIF as an ELF section in the
generated object file, rather like debuginfo (my idea at the time being
that a binary could contain within it its build flags and other
metadata, and its diagnostics, etc).  I don't think I posted it to the
mailing list though.

(b) A long time ago I prototyped a gcc implementation of llvm's idea of
optimization remarks, to send optimization info through the diagnostics
subsystem, but IIRC that work ended up as the revamp of optinfo (in GCC
9?; see my Cauldron 2018 talk on optimization records), which
generalized some of the internals of how we track optimization info. 
The machine-readable output is a custom json-based format.

(c) SARIF would probably be a good fit for optimization records; it's
machine-readable, and has a rich vocabulary for source locations, code
constructs, machine locations, etc; IDEs and other tooling understand
it, so they'd get a source-level view of optimization info "for free".
Note that currently our SARIF output captures the contents of every
source file referred to by any diagnostics, but we could e.g. capture
every source file/header used during the compile, and could capture
e.g. SHA1 sums rather than file content.

(d) I've added the ability to add custom info to diagnostic sinks; see
e.g. capturing CFG information in 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e20eee3897ae8cd0f2212dad0710d64df8f1a956

(e) I've added a new publish/subscribe framework to GCC for loosely
coupled notifications that would probably help with the implementation
(to avoid needing to have the diagnostics subsystem "know" too much
about the optimizer).

So possible GCC 17 material might be:

(f) add a new sink to the optinfo subsystem that adds a new pub/sub
channel about optimization info, and sends notifications about the
optimization records there

(g) add a new option to -fdiagnostics-add-output to capture optinfo,
which when enabled subscribes the diagnostic sink to the optinfo
notifications channel.  Or we just skip (f) and work more directly with
optinfo, but (f) allows some extra flexibility e.g. for plugins that
listen for optimization decisions.

(h) potentially add a new option to the SARIF sink to support embedding
the data in an ELF section, rather than writing to a file (as per (a)
above).

Brainstorming, the user might be able to do something like:

-fdiagnostics-add-output=sarif:elf-section=optimizations,optinfo=inline

or whatnot, and have an ELF section capturing the decisions made by the
inliner.

Or we could have an option to send optinfo as diagnostics, like LLVM's
optimization records (and (b) above), and have the diagnostics sinks
handle them that way (text, SARIF, HTML).
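
To make the "source-level view for free" point concrete, here is a small
Python sketch that walks the results of a SARIF 2.1.0 log and yields
file, line and message for each. The property names come from the SARIF
specification; that optimization records would show up as ordinary
results is purely an assumption, since none of the options above exist
yet, and the helper names are made up.

```python
import json


def iter_sarif_results(log):
    """Yield (uri, line, text) for every result in a parsed SARIF log."""
    for run in log.get("runs", []):
        for result in run.get("results", []):
            text = result.get("message", {}).get("text", "")
            # Only the first location is used; a result may have none.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "<unknown>")
            line = physical.get("region", {}).get("startLine")
            yield uri, line, text


def load_sarif(path):
    """Read a SARIF file from PATH and return the parsed JSON document."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Any consumer that already understands these fields would be able to
point at the source line of an optimization decision without knowing
anything about GCC's optinfo internals.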

Dave


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 16:44         ` Richard Biener
@ 2026-01-29 17:00           ` Qing Zhao
  0 siblings, 0 replies; 22+ messages in thread
From: Qing Zhao @ 2026-01-29 17:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: Florian Weimer, jakub Jelinek, Jakub Jelinek via Gcc



> On Jan 29, 2026, at 11:44, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> 
> 
>> On 29.01.2026, at 16:57, Qing Zhao <qing.zhao@oracle.com> wrote:
>> 
>> Our internal customer is asking for this feature.
>> 
>> They used icc currently, with icc, then can get inline list, prof_used list, etc. from the binary.
>> 
>> The old studio compiler also provided such feature, there was compiler commentary section in the binary to record
>> many of the important compiler transformations along with the line number information.
>> 
>> If there is any performance issue with the application, such information will be very useful. Especially for large applications.
> 
> The best bet is probably -fopt-info which I think can do alternate outputs, eventually JSON.

I see that -fsave-optimization-record can dump the optimization info along with source line number
info into a SRCFILE.opt-record.json.gz
https://gcc.gnu.org/onlinedocs/gcc/Developer-Options.html#index-fsave-optimization-record
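
A minimal Python sketch for a first look at such a file. Because the
format is experimental, it assumes as little as possible about the
schema: it only assumes that individual records are JSON objects with a
"kind" member, and tallies those wherever they appear. The function
names are made up for this example.

```python
import gzip
import json
from collections import Counter


def load_opt_records(path):
    """Parse a gzip-compressed JSON optimization-record file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def count_by_kind(node, counts=None):
    """Recursively tally every JSON object that has a "kind" member."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        if "kind" in node:
            counts[str(node["kind"])] += 1
        # Records can nest, so keep descending into every member.
        for value in node.values():
            count_by_kind(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_by_kind(item, counts)
    return counts
```

Calling count_by_kind(load_opt_records("foo.c.opt-record.json.gz")) on a
real file (the file name here is only an example) gives a quick
per-kind summary before writing anything more schema-specific.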

But the documentation says that this option is experimental.
Can I recommend it to our customer now, instead of -fopt-info-options=filename?

>  That’s not in the binary itself, of course, but you could place it there with some linker script.

Yes, that’s right.

thanks.

Qing



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 16:58   ` David Malcolm
@ 2026-01-29 17:11     ` David Malcolm
  2026-01-30 16:27     ` Qing Zhao
  1 sibling, 0 replies; 22+ messages in thread
From: David Malcolm @ 2026-01-29 17:11 UTC (permalink / raw)
  To: Siddhesh Poyarekar, Qing Zhao, gcc

Some more thoughts:

* raw JSON might be rather large for the SARIF, but it compresses well.
I've experimented with other serializations (CBOR), but the savings I
saw didn't justify adding a new binary format.  That said, protobuf
might be an interesting approach.

* the optinfo records have a nested structure (e.g. info about the
logic within the vectorizer) that seems similar to that of the
hierarchical C++ messages we now emit for template errors.  So I'd want
to explore reusing that framework; this could make vectorizer reports
much easier for users to read
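
A quick way to check the size claim on any given JSON document is to
compare the serialized size with the gzip-compressed size. The Python
sketch below does only that; it is not taken from the prototype, the
helper name is made up, and it produces no numbers about real SARIF
output.

```python
import gzip
import json


def json_sizes(document):
    """Return (raw_bytes, gzipped_bytes) for DOCUMENT serialized as JSON."""
    # Compact separators avoid counting whitespace that a writer could drop.
    raw = json.dumps(document, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))
```

Running it over a loaded SARIF or opt-record document gives the two
numbers needed to judge whether a compressed ELF section would be
acceptable for a particular code base.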

Dave


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 16:39       ` Jakub Jelinek
@ 2026-01-29 17:21         ` Qing Zhao
  2026-01-29 17:27           ` Jakub Jelinek
  0 siblings, 1 reply; 22+ messages in thread
From: Qing Zhao @ 2026-01-29 17:21 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc



> On Jan 29, 2026, at 11:39, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Thu, Jan 29, 2026 at 04:08:46PM +0000, Qing Zhao wrote:
>>> Well, inlining is recorded in debug info.
>> 
>> Okay, that’s good to know. So, user can examine the debug section in binary to get the inlining information?
> 
> There is information on which routines have been inlined into what, just
> readelf -wi and look for DW_TAG_inlined_subroutine tags, with
> DW_AT_abstract_origin pointing to what has been inlined at that point.
> There are no details why the inliner chose to inline something and not
> something else.

Thanks a lot for the detailed information. This is helpful.
> 
>> How about the profiling information, especially which routines are optimized by using the profiling info? Is such info recorded in binary?
> 
> That is very fuzzy, we don't really track what exact optimizations we've
> done based on profile info and what for other reasons.
> You can look again at DW_AT_producer in debug info, if a function is in
> a TU compiled with -fprofile-use, then presumably it has been optimized
> using the profiling info.  Though, that just mostly means when reading the
> *.gcda files it initialized edge probabilities and bb counts from the
> observed counts rather than from heuristics.  And optimizations then just
> use those probabilities/counts to decide on optimizations.  Sure,
> -fprofile-use doesn't do just that, but still.  You can find out if
> probabilities/counts were *.gcda based or heuristics based, but not whether
> anything actually used that info in some way and for what.

So, you mean that the “DW_AT_producer” tag in the binary can distinguish 
whether a routine uses real profiling data or only heuristics?

If so, I guess that will be useful too. 

> Especially not
> whether the probabilities/counts were different enough from heuristic based
> ones to result in different optimization decisions.  You'd need to compile
> twice, once without -fprofile-use, and compare.

One question here: suppose a routine has profiling data available in *.gcda, but
the data is bad for some reason and is therefore ignored.
Is there a “DW_AT_producer” tag for such a routine?

Qing
> 
> Jakub
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 17:21         ` Qing Zhao
@ 2026-01-29 17:27           ` Jakub Jelinek
  2026-01-29 18:58             ` Qing Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2026-01-29 17:27 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Richard Biener, gcc

On Thu, Jan 29, 2026 at 05:21:01PM +0000, Qing Zhao wrote:
> So, you mean that the “DW_AT_producer” tag in the binary can distinguish 
> whether a routine uses real profiling data or only heuristics?
> 
> If so, I guess that will be useful too. 
> 
> > Especially not
> > whether the probabilities/counts were different enough from heuristic based
> > ones to result in different optimization decisions.  You'd need to compile
> > twice, once without -fprofile-use, and compare.
> 
> one question here: if one routine whose profiling data is available in *.gcda, but
> the profiling data is bad due to some reason, as a result, the profiling data is ignored. 
> Is there “DW_AT_producer” tag for such routine?

DW_AT_producer is an attribute of DW_TAG_compile_unit (or similar tags), so
applies to all functions within the translation unit.
So it tells you whether a particular TU has been compiled with -fprofile-use
or not.  I believe bad profiling data isn't an on-or-off case: if it is only
slightly bad, -fprofile-correction can tweak it and it still applies.  And
without -fprofile-correction there are warnings or errors.
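
A sketch of that per-TU check in Python, working on the text printed by
readelf -wi. It assumes that DW_AT_producer appears before DW_AT_name
inside each DW_TAG_compile_unit entry, and that the recorded command
line contains -fprofile-use when it was given; the function names and
regular expressions are made up for this example, and other unit tags
are ignored.

```python
import re
import subprocess

TAG_RE = re.compile(r"\((DW_TAG_\w+)\)")
ATTR_RE = re.compile(r"(DW_AT_\w+)\s*:\s*(.*)")
# readelf prefixes out-of-line strings with "(indirect string, offset: 0x..): ".
INDIRECT_RE = re.compile(
    r"^\(indirect (?:line )?string, offset: 0x[0-9a-fA-F]+\):\s*")


def profile_use_by_unit(dump_text):
    """Map each compile unit name to True if its producer has -fprofile-use."""
    units = {}
    in_cu = False
    producer = None
    for line in dump_text.splitlines():
        tag = TAG_RE.search(line)
        if tag:
            # New DIE: only attributes of compile units are of interest.
            in_cu = tag.group(1) == "DW_TAG_compile_unit"
            producer = None
            continue
        attr = ATTR_RE.search(line) if in_cu else None
        if not attr:
            continue
        value = INDIRECT_RE.sub("", attr.group(2).strip())
        if attr.group(1) == "DW_AT_producer":
            producer = value
        elif attr.group(1) == "DW_AT_name" and producer is not None:
            units[value] = any(opt.startswith("-fprofile-use")
                               for opt in producer.split())
    return units


def dump_debug_info(path):
    """Run `readelf -wi` on PATH and return its text output."""
    result = subprocess.run(["readelf", "-wi", path], check=True,
                            capture_output=True, text=True)
    return result.stdout
```

This answers only the per-TU question. It cannot tell whether the
profile of an individual function was usable or had to be corrected.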

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 17:27           ` Jakub Jelinek
@ 2026-01-29 18:58             ` Qing Zhao
  0 siblings, 0 replies; 22+ messages in thread
From: Qing Zhao @ 2026-01-29 18:58 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc



> On Jan 29, 2026, at 12:27, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Thu, Jan 29, 2026 at 05:21:01PM +0000, Qing Zhao wrote:
>> So, you mean that the “DW_AT_producer” tag in the binary can distinguish 
>> whether a routine uses real profiling data or only heuristics?
>> 
>> If so, I guess that will be useful too. 
>> 
>>> Especially not
>>> whether the probabilities/counts were different enough from heuristic based
>>> ones to result in different optimization decisions.  You'd need to compile
>>> twice, once without -fprofile-use, and compare.
>> 
>> one question here: if one routine whose profiling data is available in *.gcda, but
>> the profiling data is bad due to some reason, as a result, the profiling data is ignored. 
>> Is there “DW_AT_producer” tag for such routine?
> 
> DW_AT_producer is an attribute of DW_TAG_compile_unit (or similar tags), so
> applies to all functions within the translation unit.
> So it tells you whether a particular TU has been compiled with -fprofile-use
> or not.  I believe bad profiling data isn't an on or off case, if it is only
> slightly bad, -fprofile-correction can tweak it and still apply.  And
> without -fprofile-correction there are warnings or errors.

Okay, I see. 
Thanks a lot.

Qing
> 
> Jakub
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Does GCC record optimization information into binary file?
  2026-01-29 16:58   ` David Malcolm
  2026-01-29 17:11     ` David Malcolm
@ 2026-01-30 16:27     ` Qing Zhao
  2026-01-30 19:57       ` David Malcolm
  1 sibling, 1 reply; 22+ messages in thread
From: Qing Zhao @ 2026-01-30 16:27 UTC (permalink / raw)
  To: David Malcolm; +Cc: Siddhesh Poyarekar, gcc

Hi, David,

Thanks a lot for the information. It is very interesting and promising.

I do have two questions:

1. When you wrote the prototype that embeds SARIF as an ELF section, did you collect
any data on the code size increase of the final object files? 
2. What are the major concerns when deciding whether to dump the optimization info to a
separate file or to embed it in the object file?

Thanks a lot.

Qing




* Re: Does GCC record optimization information into binary file?
  2026-01-30 16:27     ` Qing Zhao
@ 2026-01-30 19:57       ` David Malcolm
  2026-01-30 20:00         ` David Malcolm
  0 siblings, 1 reply; 22+ messages in thread
From: David Malcolm @ 2026-01-30 19:57 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Siddhesh Poyarekar, gcc

On Fri, 2026-01-30 at 16:27 +0000, Qing Zhao wrote:
> Hi, David,
> 
> Thanks a lot for the information. It is very interesting and
> promising.

Thanks.

> 
> I do have two questions:
> 
> 1. When you wrote the prototype that embeds SARIF as an ELF section,
> did you collect
> any data on the code size increase of the final object files? 

I didn't collect realistic data.

FWIW I've uploaded the patch I had to
https://dmalcolm.fedorapeople.org/gcc/2026-01-30/0001-Initial-proof-of-concept-of-writing-sarif-to-asm-plu.patch
but it's heavily bit-rotted against trunk.

By way of example, in the same directory is a test.s and a test.o
generated using the patch on a trivial C file:

$ cat test.c
int i;
static int j;

$ ./cc1 -quiet test.c \
    -fdiagnostics-add-output=sarif:section=.sarif.json,serialization=json \
    -fdiagnostics-add-output=sarif:section=.sarif.json5,serialization=json5 \
    -fdiagnostics-add-output=sarif:section=.sarif.cbor,serialization=cbor \
    -o test.s \
    -Wall

test.c:2:12: warning: ‘j’ defined but not used [-Wunused-variable]
    2 | static int j;
      |            ^

$ as test.s -o test.o

$ for s in json json5 cbor ; do eu-readelf test.o -x .sarif.$s | head ; done
Hex dump of section [7] '.sarif.json', 2987 bytes at offset 0x1078:
  0x00000000 7b222473 6368656d 61223a20 22687474 {"$schema": "htt
  0x00000010 70733a2f 2f646f63 732e6f61 7369732d ps://docs.oasis-
  0x00000020 6f70656e 2e6f7267 2f736172 69662f73 open.org/sarif/s
  0x00000030 61726966 2f76322e 312e302f 65727261 arif/v2.1.0/erra
  0x00000040 74613031 2f6f732f 73636865 6d61732f ta01/os/schemas/
  0x00000050 73617269 662d7363 68656d61 2d322e31 sarif-schema-2.1
  0x00000060 2e302e6a 736f6e22 2c0a2022 76657273 .0.json",. "vers
  0x00000070 696f6e22 3a202232 2e312e30 222c0a20 ion": "2.1.0",. 
Hex dump of section [6] '.sarif.json5', 2699 bytes at offset 0x5ed:
  0x00000000 7b222473 6368656d 61223a20 22687474 {"$schema": "htt
  0x00000010 70733a2f 2f646f63 732e6f61 7369732d ps://docs.oasis-
  0x00000020 6f70656e 2e6f7267 2f736172 69662f73 open.org/sarif/s
  0x00000030 61726966 2f76322e 312e302f 65727261 arif/v2.1.0/erra
  0x00000040 74613031 2f6f732f 73636865 6d61732f ta01/os/schemas/
  0x00000050 73617269 662d7363 68656d61 2d322e31 sarif-schema-2.1
  0x00000060 2e302e6a 736f6e22 2c0a2076 65727369 .0.json",. versi
  0x00000070 6f6e3a20 22322e31 2e30222c 0a207275 on: "2.1.0",. ru
Hex dump of section [5] '.sarif.cbor', 1410 bytes at offset 0x6b:
  0x00000000 a3672473 6368656d 61785a68 74747073 .g$schemaxZhttps
  0x00000010 3a2f2f64 6f63732e 6f617369 732d6f70 ://docs.oasis-op
  0x00000020 656e2e6f 72672f73 61726966 2f736172 en.org/sarif/sar
  0x00000030 69662f76 322e312e 302f6572 72617461 if/v2.1.0/errata
  0x00000040 30312f6f 732f7363 68656d61 732f7361 01/os/schemas/sa
  0x00000050 7269662d 73636865 6d612d32 2e312e30 rif-schema-2.1.0
  0x00000060 2e6a736f 6e677665 7273696f 6e65322e .jsongversione2.
  0x00000070 312e3064 72756e73 81a56474 6f6f6ca1 1.0druns..dtool.

Dumping the sections:

$ objcopy test.o /dev/null --dump-section .sarif.json=/dev/stdout | head
{"$schema": "https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json",
 "version": "2.1.0",
 "runs": [{"tool": {"driver": {"name": "GNU C23",
                               "fullName": "GNU C23 (GCC) version 16.0.0 20250505 (experimental) (x86_64-pc-linux-gnu)",
                               "version": "16.0.0 20250505 (experimental)",
                               "informationUri": "https://gcc.gnu.org/gcc-16/",
                               "rules": [{"id": "-Wunused-variable",
                                          "helpUri": "https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wno-unused-variable"}]}},
           "invocations": [{"arguments": ["./cc1",
                                          "-quiet",


$ objcopy test.o /dev/null --dump-section .sarif.cbor=/dev/stdout | cbor2pretty.rb  | head
a3                                      # map(3)
   67                                   # text(7)
      24736368656d61                    # "$schema"
   78 5a                                # text(90)
      68747470733a2f2f646f63732e6f617369732d6f70656e2e6f72672f73617269662f73617269662f76322e312e302f65727261746130312f6f732f736368656d61732f73617269662d736368656d612d322e312e302e6a736f6e # "https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json"
   67                                   # text(7)
      76657273696f6e                    # "version"
   65                                   # text(5)
      322e312e30                        # "2.1.0"
   64                                   # text(4)

but I think gzipping the json would be simpler and likely more space-
efficient than using CBOR.
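
As a rough way to check that intuition, here is a minimal Python sketch (not
part of the patch) that gzips a SARIF-shaped JSON document and compares the
sizes. The sample document and the helper names are made up for illustration;
with a real object file one would feed in the bytes dumped by
objcopy --dump-section instead.

```python
import gzip
import json


def make_sample_sarif(n_results=50):
    """Build a small SARIF-shaped document (hypothetical sample data)."""
    results = [
        {
            "ruleId": "-Wunused-variable",
            "level": "warning",
            "message": {"text": "'j%d' defined but not used" % i},
        }
        for i in range(n_results)
    ]
    return {"version": "2.1.0", "runs": [{"results": results}]}


def gzip_sizes(payload):
    """Return (raw_size, gzipped_size) in bytes for a bytes payload."""
    return len(payload), len(gzip.compress(payload))


raw = json.dumps(make_sample_sarif()).encode("utf-8")
raw_size, gz_size = gzip_sizes(raw)
print("raw: %d bytes, gzip: %d bytes" % (raw_size, gz_size))
```

SARIF repeats the same keys and rule ids many times, so a general-purpose
compressor has a lot to work with; whether it beats CBOR on real data would
need measuring.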

Replaying the diagnostics in test.o using sarif-replay:

$ objcopy test.o /dev/null --dump-section .sarif.json=tmp.json \
    && LD_LIBRARY_PATH=. ./sarif-replay tmp.json
test.c:2:12: warning: ‘j’ defined but not used [-Wunused-variable]
    2 | static int j;
      |            ^

So presumably we could do something similar with optimization records.
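
To give a feel for what a consumer of such a section has to do, here is a
minimal Python sketch that walks a SARIF 2.1.0 log and prints one line per
result. It is not how sarif-replay is implemented; the function name and the
hand-written sample log are assumptions for illustration, and only the
standard SARIF properties (runs, results, ruleId, level, message.text,
locations, physicalLocation) are relied on.

```python
import json

# Hand-written sample log (hypothetical), shaped like the one embedded above.
SAMPLE = """
{"version": "2.1.0",
 "runs": [{"results": [{
     "ruleId": "-Wunused-variable",
     "level": "warning",
     "message": {"text": "'j' defined but not used"},
     "locations": [{"physicalLocation": {
         "artifactLocation": {"uri": "test.c"},
         "region": {"startLine": 2, "startColumn": 12}}}]}]}]}
"""


def replay(sarif_text):
    """Return one 'file:line:col: level: text [rule]' line per result."""
    log = json.loads(sarif_text)
    lines = []
    for run in log.get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            phys = locations[0].get("physicalLocation", {})
            uri = phys.get("artifactLocation", {}).get("uri", "?")
            region = phys.get("region", {})
            lines.append("%s:%s:%s: %s: %s [%s]" % (
                uri,
                region.get("startLine", "?"),
                region.get("startColumn", "?"),
                result.get("level", "warning"),
                result["message"]["text"],
                result.get("ruleId", "")))
    return lines


for line in replay(SAMPLE):
    print(line)
```

An optimization record could presumably be walked the same way, with the
inliner's decisions appearing as further results.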


> 2. What are the major concerns when we decide whether to dump the
> optimization info to a separate file, or to embed it in the object file?

For my use-case, I was thinking of diagnostics and build metadata, as a
kind of "annobin on steroids".   I don't know what the pros/cons of
embedding vs separate file would be for optimization info.

Dave





* Re: Does GCC record optimization information into binary file?
  2026-01-30 19:57       ` David Malcolm
@ 2026-01-30 20:00         ` David Malcolm
  0 siblings, 0 replies; 22+ messages in thread
From: David Malcolm @ 2026-01-30 20:00 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Siddhesh Poyarekar, gcc

On Fri, 2026-01-30 at 14:57 -0500, David Malcolm wrote:
> On Fri, 2026-01-30 at 16:27 +0000, Qing Zhao wrote:
> > Hi, David,
> > 
> > Thanks a lot for the information. It is very interesting and
> > promising.
> 
> Thanks.
> 
> > 
> > I do have two questions:
> > 
> > 1. When you wrote the prototype that embeds SARIF as an ELF
> > section,
> > did you collect
> > any data on the code size increase of the final object files? 
> 
> I didn't collect realistic data.
> 
> FWIW I've uploaded the patch I had to
> https://dmalcolm.fedorapeople.org/gcc/2026-01-30/0001-Initial-proof-of-concept-of-writing-sarif-to-asm-plu.patch
> but it's heavily bit-rotted against trunk.
> 
> By way of example, in the same directory is a test.s and a test.o
> generated using the patch on a trivial C file:
> 
> $ cat test.c
> int i;
> static int j;
> 
> $ ./cc1 -quiet test.c \
>     -fdiagnostics-add-output=sarif:section=.sarif.json,serialization=json \
>     -fdiagnostics-add-output=sarif:section=.sarif.json5,serialization=json5 \
>     -fdiagnostics-add-output=sarif:section=.sarif.cbor,serialization=cbor \
>     -o test.s \
>     -Wall
> 
> test.c:2:12: warning: ‘j’ defined but not used [-Wunused-variable]
>     2 | static int j;
>       |            ^
> 
> $ as test.s -o test.o
> 
> $ for s in json json5 cbor ; do eu-readelf test.o -x .sarif.$s | head ; done
> Hex dump of section [7] '.sarif.json', 2987 bytes at offset 0x1078:
>   0x00000000 7b222473 6368656d 61223a20 22687474 {"$schema": "htt
>   0x00000010 70733a2f 2f646f63 732e6f61 7369732d ps://docs.oasis-
>   0x00000020 6f70656e 2e6f7267 2f736172 69662f73 open.org/sarif/s
>   0x00000030 61726966 2f76322e 312e302f 65727261 arif/v2.1.0/erra
>   0x00000040 74613031 2f6f732f 73636865 6d61732f ta01/os/schemas/
>   0x00000050 73617269 662d7363 68656d61 2d322e31 sarif-schema-2.1
>   0x00000060 2e302e6a 736f6e22 2c0a2022 76657273 .0.json",. "vers
>   0x00000070 696f6e22 3a202232 2e312e30 222c0a20 ion": "2.1.0",. 
> Hex dump of section [6] '.sarif.json5', 2699 bytes at offset 0x5ed:
>   0x00000000 7b222473 6368656d 61223a20 22687474 {"$schema": "htt
>   0x00000010 70733a2f 2f646f63 732e6f61 7369732d ps://docs.oasis-
>   0x00000020 6f70656e 2e6f7267 2f736172 69662f73 open.org/sarif/s
>   0x00000030 61726966 2f76322e 312e302f 65727261 arif/v2.1.0/erra
>   0x00000040 74613031 2f6f732f 73636865 6d61732f ta01/os/schemas/
>   0x00000050 73617269 662d7363 68656d61 2d322e31 sarif-schema-2.1
>   0x00000060 2e302e6a 736f6e22 2c0a2076 65727369 .0.json",. versi
>   0x00000070 6f6e3a20 22322e31 2e30222c 0a207275 on: "2.1.0",. ru
> Hex dump of section [5] '.sarif.cbor', 1410 bytes at offset 0x6b:
>   0x00000000 a3672473 6368656d 61785a68 74747073 .g$schemaxZhttps
>   0x00000010 3a2f2f64 6f63732e 6f617369 732d6f70 ://docs.oasis-op
>   0x00000020 656e2e6f 72672f73 61726966 2f736172 en.org/sarif/sar
>   0x00000030 69662f76 322e312e 302f6572 72617461 if/v2.1.0/errata
>   0x00000040 30312f6f 732f7363 68656d61 732f7361 01/os/schemas/sa
>   0x00000050 7269662d 73636865 6d612d32 2e312e30 rif-schema-2.1.0
>   0x00000060 2e6a736f 6e677665 7273696f 6e65322e .jsongversione2.
>   0x00000070 312e3064 72756e73 81a56474 6f6f6ca1 1.0druns..dtool.
> 
> Dumping the sections:
> 
> $ objcopy test.o /dev/null --dump-section .sarif.json=/dev/stdout | head
> {"$schema": "https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json",
>  "version": "2.1.0",
>  "runs": [{"tool": {"driver": {"name": "GNU C23",
>                                "fullName": "GNU C23 (GCC) version 16.0.0 20250505 (experimental) (x86_64-pc-linux-gnu)",
>                                "version": "16.0.0 20250505 (experimental)",
>                                "informationUri": "https://gcc.gnu.org/gcc-16/",
>                                "rules": [{"id": "-Wunused-variable",
>                                           "helpUri": "https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wno-unused-variable"}]}},
>            "invocations": [{"arguments": ["./cc1",
>                                           "-quiet",
> 
> 
> $ objcopy test.o /dev/null --dump-section .sarif.cbor=/dev/stdout | cbor2pretty.rb  | head
> a3                                      # map(3)
>    67                                   # text(7)
>       24736368656d61                    # "$schema"
>    78 5a                                # text(90)
>       68747470733a2f2f646f63732e6f617369732d6f70656e2e6f72672f73617269662f73617269662f76322e312e302f65727261746130312f6f732f736368656d61732f73617269662d736368656d612d322e312e302e6a736f6e # "https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/schemas/sarif-schema-2.1.0.json"
>    67                                   # text(7)
>       76657273696f6e                    # "version"
>    65                                   # text(5)
>       322e312e30                        # "2.1.0"
>    64                                   # text(4)
> 
> but I think gzipping the json would be simpler and likely more space-
> efficient than using CBOR.
> 
> Replaying the diagnostics in test.o using sarif-replay:
> 
> $ objcopy test.o /dev/null --dump-section .sarif.json=tmp.json \
>     && LD_LIBRARY_PATH=. ./sarif-replay tmp.json
> test.c:2:12: warning: ‘j’ defined but not used [-Wunused-variable]
>     2 | static int j;
>       |            ^
> 
> So presumably we could do something similar with optimization
> records.
> 
> 
> > 2. What are the major concerns when we decide whether to dump the
> > optimization info to a separate file, or to embed it in the object file?
> 
> For my use-case, I was thinking of diagnostics and build metadata, as
> a
> kind of "annobin on steroids".   I don't know what the pros/cons of
> embedding vs separate file would be for optimization info.

A more ambitious approach might be to try to encode SARIF as DWARF (via
some extension to DWARF), which might lead to the SARIF being able to
reference specific binary locations, and for a linker to be able to
consolidate repeated strings in the data.
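
To illustrate only the "consolidate repeated strings" part: below is a small
Python sketch of a string table in which each distinct string is stored once
and records refer to it by byte offset, which is the general shape of DWARF's
.debug_str section. The class and method names are invented for illustration
and say nothing about what a SARIF-in-DWARF encoding would actually look like.

```python
class StringTable:
    """Store each distinct string once and hand out byte offsets."""

    def __init__(self):
        self._data = bytearray()
        self._offsets = {}

    def intern(self, s):
        """Return the offset of s, appending it only if it is new."""
        if s not in self._offsets:
            self._offsets[s] = len(self._data)
            self._data += s.encode("utf-8") + b"\0"
        return self._offsets[s]

    def lookup(self, offset):
        """Read back the NUL-terminated string starting at offset."""
        end = self._data.index(b"\0", offset)
        return self._data[offset:end].decode("utf-8")

    def size(self):
        return len(self._data)


table = StringTable()
a = table.intern("test.c")
b = table.intern("-Wunused-variable")
c = table.intern("test.c")  # repeated string: same offset, no growth
print(a, b, c, table.size())
```

Every diagnostic or optimization record that mentions the same file name or
rule id would then cost one offset rather than a full copy of the string.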

Dave




* Re: Does GCC record optimization information into binary file?
  2026-01-29 15:57       ` Qing Zhao
  2026-01-29 16:44         ` Richard Biener
@ 2026-02-02 15:16         ` Florian Weimer
  2026-02-02 15:41           ` Jakub Jelinek
  1 sibling, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2026-02-02 15:16 UTC (permalink / raw)
  To: Qing Zhao; +Cc: richard.guenther, Jakub Jelinek

* Qing Zhao:

> Our internal customer is asking for this feature. 
>
> They currently use icc; with icc, they can get the inline list, prof_used
> list, etc. from the binary.
>
> The old studio compiler also provided such a feature: there was a compiler
> commentary section in the binary to record many of the important
> compiler transformations along with the line number information.
>
> If there is any performance issue with the application, such
> information will be very useful. Especially for large applications.

We have similar needs and use annobin for that.  Its optimization
coverage is not great, though.  I think for guiding future GCC
development, it would be really interesting to know how many (e.g.)
memcpy calls are there because the source code was not built with source
fortification, or because it was and the compiler deemed the call to be
statically safe and not needing bounds checking, or it didn't have any
bounds information at all, so checking was impossible.  That part at
least intersects with the desire to track certain optimization
decisions.

Thanks,
Florian



* Re: Does GCC record optimization information into binary file?
  2026-02-02 15:16         ` Florian Weimer
@ 2026-02-02 15:41           ` Jakub Jelinek
  2026-02-02 18:07             ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2026-02-02 15:41 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Qing Zhao, richard.guenther, Jakub Jelinek via Gcc

On Mon, Feb 02, 2026 at 04:16:39PM +0100, Florian Weimer wrote:
> * Qing Zhao:
> 
> > Our internal customer is asking for this feature. 
> >
> > They currently use icc; with icc, they can get the inline list, prof_used
> > list, etc. from the binary.
> >
> > The old studio compiler also provided such a feature: there was a compiler
> > commentary section in the binary to record many of the important
> > compiler transformations along with the line number information.
> >
> > If there is any performance issue with the application, such
> > information will be very useful. Especially for large applications.
> 
> We have similar needs and use annobin for that.  Its optimization
> coverage is not great, though.  I think for guiding future GCC
> development, it would be really interesting to know how many (e.g.)
> memcpy calls are there because the source code was not built with source
> fortification, or because it was and the compiler deemed the call to be
> statically safe and not needing bounds checking, or it didn't have any
> bounds information at all, so checking was impossible.  That part at
> least intersects with the desire to track certain optimization
> decisions.

Part of that is already available in debug info.
If you look at the debug info corresponding to memcpy calls (sure, that won't
cover cases where memcpy has been expanded inline), if they are in
DW_TAG_inlined_subroutine for the bits/string_fortified.h inline, then
it has been -D_FORTIFY_SOURCE={2,3} protected, if not, then it has not.

We just don't record anywhere whether gimple_fold_builtin_memory_chk decided
to fold __builtin___memcpy_chk to __builtin_memcpy because the last
argument wasn't all ones and the length argument was guaranteed to be
smaller than that, or if it was because the objsz pass determined the
last argument is all ones and so we can't protect it.
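
Spelled out as a Python sketch, the distinction that is not recorded today
looks roughly like this. This is a model of the decision as described above,
not the GCC code: the function name, the outcome labels, the 64-bit SIZE_MAX
and the use of "<=" for the length comparison are all assumptions made for
illustration.

```python
SIZE_MAX = 2**64 - 1  # "all ones" on a 64-bit target: object size unknown


def classify_memcpy_chk(objsize, max_len):
    """Classify what happens to one __builtin___memcpy_chk call.

    objsize: the last argument (SIZE_MAX if objsz could not determine it)
    max_len: a known upper bound on the length, or None if there is none
    """
    if objsize == SIZE_MAX:
        # Nothing to check against, so the call cannot be protected.
        return "folded-unknown-objsize"
    if max_len is not None and max_len <= objsize:
        # The copy provably fits, so the runtime check is redundant.
        return "folded-statically-safe"
    # Otherwise the checking call has to stay.
    return "kept-runtime-check"


print(classify_memcpy_chk(SIZE_MAX, 16))
print(classify_memcpy_chk(32, 16))
print(classify_memcpy_chk(32, None))
```

The first two outcomes both end up as a plain memcpy in the output, which is
why they cannot be told apart afterwards.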

That said, all this is complicated by jump threading, inlining, unrolling
and other optimizations that can result in one source memcpy call to be
changed into multiple calls in the IL (or on the other side DCE etc. which
can remove some calls).
So, dunno what you want to track exactly, key stuff on the locations of the
memcpy (if extern inline memcpy then its caller) and track for each location
how many cases were folded because of all ones objsz, how many because of
gimple_fold_builtin_memory_chk etc.?

	Jakub



* Re: Does GCC record optimization information into binary file?
  2026-02-02 15:41           ` Jakub Jelinek
@ 2026-02-02 18:07             ` Richard Biener
  2026-02-03 13:39               ` Siddhesh Poyarekar
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2026-02-02 18:07 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Florian Weimer, Qing Zhao, Jakub Jelinek via Gcc

On Mon, Feb 2, 2026 at 4:41 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Feb 02, 2026 at 04:16:39PM +0100, Florian Weimer wrote:
> > * Qing Zhao:
> >
> > > Our internal customer is asking for this feature.
> > >
> > > They currently use icc; with icc, they can get the inline list, prof_used
> > > list, etc. from the binary.
> > >
> > > The old studio compiler also provided such a feature: there was a compiler
> > > commentary section in the binary to record many of the important
> > > compiler transformations along with the line number information.
> > >
> > > If there is any performance issue with the application, such
> > > information will be very useful. Especially for large applications.
> >
> > We have similar needs and use annobin for that.  Its optimization
> > coverage is not great, though.  I think for guiding future GCC
> > development, it would be really interesting to know how many (e.g.)
> > memcpy calls are there because the source code was not built with source
> > fortification, or because it was and the compiler deemed the call to be
> > statically safe and not needing bounds checking, or it didn't have any
> > bounds information at all, so checking was impossible.  That part at
> > least intersects with the desire to track certain optimization
> > decisions.
>
> Part of that is already available in debug info.
> If you look at the debug info corresponding to memcpy calls (sure, that won't
> cover cases where memcpy has been expanded inline), if they are in
> DW_TAG_inlined_subroutine for the bits/string_fortified.h inline, then
> it has been -D_FORTIFY_SOURCE={2,3} protected, if not, then it has not.
>
> We just don't record anywhere whether gimple_fold_builtin_memory_chk decided
> to fold __builtin___memcpy_chk to __builtin_memcpy because the last
> argument wasn't all ones and the length argument was guaranteed to be
> smaller than that, or if it was because the objsz pass determined the
> last argument is all ones and so we can't protect it.
>
> That said, all this is complicated by jump threading, inlining, unrolling
> and other optimizations that can result in one source memcpy call to be
> changed into multiple calls in the IL (or on the other side DCE etc. which
> can remove some calls).

I guess we'd need to track duplications  (and removals!) as well then.  In
the end it's probably the memcpy calls somehow surviving and then
(for a single source memcpy) the ratio of fortified vs. non-fortified copies.
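
As a sketch of the bookkeeping that implies (in Python, with an invented
record format): assume that for every memcpy call surviving in the final IL
we know the source location it came from and whether it is fortified.
Duplicated calls then show up as several entries for one location, and
removed calls simply do not appear.

```python
from collections import defaultdict


def fortify_ratios(copies):
    """Map each source location to (fortified_copies, total_copies).

    copies: iterable of (source_location, is_fortified) pairs, one per
    memcpy call that survived to the final IL (hypothetical format).
    """
    counts = defaultdict(lambda: [0, 0])
    for location, fortified in copies:
        counts[location][1] += 1
        if fortified:
            counts[location][0] += 1
    return {location: tuple(c) for location, c in counts.items()}


surviving = [
    ("a.c:10", True),   # one source call, threaded into three copies
    ("a.c:10", False),
    ("a.c:10", True),
    ("b.c:42", False),
]
print(fortify_ratios(surviving))
```

The hard part is of course producing those pairs reliably across all the
passes, not adding them up.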

That is, we do have to implement what we had as former -Wunreachable-code,
but "properly".  It's going to be impossible and require costly human review
I think.

> So, dunno what you want to track exactly, key stuff on the locations of the
> memcpy (if extern inline memcpy then its caller) and track for each location
> how many cases were folded because of all ones objsz, how many because of
> gimple_fold_builtin_memory_chk etc.?
>
>         Jakub
>


* Re: Does GCC record optimization information into binary file?
  2026-02-02 18:07             ` Richard Biener
@ 2026-02-03 13:39               ` Siddhesh Poyarekar
  0 siblings, 0 replies; 22+ messages in thread
From: Siddhesh Poyarekar @ 2026-02-03 13:39 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek
  Cc: Florian Weimer, Qing Zhao, Jakub Jelinek via Gcc

On 2026-02-02 13:07, Richard Biener via Gcc wrote:
> I guess we'd need to track duplications  (and removals!) as well then.  In
> the end it's probably the memcpy calls somehow surviving and then
> (for a single source memcpy) the ratio of fortified vs. non-fortified copies.

For fortification metrics specifically, I had used __bdos success as a 
proxy:

https://github.com/siddhesh/fortify-metrics

Sid


end of thread, other threads:[~2026-02-03 13:39 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-28 15:41 Does GCC record optimization information into binary file? Qing Zhao
2026-01-28 22:53 ` Siddhesh Poyarekar
2026-01-29 16:58   ` David Malcolm
2026-01-29 17:11     ` David Malcolm
2026-01-30 16:27     ` Qing Zhao
2026-01-30 19:57       ` David Malcolm
2026-01-30 20:00         ` David Malcolm
2026-01-29  8:49 ` Richard Biener
2026-01-29  9:00   ` Jakub Jelinek
2026-01-29 10:30     ` Florian Weimer
2026-01-29 15:57       ` Qing Zhao
2026-01-29 16:44         ` Richard Biener
2026-01-29 17:00           ` Qing Zhao
2026-02-02 15:16         ` Florian Weimer
2026-02-02 15:41           ` Jakub Jelinek
2026-02-02 18:07             ` Richard Biener
2026-02-03 13:39               ` Siddhesh Poyarekar
2026-01-29 16:08     ` Qing Zhao
2026-01-29 16:39       ` Jakub Jelinek
2026-01-29 17:21         ` Qing Zhao
2026-01-29 17:27           ` Jakub Jelinek
2026-01-29 18:58             ` Qing Zhao
