public inbox for gcc-bugs@sourceware.org
* [Bug c/105863] New: RFE: __attribute__((incbin("file"))) or __builtin_incbin("file")
From: hpa at zytor dot com @ 2022-06-06 16:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

            Bug ID: 105863
           Summary: RFE: __attribute__((incbin("file"))) or
                    __builtin_incbin("file")
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hpa at zytor dot com
  Target Milestone: ---

It is a *very* common operation to want to include a preexisting binary object
into a compiled project. There are a number of ways to do this, but they all
suffer significant shortcomings.

The most common ways are to use a preprocessor to convert the input binary
object to textual source, or to wrap an object in assembly code and use
.incbin.  The former is *extremely* inefficient, the latter has a number of
pitfalls, including the one described in bug 66871, the need for
platform-specific coding, sizeof() not being functional, etc.
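
To make the first approach concrete, here is a minimal sketch in C of such a
converter, which renders raw bytes as the text of a C array definition. The
helper name bytes_to_c_array and its signature are made up for illustration;
real tools such as xxd -i differ in detail. Every input byte turns into about
five characters of source text that the compiler then has to lex and parse
again, which is where the inefficiency comes from.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of any real tool): render `len` bytes as a
   C array definition named `name` into `out`, which has capacity `cap`.
   Returns the number of characters written, excluding the terminating NUL,
   or -1 if `out` is too small. */
static int bytes_to_c_array(const unsigned char *data, size_t len,
                            const char *name, char *out, size_t cap)
{
    size_t used;
    int n = snprintf(out, cap, "const unsigned char %s[] = {", name);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        /* One "0xNN" per byte, comma-separated. */
        n = snprintf(out + used, cap - used, "%s0x%02x",
                     i == 0 ? "" : ",", (unsigned)data[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, cap - used, "};");
    if (n < 0 || (size_t)n >= cap - used)
        return -1;
    used += (size_t)n;
    return (int)used;
}
```

Called with the three bytes 0x7c, 0xe7, 0x00 and the name "foobar", the sketch
produces the text of a definition for a three-element array named foobar.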

I would like to suggest a variable attribute __attribute__((incbin("file")))
which statically initializes a variable to the contents of the given binary
file, or a __builtin_incbin("file") which would expand to an initializer; the
end result would look either like:

char foobar[] __attribute__((incbin("foobar.bin")));
char foobar[] = __builtin_incbin("foobar.bin");

* [Bug c/105863] RFE: __attribute__((incbin("file"))) or __builtin_incbin("file")
From: pinskia at gcc dot gnu.org @ 2022-06-06 17:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The C standard is already in the process of standardizing this, so we should
follow that instead of inventing something new.

* [Bug c/105863] RFE: __attribute__((incbin("file"))) or __builtin_incbin("file")
From: amonakov at gcc dot gnu.org @ 2022-06-06 17:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #2 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
This is #embed, see https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2967.htm

* [Bug c/105863] RFE: __attribute__((incbin("file"))) or __builtin_incbin("file")
From: rguenth at gcc dot gnu.org @ 2022-06-13 12:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
            Version|unknown                     |13.0
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-06-13
           Severity|normal                      |enhancement

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
And it could assemble to include directives for the assembler as well (where
supported).

With toplevel asm and gas you could probably arrange things already.

* [Bug c/105863] RFE: C23 #embed
From: hpa at zytor dot com @ 2023-06-05 19:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

--- Comment #4 from H. Peter Anvin <hpa at zytor dot com> ---
So I'm updating this to be C23 #embed, since that is a bit more general than
the typical incbin (at least conceptually it operates on the preprocessor
syntactic level; it does not of course preclude a shortcut between the
preprocessor and the compiler.)

However, C23 #embed has a *huge* problem; specifically it has exactly the same
problem that necessitated #pragma to be augmented with _Pragma(). Therefore, I
believe that an equivalent construct (_Embed()) is needed for #embed as well.

I have given this feedback to members of the C committee but, not
surprisingly, it was too late for C23. I hope it will be considered for C2y,
and I believe it would be a highly desirable extension in the meantime.
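
For readers unfamiliar with the #pragma problem: a # directive has to occupy
its own source line, so it cannot be produced by a macro's replacement list,
whereas the _Pragma("...") operator can. The following sketch shows this for
_Pragma only. The macro and function names are invented for illustration, and
the "GCC diagnostic" pragmas are specific to GCC and Clang. An _Embed()
operator would presumably allow the same pattern for embedded data, but no
such operator exists in C23.

```c
/* These macros could not be written with #pragma, because a directive
   cannot appear inside a macro's replacement list. _Pragma is an operator,
   so it can. */
#define QUIET_UNUSED_BEGIN \
    _Pragma("GCC diagnostic push") \
    _Pragma("GCC diagnostic ignored \"-Wunused-variable\"")
#define QUIET_UNUSED_END \
    _Pragma("GCC diagnostic pop")

/* Illustrative function: the unused local would normally trigger
   -Wunused-variable; the macros suppress it for just this region. */
static int answer(void)
{
    QUIET_UNUSED_BEGIN
    int scratch = 0;   /* deliberately unused */
    QUIET_UNUSED_END
    return 42;
}
```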

* [Bug c/105863] RFE: C23 #embed
From: mpolacek at gcc dot gnu.org @ 2023-06-22 15:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

--- Comment #5 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
Latest rev: https://open-std.org/JTC1/SC22/WG14/www/docs/n3017.htm

* [Bug c/105863] RFE: C23 #embed
From: joseph at codesourcery dot com @ 2023-06-22 20:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

--- Comment #6 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
The latest version should be taken to be what's in the working draft 
N3096, plus the editorial fixes from CD2 comments GB-081 through GB-084.

* [Bug c/105863] RFE: C23 #embed
From: jakub at gcc dot gnu.org @ 2024-05-15 15:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
_Embed opens the Pandora's box of what should happen when you stringify it or
try to token-paste it together with something else, etc.

Anyway, for the GCC implementation of what C23 specifies, I wonder whether we
should implement it in separate steps: first in a dumb way that always expands
it into preprocessor tokens, a path that could then be kept for the smaller
sizes where it isn't worth doing something smart.
Only in a second step would we try to add optimizations. I guess for C those
could be easier than for C++, because C doesn't try to tokenize everything
first. So for C, when we peek at the large embed token outside of the language
contexts where we know how to handle it (most importantly, inside aggregate
initializers), we could simply replace the token with its expanded form. Say,
if one uses
void foo (...);
void bar ()
{
  foo (
#embed "foo_arguments"
  );
}
it would work without having to bother too much about that specific case.
The current LLVM pull request for this is
https://github.com/llvm/llvm-project/pull/68620
I think we should try to use the same options where reasonable.

* [Bug c/105863] RFE: C23 #embed
From: hpa at zytor dot com @ 2024-05-15 16:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

--- Comment #8 from H. Peter Anvin <hpa at zytor dot com> ---
Well, _Embed() would be an extension, and it doesn't seem unreasonable to say
that _Embed() would be expanded after token pasting. After all, as has been
discussed in the C committee, if #embed cannot be short-circuited its value is
significantly reduced.

That being said, what you said makes sense.

Both the #pragma and #embed, as well as some other use cases, really call for
real procedural support in cpp. I have an idea for that that I would like to
present to the C committee; I don't think it is really in scope for this
request though.

* [Bug c/105863] RFE: C23 #embed
From: jsm28 at gcc dot gnu.org @ 2024-05-16 20:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

--- Comment #9 from Joseph S. Myers <jsm28 at gcc dot gnu.org> ---
The most straightforward and most important case to optimize is the one where
the #embed expansion lies entirely inside a single character array initializer
(possibly with some integer constants before or after it in the initializer -
whether coming from the prefix and suffix parameters to #embed or otherwise) -
in which case the initializer can be converted internally to a STRING_CST.
Cases that aren't within a character array initializer like that are harder to
optimize (might require additional internal representation beyond the front
end), and probably also less important to optimize initially.
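
As a rough illustration of why that conversion is possible at all: a character
array initialized from a list of integer constants holds exactly the same
bytes as one initialized from a single string constant. The sketch below uses
a few literal bytes as a stand-in for prefix, embedded data and suffix; the
array and function names are invented, and it says nothing about how GCC
represents this internally.

```c
#include <string.h>

/* Stand-in for: two prefix bytes, four bytes of embedded data, and a
   one-byte suffix, written as a list of integer constants. */
static const unsigned char as_list[] = { 0x01, 0x02, 'd', 'a', 't', 'a', 0x00 };

/* The same seven bytes as one string constant. The literal is split so
   that the \x02 escape does not absorb the following 'd' and 'a' as hex
   digits; the implicit terminating NUL supplies the final 0x00. */
static const unsigned char as_string[] = "\x01\x02" "data";

/* Returns 1 if both arrays have the same size and contents. */
static int same_bytes(void)
{
    return sizeof as_list == sizeof as_string
        && memcmp(as_list, as_string, sizeof as_list) == 0;
}
```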

* [Bug c/105863] RFE: C23 #embed
From: jakub at gcc dot gnu.org @ 2024-05-17 12:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I think we should add a new tree next to STRING_CST for use inside of
CONSTRUCTORs.
STRING_CST in theory could be used e.g. inside of a constructor_elt, with say
RANGE_EXPR for the index and STRING_CST for the element.  But the problem is
that the STRING_CST owns the whole string.  If somebody does:
const char arr[] = { 1, 2, 3, 42,
#embed "large_data.bin"
0, [100000] = 15, [200000] = 32 };
it would be nice if we could start with, say, a STRING_CST covering all 50MB
(or however much) of data. But then the designated initializer overrides mean
that we either need to expand it all into a huge number of INTEGER_CSTs or, if
we had some tree that can extract a substring from a larger STRING_CST (3
operands: the STRING_CST, the offset from the start, and the length), we could
just create two such small trees and build one INTEGER_CST in between.
With only STRING_CST, we'd need to create 2 new huge STRING_CSTs when
splitting something into halves.
SUBSTRING_CST?
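
The idea can be modelled outside the compiler in a few lines of C. This is
only a sketch of the data structure, not GCC's tree API: struct piece,
split_for_override and piece_at are hypothetical names. A piece is either a
view (base, offset, length) into one shared blob or a single overriding byte,
so overriding one element yields two small views plus one byte and never
copies the blob.

```c
#include <stddef.h>

/* Hypothetical model: one piece of an initializer. */
struct piece {
    const unsigned char *base;  /* shared blob, or NULL for a single byte */
    size_t offset;              /* start of the view within the blob */
    size_t length;              /* bytes covered (1 for a single byte) */
    unsigned char value;        /* used only when base == NULL */
};

/* Override the byte at `index` of a blob of `len` bytes. Writes up to three
   pieces to `out` and returns how many were written (0 if `index` is out
   of range). The blob itself is never copied. */
static size_t split_for_override(const unsigned char *blob, size_t len,
                                 size_t index, unsigned char value,
                                 struct piece out[3])
{
    size_t n = 0;
    if (index >= len)
        return 0;
    if (index > 0)
        out[n++] = (struct piece){ blob, 0, index, 0 };
    out[n++] = (struct piece){ NULL, 0, 1, value };
    if (index + 1 < len)
        out[n++] = (struct piece){ blob, index + 1, len - index - 1, 0 };
    return n;
}

/* Read byte `i` of the logical array described by `pieces`. */
static unsigned char piece_at(const struct piece *pieces, size_t count,
                              size_t i)
{
    for (size_t k = 0; k < count; k++) {
        if (i < pieces[k].length)
            return pieces[k].base ? pieces[k].base[pieces[k].offset + i]
                                  : pieces[k].value;
        i -= pieces[k].length;
    }
    return 0;  /* past the end */
}
```

The cost of an override in this model is constant, regardless of whether the
blob is 5 bytes or 50MB, which is the property the proposed tree is after.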

As for what to do with -E: if the amount of data is really small (dunno, 64 or
128 bytes, or a user parameter?), we should expand it in the preprocessed
source as integer tokens with commas in between, so
 124,231,0,15,24,86
but for larger data I'd go with what I've proposed in the LLVM pull request,
i.e. emit in the preprocessed source
#embed "."
__gnu__::__base64__("TG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVlciBhZGlwaXNjaW5nIGVsaXQuIE5hbSBzZWQgdGVsbHVzIGlkIG1hZ25hIGVsZW1lbnR1bSB0aW5jaWR1bnQuIEFsaXF1YW0gaWQgZG9sb3IuIFV0IHRlbXB1cyBwdXJ1cyBhdCBsb3JlbS4uLgo=")
or so, where the embed data would be base64-decoded from the string instead of
being read from some other file.
That is because
#embed_base64
or whatever else has been proposed there would have to be diagnosed as invalid
with -pedantic-errors, while I think recognized implementation-defined #embed
parameters aren't strictly invalid (unrecognized ones are).
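
To show what consuming such a parameter would involve, here is a minimal
base64 decoder sketch in C. It is not taken from GCC or LLVM; the function
names are invented, it assumes an ASCII-compatible character set and the
standard RFC 4648 alphabet, and it simply stops at the first '=' without
validating the padding.

```c
#include <stddef.h>

/* Map one base64 alphabet character to its 6-bit value, or -1 if the
   character is not in the alphabet. Assumes ASCII-style contiguous
   letters and digits. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode base64 text from `in` into `out` (capacity `cap`), stopping at
   the end of the string or the first '='. Returns the number of bytes
   written, or -1 on an invalid character or if `out` is too small. */
static long b64_decode(const char *in, unsigned char *out, size_t cap)
{
    size_t n = 0;
    unsigned acc = 0;   /* bit accumulator */
    int bits = 0;       /* number of not-yet-emitted bits in acc */

    for (; *in && *in != '='; in++) {
        int v = b64_value(*in);
        if (v < 0)
            return -1;
        acc = (acc << 6) | (unsigned)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n == cap)
                return -1;
            out[n++] = (unsigned char)((acc >> bits) & 0xff);
        }
    }
    return (long)n;
}
```

Decoding the first eight characters of the string above with a trailing pad,
"TG9yZW0=", yields the five bytes of "Lorem".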

* [Bug c/105863] RFE: C23 #embed
From: jsm28 at gcc dot gnu.org @ 2024-05-20 20:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863

--- Comment #11 from Joseph S. Myers <jsm28 at gcc dot gnu.org> ---
It makes the changes more complicated (everything that handles CONSTRUCTORs,
whether to output them to assembly or to extract values for optimization etc.,
needs to handle the new tree), but yes, having a new tree would allow more
efficient handling of additional cases that wouldn't be so efficient with
STRING_CST.
