public inbox for gcc@gcc.gnu.org
* Scaling -fmacro-prefix-map= to thousands entries
@ 2023-10-04 21:19 Sergei Trofimovich
  2023-10-05  7:19 ` Richard Biener
  2023-10-05 11:20 ` Ben Boeckel
  0 siblings, 2 replies; 9+ messages in thread
From: Sergei Trofimovich @ 2023-10-04 21:19 UTC
  To: gcc

[-- Attachment #1: Type: text/plain, Size: 5217 bytes --]

Hi gcc developers!

TL;DR:

I would like to implement a scalable way to pass `-fmacro-prefix-map=`
for the `NixOS` distribution, to avoid leaking build-time paths through
`__FILE__` macros used by various libraries.

I need some guidance on which path to take so that the result is
acceptable to `gcc` upstream.

I have a few possible solutions and wonder what I should try to upstream
to GCC. The options I see:

1. Hardcode a NixOS-specific way to mangle paths.

   Pros: simplest to implement, can be easily configured away if needed
   Cons: inflexible, `clang` might or might not accept the same hack

2. Extend `-fmacro-prefix-map=` (or add a new `-fmacro-prefix-map-file=`)
   to allow passing a file

   Pros: still not too hard to implement, generic enough to be used in
         other contexts.
   Cons: requires the client to construct the map file.

3. Add a more flexible `-fmacro-prefix-map-regex=` option that allows
   patterns. Something like:

      -fmacro-prefix-map-regex=/nix/store/[a-z0-9]{32}-=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-

  Pros: at least for NixOS one option will be enough to cover all
        packages as they all share the above template.
  Cons: pulls some form of regex with its can of worms including escape
        delimiters, might not be flexible enough for other use cases.

4. Something else?
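
To make option 3 concrete, here is a minimal sketch in Python of the
rewrite such a pattern would perform. It is not GCC code:
`mangle_store_paths` is a hypothetical helper, and it assumes the hash is
always 32 characters from `[a-z0-9]` followed by `-`, as in the pattern
above.

```python
import re

# Assumption: store paths look like /nix/store/<32 chars of [a-z0-9]>-<name>.
STORE_HASH = re.compile(r"/nix/store/[a-z0-9]{32}-")
PLACEHOLDER = "/nix/store/" + "e" * 32 + "-"


def mangle_store_paths(path: str) -> str:
    """Replace the hash of every store prefix found in `path` with 'e' * 32."""
    return STORE_HASH.sub(PLACEHOLDER, path)
```

Paths that do not match the template are returned unchanged, which is the
behaviour a single regex option would need for non-store headers.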

Which one(s) should I implement?

More words:

`NixOS` (and the `nixpkgs` repository) installs every software package
into its own directory with a unique prefix. Some examples:

    /nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev
    /nix/store/rb3q4kcyfg77cmkiwywx2aqdd3x5ch93-libmpc-1.3.1
    /nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2
    ...

It is a fundamental design decision that allows parallel package installs.

From a dependency-tracking standpoint it is highly undesirable to have
these absolute paths hardcoded into final executable binaries if
they are not used at runtime.

Example of redundant paths we would rather not have in final binaries:

    $ strings result/bin/nix | grep phjcmy025rd1ankw5y1b21xsdii83cyk
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/json.hpp
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/output/serializer.hpp
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/conversions/to_chars.hpp
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/lexer.hpp
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iter_impl.hpp
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/json_sax.hpp
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iteration_proxy.hpp
    /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/parser.hpp

Those paths are inserted via uses of the `__FILE__` macro in glibc's
assert() and thus hardcode header file paths from various packages
(like lttng-ust or nlohmann/json) into compiled binaries. Sometimes
`__FILE__` usage is more creative than assert().

I would like to get rid of references to header files. I think
`-fmacro-prefix-map=` is ideal for this particular use case.

A prototype that generates the equivalent of the following options does
work for smaller packages:

    -fmacro-prefix-map=/nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.37-8-dev
    -fmacro-prefix-map=/nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gmp-with-cxx-6.3.0-dev
    -fmacro-prefix-map=/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-nlohmann_json-3.11.2
    ...
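
As an illustration of how such options could be generated, here is a
minimal Python sketch. It is not the actual `nixpkgs` prototype:
`macro_prefix_map_flag` is a hypothetical helper, and the store directory
and the 32-character hash length are assumptions taken from the examples
above.

```python
HASH_LEN = 32  # assumption: store hashes are always 32 characters long


def macro_prefix_map_flag(store_path: str, store_dir: str = "/nix/store") -> str:
    """Build one -fmacro-prefix-map= option that hides the hash of store_path."""
    prefix = store_dir + "/"
    if not store_path.startswith(prefix):
        raise ValueError(f"not under {store_dir}: {store_path}")
    rest = store_path[len(prefix):]
    # Expect "<hash>-<name>"; refuse anything that does not fit the template.
    if len(rest) <= HASH_LEN or rest[HASH_LEN] != "-":
        raise ValueError(f"unexpected store path layout: {store_path}")
    mangled = prefix + "e" * HASH_LEN + rest[HASH_LEN:]
    return f"-fmacro-prefix-map={store_path}={mangled}"
```

Calling it once per build input yields one option per store path, which is
where the option count starts to grow with the number of dependencies.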

The above works for a small number of options (like, 100). But around 1000
options we start hitting the linux limit on a single environment variable
on real-world packages like `qemu` with a ton of input dependencies.

The command-line limitations are in various places:
- `gcc` limitation of lifting all command line options into a single
  environment variable: https://gcc.gnu.org/PR111527
- `linux` limitation of constraining a single environment variable to a
  size well below the full available environment space:
  https://lkml.org/lkml/2023/9/24/381

A `linux` fix would buy us 50x more budget (A Lot) but it will not help
much on other operating systems like `Darwin`, where the absolute
environment limit is a lot lower than on `linux`.

I already implemented [1.] in https://github.com/NixOS/nixpkgs/pull/255192
(also attached as `mangle-NIX_STORE-in-__FILE__.patch`, a 3.5K patch
against `master`, as a proof of concept).

What would be the best way to scale `-fmacro-prefix-map=` up to NixOS
needs for `gcc`? I would like to implement something sensible I could
upstream.

What do you think?

Thanks!

-- 

  Sergei

[-- Attachment #2: mangle-NIX_STORE-in-__FILE__.patch --]
[-- Type: text/x-patch, Size: 3476 bytes --]

From b10785c1be469319a09b10bc69db21159b0599ee Mon Sep 17 00:00:00 2001
From: Sergei Trofimovich <siarheit@google.com>
Date: Fri, 22 Sep 2023 22:41:49 +0100
Subject: [PATCH] gcc/file-prefix-map.cc: always mangle __FILE__ into invalid
 store path

Without the change `__FILE__` used in static inline functions in headers
embeds paths to header files into executable images. For local headers
it's not a problem, but for headers in `/nix/store` this causes `-dev`
inputs to be retained in the runtime closure.

Typical examples are `nix` -> `nlohmann_json` and `pipewire` ->
`lttng-ust.dev`.

Ideally we would like to use `-fmacro-prefix-map=` feature of `gcc` as:

  -fmacro-prefix-map=/nix/store/$hash1-nlohmann-json-ver=/nix/store/eeee.eee-nlohmann-json-ver
  -fmacro-prefix-map=/nix/...

In practice it quickly exhausts the argument length limit due to a `gcc`
deficiency: https://gcc.gnu.org/PR111527

Until it's fixed let's hardcode header mangling if the $NIX_STORE variable
is present in the environment.

Tested as:

    $ printf "# 0 \"/nix/store/01234567890123456789012345678901-pppppp-vvvvvvv\" \nconst char * f(void) { return __FILE__; }" | NIX_STORE=/nix/store ./gcc/xgcc -Bgcc -x c - -S -o -
    ...
    .string "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-pppppp-vvvvvvv"
    ...

Mangled successfully.
---
 gcc/file-prefix-map.cc | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/gcc/file-prefix-map.cc b/gcc/file-prefix-map.cc
index 0e6db7c142a..da39404b9cd 100644
--- a/gcc/file-prefix-map.cc
+++ b/gcc/file-prefix-map.cc
@@ -69,6 +69,9 @@ add_prefix_map (file_prefix_map *&maps, const char *arg, const char *opt)
   maps = map;
 }
 
+/* Forward declaration for a $NIX_STORE remap hack below. */
+static file_prefix_map *macro_prefix_maps; /* -fmacro-prefix-map  */
+
 /* Perform user-specified mapping of filename prefixes.  Return the
    GC-allocated new name corresponding to FILENAME or FILENAME if no
    remapping was performed.  */
@@ -102,6 +105,29 @@ remap_filename (file_prefix_map *maps, const char *filename)
       break;
   if (!map)
     {
+      if (maps == macro_prefix_maps)
+	{
+	  /* Remap all of $NIX_STORE/.{32} paths to
+	   * equivalent $NIX_STORE/e{32}.
+	   *
+	   * That way we avoid argument parameters explosion
+	   * and still avoid embedding headers into runtime closure:
+	   *   https://gcc.gnu.org/PR111527
+	   */
+	   char * nix_store = getenv("NIX_STORE");
+	   size_t nix_store_len = nix_store ? strlen(nix_store) : 0;
+	   const char * name = realname ? realname : filename;
+	   size_t name_len = strlen(name);
+	   if (nix_store && name_len >= nix_store_len + 1 + 32 && memcmp(name, nix_store, nix_store_len) == 0)
+	     {
+		s = (char *) ggc_alloc_atomic (name_len + 1);
+		memcpy(s, name, name_len + 1);
+		memset(s + nix_store_len + 1, 'e', 32);
+		if (realname != filename)
+		  free (const_cast <char *> (realname));
+		return s;
+	     }
+	}
       if (realname != filename)
 	free (const_cast <char *> (realname));
       return filename;
@@ -124,7 +150,6 @@ remap_filename (file_prefix_map *maps, const char *filename)
    ignore it in DW_AT_producer (gen_command_line_string in opts.cc).  */
 
 /* Linked lists of file_prefix_map structures.  */
-static file_prefix_map *macro_prefix_maps; /* -fmacro-prefix-map  */
 static file_prefix_map *debug_prefix_maps; /* -fdebug-prefix-map  */
 static file_prefix_map *profile_prefix_maps; /* -fprofile-prefix-map  */
 
-- 
2.42.0



* Re: Scaling -fmacro-prefix-map= to thousands entries
  2023-10-04 21:19 Scaling -fmacro-prefix-map= to thousands entries Sergei Trofimovich
@ 2023-10-05  7:19 ` Richard Biener
  2023-10-05 11:59   ` Sergei Trofimovich
  2023-10-05 11:20 ` Ben Boeckel
  1 sibling, 1 reply; 9+ messages in thread
From: Richard Biener @ 2023-10-05  7:19 UTC
  To: Sergei Trofimovich; +Cc: gcc

On Wed, Oct 4, 2023 at 11:20 PM Sergei Trofimovich via Gcc
<gcc@gcc.gnu.org> wrote:
>
> Hi gcc developers!
>
> Tl;DR:
>
> I would like to implement a scalable way to pass `-fmacro-prefix-map=`
> for `NixOS` distribution to avoid leaking build-time paths generated by
> `__FILE__` macros used by various libraries.
>
> I need some guidance what path to take to be acceptable for `gcc`
> upstream.
>
> I have a few possible solutions and wonder what I should try to upstream
> to GCC. The options I see:
>
> 1. Hardcode NixOS-specific way to mangle paths.
>
>    Pros: simplest to implement, can be easily configured away if needed
>    Cons: inflexible, `clang` might or might not accept the same hack
>
> 2. Extend `-fmacro-prefix-map=` (or add a new `-fmacro-prefix-map-file=`)
>    to allow passing a file
>
>    Pros: still not too hard to implement, generic enough to be used in
>          other contexts.
>    Cons: Will require client to construct the map file.
>
> 3. Have more flexible `-fmacro-prefix-map-regex=` option that allows
>    patterns. Something like:
>
>       -fmacro-prefix-map-regex=/nix/store/[a-z0-9]{32}-=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-
>
>   Pros: at least for NixOS one option will be enough to cover all
>         packages as they all share above template.
>   Cons: pulls some form of regex with its can of worms including escape
>         delimiters, might not be flexible enough for other use cases.
>
> 4. Something else?
>
> Which one(s) should I take to implement?
>
> More words:
>
> `NixOS` (and `nixpkgs` repository) install every software package into
> an individual directory with unique prefix. Some examples:
>
>     /nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev
>     /nix/store/rb3q4kcyfg77cmkiwywx2aqdd3x5ch93-libmpc-1.3.1
>     /nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2
>     ...
>
> It's a fundamental design decision to allow parallel package installs.
>
> From dependency tracking standpoint it's highly undesirable to have
> these absolute paths to be hardcoded into final executable binaries if
> they are not used at runtime.
>
> Example redundant path we would like not to have in final binaries:
>
>     $ strings result/bin/nix | grep phjcmy025rd1ankw5y1b21xsdii83cyk
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/json.hpp
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/output/serializer.hpp
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/conversions/to_chars.hpp
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/lexer.hpp
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iter_impl.hpp
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/json_sax.hpp
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iteration_proxy.hpp
>     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/parser.hpp
>
> Those paths are inserted via uses of the `__FILE__` macro in glibc's
> assert() and thus hardcode header file paths from various packages
> (like lttng-ust or nlohmann/json) into compiled binaries. Sometimes
> `__FILE__` usage is more creative than assert().
>
> I would like to get rid of references to header files. I think
> `-fmacro-prefix-map=` is ideal for this particular use case.
>
> The prototype that creates equivalent of the following commands does
> work for smaller packages:
>
>     -fmacro-prefix-map=/nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.37-8-dev
>     -fmacro-prefix-map=/nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gmp-with-cxx-6.3.0-dev
>     -fmacro-prefix-map=/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-nlohmann_json-3.11.2
>     ...
>
> The above works for a small number of options (like, 100). But around 1000
> options we start hitting the linux limit on a single environment variable
> on real-world packages like `qemu` with a ton of input dependencies.
>
> The command-line limitations are in various places:
> - `gcc` limitation of lifting all command line options into a single
>   environment variable: https://gcc.gnu.org/PR111527
> - `linux` limitation of constraining a single environment variable to a
>   size well below the full available environment space:
>   https://lkml.org/lkml/2023/9/24/381
>
> A `linux` fix would buy us 50x more budget (A Lot) but it will not help
> much on other operating systems like `Darwin`, where the absolute
> environment limit is a lot lower than on `linux`.
>
> I already implemented [1.] in https://github.com/NixOS/nixpkgs/pull/255192
> (also attached `mangle-NIX_STORE-in-__FILE__.patch` 3.5K patch against
> `master` as a proof of concept).
>
> What would be the best way to scale up `-fmacro-prefix-map=` up to NixOS
> needs for `gcc`? I would like to implement something sensible I could
> upstream.
>
> What do you think?

Go for (2) which I think is the only way to truly solve the command-line
limitation issue (with less regular paths even regex wouldn't cut it).

Btw, I thought we have response files to deal with command-line limits,
why doesn't that work here?  I see the driver expands response files
but IIRC it also builds those when the command-line gets too large
and uses it for the environment and the cc1 invocation?  If it doesn't
do the latter why not fix it that way?

Richard.

> Thanks!
>
> --
>
>   Sergei


* Re: Scaling -fmacro-prefix-map= to thousands entries
  2023-10-04 21:19 Scaling -fmacro-prefix-map= to thousands entries Sergei Trofimovich
  2023-10-05  7:19 ` Richard Biener
@ 2023-10-05 11:20 ` Ben Boeckel
  2023-10-05 12:05   ` Sergei Trofimovich
  1 sibling, 1 reply; 9+ messages in thread
From: Ben Boeckel @ 2023-10-05 11:20 UTC
  To: Sergei Trofimovich; +Cc: gcc

On Wed, Oct 04, 2023 at 22:19:32 +0100, Sergei Trofimovich via Gcc wrote:
> The prototype that creates equivalent of the following commands does
> work for smaller packages:
> 
>     -fmacro-prefix-map=/nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.37-8-dev
>     -fmacro-prefix-map=/nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gmp-with-cxx-6.3.0-dev
>     -fmacro-prefix-map=/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-nlohmann_json-3.11.2
>     ...
> 
> The above works for a small number of options (like, 100). But around 1000
> options we start hitting the linux limit on a single environment variable
> on real-world packages like `qemu` with a ton of input dependencies.

Are you trying to pass this through via `CFLAGS` and friends?

> The command-line limitations are in various places:
> - `gcc` limitation of lifting all command line options into a single
>   environment variable: https://gcc.gnu.org/PR111527
> - `linux` limitation of constraining a single environment variable to a
>   size well below the full available environment space:
>   https://lkml.org/lkml/2023/9/24/381
> 
> A `linux` fix would buy us 50x more budget (A Lot) but it will not help
> much on other operating systems like `Darwin`, where the absolute
> environment limit is a lot lower than on `linux`.
> 
> I already implemented [1.] in https://github.com/NixOS/nixpkgs/pull/255192
> (also attached `mangle-NIX_STORE-in-__FILE__.patch` 3.5K patch against
> `master` as a proof of concept).
> 
> What would be the best way to scale up `-fmacro-prefix-map=` up to NixOS
> needs for `gcc`? I would like to implement something sensible I could
> upstream.

How about `CFLAGS=@macro_prefix_map.args`, writing that file in the
same codepath where you generate the flags today? It'll work with just
about every compiler, and tools like `ccache` will understand that it is
an input that affects the build and properly take the file's contents
into account.
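
A minimal Python sketch of that idea, for illustration only:
`write_response_file` is a hypothetical helper, and the one-option-per-line
layout is a choice made here (response files only need the options to be
separated by whitespace). Options containing whitespace, quotes or
backslashes would need escaping, so the sketch simply rejects them.

```python
def write_response_file(flags, path):
    """Write one option per line to `path` and return the @file argument."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for flag in flags:
            # Keep the sketch simple: refuse anything that would need quoting.
            if any(ch.isspace() or ch in "'\"\\" for ch in flag):
                raise ValueError(f"option needs quoting: {flag!r}")
            f.write(flag + "\n")
    return "@" + path
```

The returned string is what would go into `CFLAGS`; the sketch says nothing
about what the compiler driver does with the options after reading the file.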

--Ben


* Re: Scaling -fmacro-prefix-map= to thousands entries
  2023-10-05  7:19 ` Richard Biener
@ 2023-10-05 11:59   ` Sergei Trofimovich
  2023-10-05 12:05     ` Richard Biener
                       ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Sergei Trofimovich @ 2023-10-05 11:59 UTC
  To: Richard Biener; +Cc: gcc

On Thu, Oct 05, 2023 at 09:19:15AM +0200, Richard Biener wrote:
> On Wed, Oct 4, 2023 at 11:20 PM Sergei Trofimovich via Gcc
> <gcc@gcc.gnu.org> wrote:
> >
> > Hi gcc developers!
> >
> > Tl;DR:
> >
> > I would like to implement a scalable way to pass `-fmacro-prefix-map=`
> > for `NixOS` distribution to avoid leaking build-time paths generated by
> > `__FILE__` macros used by various libraries.
> >
> > I need some guidance what path to take to be acceptable for `gcc`
> > upstream.
> >
> > I have a few possible solutions and wonder what I should try to upstream
> > to GCC. The options I see:
> >
> > 1. Hardcode NixOS-specific way to mangle paths.
> >
> >    Pros: simplest to implement, can be easily configured away if needed
> >    Cons: inflexible, `clang` might or might not accept the same hack
> >
> > 2. Extend `-fmacro-prefix-map=` (or add a new `-fmacro-prefix-map-file=`)
> >    to allow passing a file
> >
> >    Pros: still not too hard to implement, generic enough to be used in
> >          other contexts.
> >    Cons: Will require client to construct the map file.
> >
> > 3. Have more flexible `-fmacro-prefix-map-regex=` option that allows
> >    patterns. Something like:
> >
> >       -fmacro-prefix-map-regex=/nix/store/[a-z0-9]{32}-=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-
> >
> >   Pros: at least for NixOS one option will be enough to cover all
> >         packages as they all share above template.
> >   Cons: pulls some form of regex with its can of worms including escape
> >         delimiters, might not be flexible enough for other use cases.
> >
> > 4. Something else?
> >
> > Which one(s) should I take to implement?
> >
> > More words:
> >
> > `NixOS` (and `nixpkgs` repository) install every software package into
> > an individual directory with unique prefix. Some examples:
> >
> >     /nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev
> >     /nix/store/rb3q4kcyfg77cmkiwywx2aqdd3x5ch93-libmpc-1.3.1
> >     /nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2
> >     ...
> >
> > It's a fundamental design decision to allow parallel package installs.
> >
> > From dependency tracking standpoint it's highly undesirable to have
> > these absolute paths to be hardcoded into final executable binaries if
> > they are not used at runtime.
> >
> > Example redundant path we would like not to have in final binaries:
> >
> >     $ strings result/bin/nix | grep phjcmy025rd1ankw5y1b21xsdii83cyk
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/json.hpp
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/output/serializer.hpp
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/conversions/to_chars.hpp
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/lexer.hpp
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iter_impl.hpp
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/json_sax.hpp
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iteration_proxy.hpp
> >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/parser.hpp
> >
> > Those paths are inserted via uses of the `__FILE__` macro in glibc's
> > assert() and thus hardcode header file paths from various packages
> > (like lttng-ust or nlohmann/json) into compiled binaries. Sometimes
> > `__FILE__` usage is more creative than assert().
> >
> > I would like to get rid of references to header files. I think
> > `-fmacro-prefix-map=` is ideal for this particular use case.
> >
> > The prototype that creates equivalent of the following commands does
> > work for smaller packages:
> >
> >     -fmacro-prefix-map=/nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.37-8-dev
> >     -fmacro-prefix-map=/nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gmp-with-cxx-6.3.0-dev
> >     -fmacro-prefix-map=/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-nlohmann_json-3.11.2
> >     ...
> >
> > The above works for a small number of options (like, 100). But around 1000
> > options we start hitting the linux limit on a single environment variable
> > on real-world packages like `qemu` with a ton of input dependencies.
> >
> > The command-line limitations are in various places:
> > - `gcc` limitation of lifting all command line options into a single
> >   environment variable: https://gcc.gnu.org/PR111527
> > - `linux` limitation of constraining a single environment variable to a
> >   size well below the full available environment space:
> >   https://lkml.org/lkml/2023/9/24/381
> >
> > A `linux` fix would buy us 50x more budget (A Lot) but it will not help
> > much on other operating systems like `Darwin`, where the absolute
> > environment limit is a lot lower than on `linux`.
> >
> > I already implemented [1.] in https://github.com/NixOS/nixpkgs/pull/255192
> > (also attached `mangle-NIX_STORE-in-__FILE__.patch` 3.5K patch against
> > `master` as a proof of concept).
> >
> > What would be the best way to scale up `-fmacro-prefix-map=` up to NixOS
> > needs for `gcc`? I would like to implement something sensible I could
> > upstream.
> >
> > What do you think?
> 
> Go for (2) which I think is the only way to truly solve the command-line
> limitation issue (with less regular paths even regex wouldn't cut it).

Sounds good. Do you have any preference for a specific syntax? My
suggestions would be:

1. `-fmacro-prefix-map=file-name`: if `file-name` is not in `key=val`
   format, treat it as a file
2. `-fmacro-prefix-map=@file-name`: use `@` as a signal to use a file
3. `-fmacro-prefix-map-file=file-name`: use a new option
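
Whichever spelling is chosen, the file contents still need a format. As a
purely hypothetical illustration (nothing here is an agreed GCC format),
the Python sketch below assumes one `OLD=NEW` pair per line, split at the
first `=`, with blank lines ignored and entries tried in file order.

```python
def parse_prefix_map_file(text):
    """Parse OLD=NEW lines (hypothetical format) into (old, new) pairs."""
    maps = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        old, sep, new = line.partition("=")  # split at the first '='
        if not sep or not old:
            raise ValueError(f"line {lineno}: expected OLD=NEW")
        maps.append((old, new))
    return maps


def remap(filename, maps):
    """Replace the first matching OLD prefix with NEW (file order is assumed)."""
    for old, new in maps:
        if filename.startswith(old):
            return new + filename[len(old):]
    return filename
```

A real implementation would also have to decide how to express paths that
contain `=` or newlines, which this sketch cannot represent.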

> Btw, I thought we have response files to deal with command-line limits,
> why doesn't that work here?  I see the driver expands response files
> but IIRC it also builds those when the command-line gets too large
> and uses it for the environment and the cc1 invocation?  If it doesn't
> do the latter why not fix it that way?

Yeah, in theory response files would extend the limit. In practice `gcc`
always expands response files internally into a single
`COLLECT_GCC_OPTIONS` environment variable and hits the environment
variable limit very early:

    https://gcc.gnu.org/PR111527

Example reproducer:

    $ for i in `seq 1 1000`; do printf -- "-fmacro-prefix-map=%0*d=%0*d\n" 200 1 200 2; done > a.rsp
    $ touch a.c; gcc @a.rsp -c a.c
    gcc: fatal error: cannot execute 'cc1': execv: Argument list too long
    compilation terminated.

And if you want to look at the gory details:

    $ strace -f -etrace=execve -s 1000000 -v -v -v  gcc @a.rsp -c a.c
    ...
    [pid    78] execve("cc1", ["cc1", "-quiet", "a.c", "-quiet", "-dumpbase", "a.c", "-dumpbase-ext", ".c", "-mtune=generic", "-march=x86-64",
    "-fmacro-prefix-map=...=...",
    "-fmacro-prefix-map=...=...",
    ...],
    [...,
     "COLLECT_GCC=gcc",
     "COLLECT_GCC_OPTIONS='-fmacro-prefix-map=...=...' '-fmacro-prefix-map=...=...' ... '-c' '-mtune=generic' '-march=x86-64'"]) = -1 E2BIG (Argument list too long)

Note how `gcc` not only expands the response file into an argument list
(that is not too bad) but also duplicates the whole list as a single
`COLLECT_GCC_OPTIONS=...` environment variable with added quoting on
top.
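
To put rough numbers on this, here is a small Python sketch that rebuilds
the reproducer's options and estimates the size of the resulting
`COLLECT_GCC_OPTIONS` string. It rests on two assumptions: that Linux caps
a single argument or environment string at `MAX_ARG_STRLEN`, i.e. 32 pages
(131072 bytes with 4 KiB pages), and that each option is wrapped in single
quotes and separated by one space as in the strace output above. The other
options in the variable are ignored, so the result is a lower bound.

```python
# Assumption: per-string limit on Linux with 4 KiB pages (32 pages).
MAX_ARG_STRLEN = 32 * 4096


def reproducer_flags(count=1000, width=200):
    """Same options as the shell loop: zero-padded 1 and 2, `width` digits each."""
    old = str(1).zfill(width)
    new = str(2).zfill(width)
    return [f"-fmacro-prefix-map={old}={new}" for _ in range(count)]


def collect_gcc_options_size(flags):
    """Length of COLLECT_GCC_OPTIONS=... holding only the given flags."""
    value = " ".join(f"'{flag}'" for flag in flags)
    return len("COLLECT_GCC_OPTIONS=" + value)


if __name__ == "__main__":
    size = collect_gcc_options_size(reproducer_flags())
    print(size, size > MAX_ARG_STRLEN)
```

With the reproducer's 1000 options the variable alone comes to a bit over
400 KiB, roughly three times the assumed per-string cap.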

Would be nice if `gcc` just passed response files around as is :)

-- 

  Sergei


* Re: Scaling -fmacro-prefix-map= to thousands entries
  2023-10-05 11:20 ` Ben Boeckel
@ 2023-10-05 12:05   ` Sergei Trofimovich
  0 siblings, 0 replies; 9+ messages in thread
From: Sergei Trofimovich @ 2023-10-05 12:05 UTC
  To: Ben Boeckel; +Cc: gcc

On Thu, Oct 05, 2023 at 07:20:35AM -0400, Ben Boeckel wrote:
> On Wed, Oct 04, 2023 at 22:19:32 +0100, Sergei Trofimovich via Gcc wrote:
> > The prototype that creates equivalent of the following commands does
> > work for smaller packages:
> > 
> >     -fmacro-prefix-map=/nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.37-8-dev
> >     -fmacro-prefix-map=/nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gmp-with-cxx-6.3.0-dev
> >     -fmacro-prefix-map=/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-nlohmann_json-3.11.2
> >     ...
> > 
> > The above works for a small number of options (like, 100). But around 1000
> > options we start hitting the linux limit on a single environment variable
> > on real-world packages like `qemu` with a ton of input dependencies.
> 
> Are you trying to pass this through via `CFLAGS` and friends?

Roughly via `CFLAGS`. `nixpkgs` uses its own private
`NIX_CFLAGS_COMPILE` variable, which gets expanded in the `gcc-wrapper`
shell wrapper into an explicit list of arguments to the real `gcc` binary.
It's almost like `CFLAGS` but is expected to be transparent to most build
systems.

> > The command-line limitations are in various places:
> > - `gcc` limitation of lifting all command line options into a single
> >   environment variable: https://gcc.gnu.org/PR111527
> > - `linux` limitation of constraining a single environment variable to a
> >   size well below the full available environment space:
> >   https://lkml.org/lkml/2023/9/24/381
> > 
> > A `linux` fix would buy us 50x more budget (A Lot) but it will not help
> > much on other operating systems like `Darwin`, where the absolute
> > environment limit is a lot lower than on `linux`.
> > 
> > I already implemented [1.] in https://github.com/NixOS/nixpkgs/pull/255192
> > (also attached `mangle-NIX_STORE-in-__FILE__.patch` 3.5K patch against
> > `master` as a proof of concept).
> > 
> > What would be the best way to scale up `-fmacro-prefix-map=` up to NixOS
> > needs for `gcc`? I would like to implement something sensible I could
> > upstream.
> 
> How about `CFLAGS=@macro_prefix_map.args` and writing that file in the
> same codepath where you generate the flags today. It'll work with just
> about every compiler and tools like `ccache` will understand that it is
> an input that affects the build and properly take the file's contents
> into account.

That was my initial attempt. I'll duplicate my response from 
https://gcc.gnu.org/pipermail/gcc/2023-October/242639.html here as is:

"""
Yeah, in theory response files would extend the limit. In practice `gcc`
always expands response files internally into a single
`COLLECT_GCC_OPTIONS` environment variable and hits the environment
variable limit very early:

    https://gcc.gnu.org/PR111527

Example reproducer:

    $ for i in `seq 1 1000`; do printf -- "-fmacro-prefix-map=%0*d=%0*d\n" 200 1 200 2; done > a.rsp
    $ touch a.c; gcc @a.rsp -c a.c
    gcc: fatal error: cannot execute 'cc1': execv: Argument list too long
    compilation terminated.

And if you want to look at the gory details:

    $ strace -f -etrace=execve -s 1000000 -v -v -v  gcc @a.rsp -c a.c
    ...
    [pid    78] execve("cc1", ["cc1", "-quiet", "a.c", "-quiet", "-dumpbase", "a.c", "-dumpbase-ext", ".c", "-mtune=generic", "-march=x86-64",
    "-fmacro-prefix-map=...=...",
    "-fmacro-prefix-map=...=...",
    ...],
    [...,
     "COLLECT_GCC=gcc",
     "COLLECT_GCC_OPTIONS='-fmacro-prefix-map=...=...' '-fmacro-prefix-map=...=...' ... '-c' '-mtune=generic' '-march=x86-64'"]) = -1 E2BIG (Argument list too long)

Note how `gcc` not only expands the response file into an argument list
(that is not too bad) but also duplicates the whole list as a single
`COLLECT_GCC_OPTIONS=...` environment variable with added quoting on
top.

Would be nice if `gcc` just passed response files around as is :)
"""

-- 

  Sergei


* Re: Scaling -fmacro-prefix-map= to thousands entries
  2023-10-05 11:59   ` Sergei Trofimovich
@ 2023-10-05 12:05     ` Richard Biener
  2023-10-05 12:14     ` Arsen Arsenović
  2023-10-05 15:59     ` Ben Boeckel
  2 siblings, 0 replies; 9+ messages in thread
From: Richard Biener @ 2023-10-05 12:05 UTC
  To: Sergei Trofimovich; +Cc: gcc

On Thu, Oct 5, 2023 at 1:59 PM Sergei Trofimovich <slyich@gmail.com> wrote:
>
> On Thu, Oct 05, 2023 at 09:19:15AM +0200, Richard Biener wrote:
> > On Wed, Oct 4, 2023 at 11:20 PM Sergei Trofimovich via Gcc
> > <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi gcc developers!
> > >
> > > Tl;DR:
> > >
> > > I would like to implement a scalable way to pass `-fmacro-prefix-map=`
> > > for `NixOS` distribution to avoid leaking build-time paths generated by
> > > `__FILE__` macros used by various libraries.
> > >
> > > I need some guidance what path to take to be acceptable for `gcc`
> > > upstream.
> > >
> > > I have a few possible solutions and wonder what I should try to upstream
> > > to GCC. The options I see:
> > >
> > > 1. Hardcode NixOS-specific way to mangle paths.
> > >
> > >    Pros: simplest to implement, can be easily configured away if needed
> > >    Cons: inflexible, `clang` might or might not accept the same hack
> > >
> > > 2. Extend `-fmacro-prefix-map=` (or add a new `-fmacro-prefix-map-file=`)
> > >    to allow passing a file
> > >
> > >    Pros: still not too hard to implement, generic enough to be used in
> > >          other contexts.
> > >    Cons: Will require client to construct the map file.
> > >
> > > 3. Have more flexible `-fmacro-prefix-map-regex=` option that allows
> > >    patterns. Something like:
> > >
> > >       -fmacro-prefix-map-regex=/nix/store/[a-z0-9]{32}-=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-
> > >
> > >   Pros: at least for NixOS one option will be enough to cover all
> > >         packages as they all share above template.
> > >   Cons: pulls in some form of regex with its can of worms, including escape
> > >         delimiters, might not be flexible enough for other use cases.
> > >
> > > 4. Something else?
> > >
> > > Which one(s) should I take to implement?
> > >
> > > More words:
> > >
> > > `NixOS` (and `nixpkgs` repository) install every software package into
> > > an individual directory with unique prefix. Some examples:
> > >
> > >     /nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev
> > >     /nix/store/rb3q4kcyfg77cmkiwywx2aqdd3x5ch93-libmpc-1.3.1
> > >     /nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2
> > >     ...
> > >
> > > It's a fundamental design decision to allow parallel package installs.
> > >
> > > From a dependency tracking standpoint it's highly undesirable to have
> > > these absolute paths hardcoded into final executable binaries if
> > > they are not used at runtime.
> > >
> > > Example redundant path we would like not to have in final binaries:
> > >
> > >     $ strings result/bin/nix | grep phjcmy025rd1ankw5y1b21xsdii83cyk
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/json.hpp
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/output/serializer.hpp
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/conversions/to_chars.hpp
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/lexer.hpp
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iter_impl.hpp
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/json_sax.hpp
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/iterators/iteration_proxy.hpp
> > >     /nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2/include/nlohmann/detail/input/parser.hpp
> > >
> > > Those paths are inserted via glibc's assert() uses of `__FILE__`
> > > directive and thus hardcode header file paths from various packages
> > > (like lttng-ust or nlohmann/json) into compiled binaries. Sometimes
> > > `__FILE__` usage is more creative than assert().
> > >
> > > I would like to get rid of references to header files. I think
> > > `-fmacro-prefix-map=` is ideal for this particular use case.
> > >
> > > The prototype that creates equivalent of the following commands does
> > > work for smaller packages:
> > >
> > >     -fmacro-prefix-map=/nix/store/y8wfrgk7br5rfz4221lfb9v8w3n0cnyd-glibc-2.37-8-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.37-8-dev
> > >     -fmacro-prefix-map=/nix/store/8n240jfdmsb3lnc2qa2vb9dwk638j1lp-gmp-with-cxx-6.3.0-dev=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gmp-with-cxx-6.3.0-dev
> > >     -fmacro-prefix-map=/nix/store/phjcmy025rd1ankw5y1b21xsdii83cyk-nlohmann_json-3.11.2=/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-nlohmann_json-3.11.2
> > >     ...
> > >
> > > The above works for a small number of options (like, 100). But at around
> > > 1000 options we start hitting Linux limits on a single environment variable
> > > for real-world packages like `qemu` with a ton of input dependencies.
> > >
> > > The command-line limitations are in various places:
> > > - `gcc` limitation of lifting all command line options into a single
> > >   environment variable: https://gcc.gnu.org/PR111527
> > > - `linux` limitation of constraining a single environment variable to a
> > >   value way below the full available environment space:
> > >   https://lkml.org/lkml/2023/9/24/381
> > >
> > > `linux` fix would buy us 50x more budget (A Lot) but it will not help
> > > much other operating systems like `Darwin` where absolute environment
> > > limit is a lot lower than `linux`.
> > >
> > > I already implemented [1.] in https://github.com/NixOS/nixpkgs/pull/255192
> > > (also attached `mangle-NIX_STORE-in-__FILE__.patch` 3.5K patch against
> > > `master` as a proof of concept).
> > >
> > > What would be the best way to scale up `-fmacro-prefix-map=` up to NixOS
> > > needs for `gcc`? I would like to implement something sensible I could
> > > upstream.
> > >
> > > What do you think?
> >
> > Go for (2) which I think is the only way to truly solve the command-line
> > limitation issue (with less regular paths even regex wouldn't cut it).
>
> Sounds good. Do you have any preference over specific syntax? My
> suggestions would be:
>
> 1. `-fmacro-prefix-map=file-name`: if `file-name` is not in `key=val`
>    format, treat it as a file
> 2. `-fmacro-prefix-map=@file-name`: use @ as a signal to use a file
> 3. `-fmacro-prefix-map-file=file-name`: use a new option

I'd prefer (2)
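
A minimal sketch of what applying such a map file could look like.
Everything here is an assumption for illustration: `-fmacro-prefix-map=@file`
is only a proposal in this thread, the one-`old=new`-per-line file format
is made up, and so are the `remap_path` name, the split at the first `=`,
and the first-match-wins rule.

```shell
# remap_path MAPFILE PATH
# Print PATH with the first matching "old=new" prefix from MAPFILE
# replaced (hypothetical file format: one old=new pair per line).
remap_path() {
    map_file=$1
    path=$2
    while IFS= read -r line; do
        case $line in
            *=*) ;;            # usable entry
            *) continue ;;     # skip blank or malformed lines
        esac
        old=${line%%=*}        # text before the first '='
        new=${line#*=}         # text after the first '='
        case $path in
            "$old"*)
                rest=${path#"$old"}
                printf '%s\n' "$new$rest"
                return 0
                ;;
        esac
    done < "$map_file"
    printf '%s\n' "$path"      # no entry matched: leave the path alone
}
```

With a map file holding the nlohmann_json pair from the original mail,
`remap_path map.txt <store path>/include/nlohmann/json.hpp` would print
the same header path under the `eeee...` prefix, and a path outside the
map comes back unchanged.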

> > Btw, I thought we have response files to deal with command-line limits,
> > why doesn't that work here?  I see the driver expands response files
> > but IIRC it also builds those when the command-line gets too large
> > and uses it for the environment and the cc1 invocation?  If it doesn't
> > do the latter why not fix it that way?
>
> Yeah, in theory response files would extend the limit. In practice `gcc`
> always extends response files internally into a single
> `COLLECT_GCC_OPTIONS` option and hits the environment variable limit
> very early:
>
>     https://gcc.gnu.org/PR111527
>
> Example reproducer:
>
>     $ for i in `seq 1 1000`; do printf -- "-fmacro-prefix-map=%0*d=%0*d\n" 200 1 200 2; done > a.rsp
>     $ touch a.c; gcc @a.rsp -c a.c
>     gcc: fatal error: cannot execute 'cc1': execv: Argument list too long
>     compilation terminated.
>
> And if you want to look at the gory details:
>
>     $ strace -f -etrace=execve -s 1000000 -v -v -v  gcc @a.rsp -c a.c
>     ...
>     [pid    78] execve("cc1", ["cc1", "-quiet", "a.c", "-quiet", "-dumpbase", "a.c", "-dumpbase-ext", ".c", "-mtune=generic", "-march=x86-64",
>     "-fmacro-prefix-map=...=...",
>     "-fmacro-prefix-map=...=...",
>     ...],
>     [...,
>      "COLLECT_GCC=gcc",
>      "COLLECT_GCC_OPTIONS='-fmacro-prefix-map=...=...' '-fmacro-prefix-map=...=...' ... '-c' '-mtune=generic' '-march=x86-64'"]) = -1 E2BIG (Argument list too long)
>
> Note how `gcc` not only expands response file into an argument list
> (that is not too bad) but also duplicates the whole list as a single
> `COLLECT_GCC_OPTIONS=...` environment variable with added quoting on
> top.
>
> Would be nice if `gcc` just passed response files around as is :)

That's not possible in general since specs processing can alter the
command-line.  What it could do is create an alternate response file
with (all?) arguments when a certain limit is exceeded (or the original
command-line included response files).  That could be referenced from
COLLECT_GCC_OPTIONS as well but of course that would require patching
all COLLECT_GCC_OPTIONS consumers (for example lto-wrapper doesn't
handle response files there).  So it's not even a half-way solution
(unless the env limit is way higher).
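
The "alternate response file past a certain limit" idea can be sketched
roughly as below. This is not what the driver does today; the
`spill_if_long` name, the explicit limit parameter and the
one-argument-per-line file layout are assumptions, and the sketch ignores
quoting of arguments that contain whitespace.

```shell
# spill_if_long LIMIT RSPFILE ARG...
# Print the arguments on one line when they fit within LIMIT characters;
# otherwise write them to RSPFILE (one per line) and print "@RSPFILE".
spill_if_long() {
    limit=$1
    rsp=$2
    shift 2
    joined="$*"                      # arguments joined by single spaces
    if [ "${#joined}" -le "$limit" ]; then
        printf '%s\n' "$joined"
    else
        printf '%s\n' "$@" > "$rsp"  # one argument per line
        printf '@%s\n' "$rsp"
    fi
}
```

Every consumer of the resulting string would still have to understand
the `@file` form, which is exactly the patching cost mentioned above.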

Richard.

> --
>
>   Sergei

* Re: Scaling -fmacro-prefix-map= to thousands entries
  2023-10-05 11:59   ` Sergei Trofimovich
  2023-10-05 12:05     ` Richard Biener
@ 2023-10-05 12:14     ` Arsen Arsenović
  2023-10-06  6:55       ` Richard Biener
  2023-10-05 15:59     ` Ben Boeckel
  2 siblings, 1 reply; 9+ messages in thread
From: Arsen Arsenović @ 2023-10-05 12:14 UTC (permalink / raw)
  To: Sergei Trofimovich; +Cc: Richard Biener, gcc

Hi,

Sergei Trofimovich via Gcc <gcc@gcc.gnu.org> writes:

> Sounds good. Do you have any preference over specific syntax? My
> suggestions would be:
>
> 1. `-fmacro-prefix-map=file-name`: if `file-name` is not in `key=val`
>    format, treat it as a file
> 2. `-fmacro-prefix-map=@file-name`: use @ as a signal to use a file
> 3. `-fmacro-prefix-map-file=file-name`: use a new option
>
>> Btw, I thought we have response files to deal with command-line limits,
>> why doesn't that work here?  I see the driver expands response files
>> but IIRC it also builds those when the command-line gets too large
>> and uses it for the environment and the cc1 invocation?  If it doesn't
>> do the latter why not fix it that way?
>
> Yeah, in theory response files would extend the limit. In practice `gcc`
> always extends response files internally into a single
> `COLLECT_GCC_OPTIONS` option and hits the environment variable limit
> very early:
>
>     https://gcc.gnu.org/PR111527

Doesn't it make more sense to fix this?  The issue is more general than
just this option (even if manifesting like so today).

Can the COLLECT_GCC_OPTIONS consumers deal with argfiles?

> Example reproducer:
>
>     $ for i in `seq 1 1000`; do printf -- "-fmacro-prefix-map=%0*d=%0*d\n" 200 1 200 2; done > a.rsp
>     $ touch a.c; gcc @a.rsp -c a.c
>     gcc: fatal error: cannot execute 'cc1': execv: Argument list too long
>     compilation terminated.
>
> And if you want to look at the gory details:
>
>     $ strace -f -etrace=execve -s 1000000 -v -v -v  gcc @a.rsp -c a.c
>     ...
>     [pid    78] execve("cc1", ["cc1", "-quiet", "a.c", "-quiet", "-dumpbase", "a.c", "-dumpbase-ext", ".c", "-mtune=generic", "-march=x86-64",
>     "-fmacro-prefix-map=...=...",
>     "-fmacro-prefix-map=...=...",
>     ...],
>     [...,
>      "COLLECT_GCC=gcc",
>      "COLLECT_GCC_OPTIONS='-fmacro-prefix-map=...=...' '-fmacro-prefix-map=...=...' ... '-c' '-mtune=generic' '-march=x86-64'"]) = -1 E2BIG (Argument list too long)
>
> Note how `gcc` not only expands response file into an argument list
> (that is not too bad) but also duplicates the whole list as a single
> `COLLECT_GCC_OPTIONS=...` environment variable with added quoting on
> top.
>
> Would be nice if `gcc` just passed response files around as is :)


-- 
Arsen Arsenović


* Re: Scaling -fmacro-prefix-map= to thousands entries
  2023-10-05 11:59   ` Sergei Trofimovich
  2023-10-05 12:05     ` Richard Biener
  2023-10-05 12:14     ` Arsen Arsenović
@ 2023-10-05 15:59     ` Ben Boeckel
  2 siblings, 0 replies; 9+ messages in thread
From: Ben Boeckel @ 2023-10-05 15:59 UTC (permalink / raw)
  To: Sergei Trofimovich; +Cc: Richard Biener, gcc

On Thu, Oct 05, 2023 at 12:59:21 +0100, Sergei Trofimovich via Gcc wrote:
> Sounds good. Do you have any preference over specific syntax? My
> suggestions would be:
> 
> 1. `-fmacro-prefix-map=file-name`: if `file-name` is not in `key=val`
>    format, treat it as a file
> 2. `-fmacro-prefix-map=@file-name`: use @ as a signal to use a file
> 3. `-fmacro-prefix-map-file=file-name`: use a new option

Whatever is picked, please let `ccache` and `sccache` know so that they
can include this argument's contents in their hashes. `distcc` and other
distributed build tools need to know to send this file to wherever
compilation is happening as well (though they may actually not care as
they may send around preprocessed source?).
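
To illustrate the caching concern: a cache key has to cover the contents
of the map file, not just its name on the command line. The sketch below
is not how `ccache` or `sccache` compute their hashes; `cache_key` is a
hypothetical helper and `cksum` merely stands in for a real hash
function.

```shell
# cache_key MAPFILE ARG...
# Print a checksum over the compiler arguments plus the contents of
# MAPFILE, so that editing the map file changes the key.
cache_key() {
    map_file=$1
    shift
    { printf '%s\n' "$@"; cat "$map_file"; } | cksum | cut -d' ' -f1
}
```

Two invocations with identical arguments but different map file contents
then produce different keys, which is the property a compiler cache
needs here.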

Thanks,

--Ben

* Re: Scaling -fmacro-prefix-map= to thousands entries
  2023-10-05 12:14     ` Arsen Arsenović
@ 2023-10-06  6:55       ` Richard Biener
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Biener @ 2023-10-06  6:55 UTC (permalink / raw)
  To: Arsen Arsenović; +Cc: Sergei Trofimovich, gcc

On Thu, Oct 5, 2023 at 2:17 PM Arsen Arsenović <arsen@aarsen.me> wrote:
>
> Hi,
>
> Sergei Trofimovich via Gcc <gcc@gcc.gnu.org> writes:
>
> > Sounds good. Do you have any preference over specific syntax? My
> > suggestions would be:
> >
> > 1. `-fmacro-prefix-map=file-name`: if `file-name` is not in `key=val`
> >    format, treat it as a file
> > 2. `-fmacro-prefix-map=@file-name`: use @ as a signal to use a file
> > 3. `-fmacro-prefix-map-file=file-name`: use a new option
> >
> >> Btw, I thought we have response files to deal with command-line limits,
> >> why doesn't that work here?  I see the driver expands response files
> >> but IIRC it also builds those when the command-line gets too large
> >> and uses it for the environment and the cc1 invocation?  If it doesn't
> >> do the latter why not fix it that way?
> >
> > Yeah, in theory response files would extend the limit. In practice `gcc`
> > always extends response files internally into a single
> > `COLLECT_GCC_OPTIONS` option and hits the environment variable limit
> > very early:
> >
> >     https://gcc.gnu.org/PR111527
>
> Doesn't it make more sense to fix this?  The issue is more general than
> just this option (even if manifesting like so today).
>
> Can the COLLECT_GCC_OPTIONS consumers deal with argfiles?

No.  The traditional consumer is collect2, nowadays the main consumer
is lto-wrapper and the lto-plugin.  There's
parse_options_from_collect_gcc_options but for example collect2 does
its own parsing.  There might be other tools out in the wild
interpreting COLLECT_GCC_OPTIONS.

We might make life of the tools easy if either all of
COLLECT_GCC_OPTIONS is fully expanded or it is a single @file
"argument" (but with otherwise identical, quoted content).
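
On the consumer side, the "fully expanded or a single @file" rule could
look roughly like this. It is a sketch of the suggestion only, not
existing GCC behaviour, and `collect_options_text` is a hypothetical
helper name.

```shell
# collect_options_text VALUE
# Print the quoted option string a consumer should parse: the contents
# of the file when VALUE is a single "@file", otherwise VALUE itself.
collect_options_text() {
    case $1 in
        @*) cat "${1#@}" ;;          # single @file form: read the file
        *)  printf '%s\n' "$1" ;;    # already fully expanded
    esac
}
```

Because the file would hold the same quoted content as the variable, a
tool could feed either result to its existing parser unchanged.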

Richard.

> > Example reproducer:
> >
> >     $ for i in `seq 1 1000`; do printf -- "-fmacro-prefix-map=%0*d=%0*d\n" 200 1 200 2; done > a.rsp
> >     $ touch a.c; gcc @a.rsp -c a.c
> >     gcc: fatal error: cannot execute 'cc1': execv: Argument list too long
> >     compilation terminated.
> >
> > And if you want to look at the gory details:
> >
> >     $ strace -f -etrace=execve -s 1000000 -v -v -v  gcc @a.rsp -c a.c
> >     ...
> >     [pid    78] execve("cc1", ["cc1", "-quiet", "a.c", "-quiet", "-dumpbase", "a.c", "-dumpbase-ext", ".c", "-mtune=generic", "-march=x86-64",
> >     "-fmacro-prefix-map=...=...",
> >     "-fmacro-prefix-map=...=...",
> >     ...],
> >     [...,
> >      "COLLECT_GCC=gcc",
> >      "COLLECT_GCC_OPTIONS='-fmacro-prefix-map=...=...' '-fmacro-prefix-map=...=...' ... '-c' '-mtune=generic' '-march=x86-64'"]) = -1 E2BIG (Argument list too long)
> >
> > Note how `gcc` not only expands response file into an argument list
> > (that is not too bad) but also duplicates the whole list as a single
> > `COLLECT_GCC_OPTIONS=...` environment variable with added quoting on
> > top.
> >
> > Would be nice if `gcc` just passed response files around as is :)
>
>
> --
> Arsen Arsenović

Thread overview: 9+ messages
2023-10-04 21:19 Scaling -fmacro-prefix-map= to thousands entries Sergei Trofimovich
2023-10-05  7:19 ` Richard Biener
2023-10-05 11:59   ` Sergei Trofimovich
2023-10-05 12:05     ` Richard Biener
2023-10-05 12:14     ` Arsen Arsenović
2023-10-06  6:55       ` Richard Biener
2023-10-05 15:59     ` Ben Boeckel
2023-10-05 11:20 ` Ben Boeckel
2023-10-05 12:05   ` Sergei Trofimovich
