public inbox for fortran@gcc.gnu.org
 help / color / mirror / Atom feed
From: Ben Boeckel <ben.boeckel@kitware.com>
To: Jason Merrill <jason@redhat.com>
Cc: gcc-patches@gcc.gnu.org, nathan@acm.org, fortran@gcc.gnu.org,
	gcc@gcc.gnu.org, brad.king@kitware.com
Subject: Re: [PATCH v5 3/5] p1689r5: initial support
Date: Tue, 20 Jun 2023 15:46:49 -0400	[thread overview]
Message-ID: <20230620194649.GA186848@farprobe> (raw)
In-Reply-To: <77ee5db6-e45e-5178-4807-5b2fef29e8c7@redhat.com>

On Tue, Feb 14, 2023 at 16:50:27 -0500, Jason Merrill wrote:
> On 1/25/23 13:06, Ben Boeckel wrote:
> > - header-unit information fields
> > 
> > Header units (including the standard library headers) are 100%
> > unsupported right now because the `-E` mechanism wants to import their
> > BMIs. A new mode (i.e., something more workable than existing `-E`
> > behavior) that mocks up header units as if they were imported purely
> > from their path and content would be required.
> 
> I notice that the cpp dependency generation tries (in open_file_failed) 
> to continue after encountering a missing file, is that not sufficient 
> for header units?  Or adjustable to be sufficient?

No. Header units can introduce macros which can be used to modify the
set of modules that are imported. Included headers are "discovered"
dependencies and don't modify the build graph (just add more files that
trigger a rebuild) and can be collected during compilation. Module
dependencies are needed to get the build correct in the first place in
order to:

- order module compilations in the build graph so that imported modules
  are ready before anything using them; and
- computing the set of flags needed for telling the compiler where
  imported modules' CMI files should be located.

> > - non-utf8 paths
> > 
> > The current standard says that paths that are not unambiguously
> > represented using UTF-8 are not supported (because these cases are rare
> > and the extra complication is not worth it at this time). Future
> > versions of the format might have ways of encoding non-UTF-8 paths. For
> > now, this patch just doesn't support non-UTF-8 paths (ignoring the
> > "unambiguously represetable in UTF-8" case).
> 
> typo "representable"

Fixed.

> > diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> > index c68a2a27469..1c14ce3fe8e 100644
> > --- a/gcc/c-family/c-opts.cc
> > +++ b/gcc/c-family/c-opts.cc
> > @@ -77,6 +77,9 @@ static bool verbose;
> >   /* Dependency output file.  */
> >   static const char *deps_file;
> >   
> > +/* Enhanced dependency output file.  */
> 
> Maybe "structured", as in the docs?  It isn't really a direct 
> enhancement of the makefile dependencies.

Agreed. I'll also add a link to p1689r5 as a comment for what
"structured" means where it is parsed out.

> > +  if (cpp_opts->deps.format != DEPS_FMT_NONE)
> > +    {
> > +      if (!fdeps_file)
> > +	fdeps_stream = out_stream;
> > +      else if (fdeps_file[0] == '-' && fdeps_file[1] == '\0')
> > +	fdeps_stream = stdout;
> 
> You probably want to check that deps_stream and fdeps_stream don't end 
> up as the same stream.

Hmm. But `stdout` is probably fine to use for both though. Basically:

    if (fdeps_stream == out_stream && fdeps_stream != stdout)
      make_diagnostic_noise ();

> > @@ -1374,6 +1410,8 @@ handle_deferred_opts (void)
> >   
> >   	if (opt->code == OPT_MT || opt->code == OPT_MQ)
> >   	  deps_add_target (deps, opt->arg, opt->code == OPT_MQ);
> > +	else if (opt->code == OPT_fdep_output_)
> > +	  deps_add_output (deps, opt->arg, true);
> 
> How about fdeps_add_target?

Renamed.

> > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> > index ef371ca8c26..630781fdf8a 100644
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -256,6 +256,18 @@ MT
> >   C ObjC C++ ObjC++ Joined Separate MissingArgError(missing makefile target after %qs)
> >   -MT <target>	Add a target that does not require quoting.
> >   
> > +fdep-format=
> > +C ObjC C++ ObjC++ NoDriverArg Joined MissingArgError(missing format after %qs)
> > +Format for output dependency information.  Supported (\"p1689r5\").
> 
> I think we want "structured" here, as well.

Fixed.

> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 06d77983e30..b61c3ebd3ec 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -2791,6 +2791,21 @@ is @option{-fpermitted-flt-eval-methods=c11}.  The default when in a GNU
> >   dialect (@option{-std=gnu11} or similar) is
> >   @option{-fpermitted-flt-eval-methods=ts-18661-3}.
> >   
> > +@item -fdep-file=@var{file}
> > +@opindex fdep-file
> > +Where to write structured dependency information.
> > +
> > +@item -fdep-format=@var{format}
> > +@opindex fdep-format
> > +The format to use for structured dependency information. @samp{p1689r5} is the
> > +only supported format right now.  Note that when this argument is specified, the
> > +output of @samp{-MF} is stripped of some information (namely C++ modules) so
> > +that it does not use extended makefile syntax not understood by most tools.
> > +
> > +@item -fdep-output=@var{file}
> > +@opindex fdep-output
> > +Analogous to @option{-MT} but for structured dependency information.
> 
> Please add more detail about how these are intended to be used.

Will do.

> > diff --git a/gcc/testsuite/g++.dg/modules/p1689-1.C b/gcc/testsuite/g++.dg/modules/p1689-1.C
> > new file mode 100644
> > index 00000000000..245e30d09ce
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/modules/p1689-1.C
> > @@ -0,0 +1,18 @@
> > +// { dg-additional-options -E }
> > +// { dg-additional-options -MT }
> > +// { dg-additional-options p1689-1.json }
> > +// { dg-additional-options -MD }
> > +// { dg-additional-options -fmodules-ts }
> > +// { dg-additional-options -fdep-format=p1689r5 }
> > +// { dg-additional-options -fdep-output=p1689-1.o }
> > +// { dg-additional-options -fdep-file=p1689-1.json }
> 
> Note that with dg-additional-options you can put multiple flags in "" 
> rather than put each on its own line.

Ah, that would indeed help.

> So, I take it this is the common use case you have in mind, generating 
> Make dependencies for the p1689 file?  When are you thinking the Make 
> dependencies for the .o are generated?  At build time?

Yes. If an included file changes, the scanning should be performed
again. The compilation will have its own `-MF` as well (which should
point to the same files plus the CMI files it ends up reading).

> I'm a bit surprised you're using .json rather than an extension that 
> indicates what the information is.

I can change that; the filename doesn't *really* matter (e.g., CMake
uses `.ddi` for "dynamic dependency information").

> > diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
> > index 8df071e1587..aa1c065104a 100644
> > --- a/libcpp/include/cpplib.h
> > +++ b/libcpp/include/cpplib.h
> > @@ -302,6 +302,9 @@ typedef CPPCHAR_SIGNED_T cppchar_signed_t;
> >   /* Style of header dependencies to generate.  */
> >   enum cpp_deps_style { DEPS_NONE = 0, DEPS_USER, DEPS_SYSTEM };
> >   
> > +/* Format of header dependencies to generate.  */
> > +enum cpp_deps_format { DEPS_FMT_NONE = 0, DEPS_FMT_P1689R5 };
> 
>  From the earlier review:
> 
> > > Why not add this to cpp_deps_style?
> >
> > It is orthogonal. The `-M` flags and `-fdeps-*` flags are similar, but
> > `-fdeps-*` dependencies are fundamentally different than `-M`
> > dependencies. The comment does need updated though.
> 
> Makes sense, but "cpp_deps_style" and "cpp_deps_format" don't sound 
> orthogonal, they sound overlapping.  Maybe just call it 
> "cpp_fdeps_format" to match all the other fdeps, and fix the comment; 
> maybe just add "structured" here, too.

Also needs to mention that it is for module dependencies, not header
dependencies.

> > `-M` is about discovered dependencies: those that you find out while
> > doing work. `-fdep-*` is about ordering dependencies: extracting
> > information from file content in order to even order future work around.
> 
> I'm not sure I see the distinction; Makefiles also express ordering 
> dependencies.  In both cases, you want to find out from the files what 
> order you will want to process them in when building the project.

Makefiles can express ordering dependencies, but not the `-M` snippets;
these are for files that, if changed, should trigger a rebuild. This is
fundamentally different than module dependencies which instead indicate
which *compiles* (or CMI generation if using a 2-phase setup) need to
complete before compilation (or CMI generation) of the scanned TU can be
performed. Generally generated headers will be ordered manually in the
build system description. However, maintaining that same level for
in-source dependency information on a per-source level is a *far* higher
burden.

> For modules, there's currently a problem arising from the lack of either 
> on-demand BMI generation (apart from Nathan's P1602 prototype) or 
> fallback to consuming modules as source directly, so indeed it's 
> currently important to support a separate discovery phase.  But that 
> seems like a separate question from the output file format; the 
> discovery phase could emit Makefile dependencies and not the p1689 
> format, if that makes more sense for the individual project.

I don't know how to specify header unit dependencies in Makefile syntax
meaningfully. For example, how does one specify that a header named
`<foo.h>`, found at `/full/path/to/foo.h` is required using this syntax?
Note that one or the other is not suitable as importing `<to/foo.h>` may
find the same file.

Not to mention that there are problems with some paths not being
representable (or needing lots of escaping rules to be considered), and
encoding specifications in Makefiles.

In any case, if something does pop up and *really* prefers Makefile
syntax over JSON, `-fdeps-format=make` could be made. This is all
internal build system stuff, so if anything does start caring, I suspect
it will be `automake` (though one would also need to communicate with
other compilers if that's really wanted…which are all likely to support
P1689 already).

> For some projects, I imagine the discovery commands could also be the 
> same as the build commands, and you just need to keep trying until the 
> build succeeds.  I wonder if there's a way to automatically retry if 
> dependency files got updated...

This is the socket or fifo option for the `-fmodule-mapper=` flag. The
problem is that there is an unbounded number of compilations that can be
suspended while the build tool "searches" for one that claims to supply
a module being requested. If it does not actually exist (or is "hidden"
behind some circular dependency), all available compilation rules must
be attempted before saying "nope, doesn't exist". I don't think it's
really scalable myself.

> > @@ -589,6 +592,9 @@ struct cpp_options
> >       /* Style of header dependencies to generate.  */
> >       enum cpp_deps_style style;
> >   
> > +    /* Format of header dependencies to generate.  */
> > +    enum cpp_deps_format format;
> 
> And this should probably be fdeps_format.

Done

> > diff --git a/libcpp/include/mkdeps.h b/libcpp/include/mkdeps.h
> > index 920e2791334..33c7437a481 100644
> > --- a/libcpp/include/mkdeps.h
> > +++ b/libcpp/include/mkdeps.h
> > @@ -53,20 +53,29 @@ extern void deps_add_default_target (class mkdeps *, const char *);
> >   
> >   /* Adds a module target.  The module name and cmi name are copied.  */
> >   extern void deps_add_module_target (struct mkdeps *, const char *module,
> > -				    const char *cmi, bool is_header);
> > +				    const char *cmi, bool is_header,
> > +				    bool is_exported);
> >   
> >   /* Adds a module dependency.  The module name is copied.  */
> >   extern void deps_add_module_dep (struct mkdeps *, const char *module);
> >   
> > +/* Add an output.  */
> > +extern void deps_add_output (struct mkdeps *, const char *, bool);
> > +
> >   /* Add a dependency (appears on the right side of the colon) to the
> >      deps list.  Dependencies will be printed in the order that they
> >      were entered with this function.  By convention, the first
> >      dependency entered should be the primary source file.  */
> >   extern void deps_add_dep (class mkdeps *, const char *);
> >   
> > -/* Write out a deps buffer to a specified file.  The last argument
> > -   is the number of columns to word-wrap at (0 means don't wrap).  */
> > -extern void deps_write (const cpp_reader *, FILE *, unsigned int);
> > +/* Write out a deps buffer to a specified file.  The third argument
> > +   is the number of columns to word-wrap at (0 means don't wrap).
> > +   The last argument indicates whether to output extra information
> > +   (namely modules).  */
> > +extern void deps_write (const struct cpp_reader *, FILE *, unsigned int);
> 
> Seems like this change isn't needed anymore.

Indeed.

> > +/* Write out a deps buffer to a specified file in P1689R5 format.  */
> > +extern void deps_write_p1689r5 (const struct mkdeps *, FILE *);
> >   
> >   /* Write out a deps buffer to a file, in a form that can be read back
> >      with deps_restore.  Returns nonzero on error, in which case the
> > diff --git a/libcpp/init.cc b/libcpp/init.cc
> > index c508f06112a..d34fd6fdeef 100644
> > --- a/libcpp/init.cc
> > +++ b/libcpp/init.cc
> > @@ -855,7 +855,7 @@ read_original_directory (cpp_reader *pfile)
> >      Maybe it should also reset state, such that you could call
> >      cpp_start_read with a new filename to restart processing.  */
> >   void
> > -cpp_finish (cpp_reader *pfile, FILE *deps_stream)
> > +cpp_finish (struct cpp_reader *pfile, FILE *deps_stream, FILE *fdeps_stream)
> >   {
> >     /* Warn about unused macros before popping the final buffer.  */
> >     if (CPP_OPTION (pfile, warn_unused_macros))
> > @@ -869,8 +869,15 @@ cpp_finish (cpp_reader *pfile, FILE *deps_stream)
> >     while (pfile->buffer)
> >       _cpp_pop_buffer (pfile);
> >   
> > -  if (deps_stream)
> > -    deps_write (pfile, deps_stream, 72);
> > +  cpp_deps_format deps_format = CPP_OPTION (pfile, deps.format);
> > +  if (deps_format == DEPS_FMT_P1689R5 && fdeps_stream)
> > +    deps_write_p1689r5 (pfile->deps, fdeps_stream);
> > +
> > +  if (CPP_OPTION (pfile, deps.style) != DEPS_NONE
> > +      && deps_stream)
> 
> Why does the Make handling need to change?  deps_stream should be null 
> if we aren't emitting Make deps, right?

Yes. This might be leftovers from the first time I did it as a `-M` flag
(before I realized that `-M` as-is is still useful even for the
scanning).

> > diff --git a/libcpp/mkdeps.cc b/libcpp/mkdeps.cc
> > index 8f9585c3c0a..081287c94ba 100644
> > --- a/libcpp/mkdeps.cc
> > +++ b/libcpp/mkdeps.cc
> > @@ -103,6 +107,8 @@ public:
> >   public:
> >     vec<const char *> targets;
> >     vec<const char *> deps;
> > +  const char * primary_output;
> > +  vec<const char *> outputs;
> 
> fdeps_targets, I think

Done

> > @@ -288,6 +295,21 @@ deps_add_default_target (class mkdeps *d, const char *tgt)
> >       }
> >   }
> >   
> > +/* Adds an output O.  We make a copy, so it need not be a permanent
> > +   string.  */
> > +void
> > +deps_add_output (struct mkdeps *d, const char *o, bool is_primary)
> 
> As mentioned above, I think this should be fdeps_add_target.  And please 
> add more commentary about why this wouldn't just be the same as -MT.

Done.

> > @@ -395,10 +418,15 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> >     if (colmax && colmax < 34)
> >       colmax = 34;
> >   
> > +  /* Write out C++ modules information if no other `-fdeps-format=`
> > +   * option is given. */
> 
> Stray *

Done.

> > +  cpp_deps_format deps_format = CPP_OPTION (pfile, deps.format);
> > +  bool write_make_modules_deps = deps_format == DEPS_FMT_NONE;
> 
> Maybe && CPP_OPTION (pfile, deps.modules)?
> 
> > +
> >     if (d->deps.size ())
> >       {
> >         column = make_write_vec (d->targets, fp, 0, colmax, d->quote_lwm);
> > -      if (CPP_OPTION (pfile, deps.modules) && d->cmi_name)
> > +      if (write_make_modules_deps && CPP_OPTION (pfile, deps.modules) && d->cmi_name)
> 
> So you don't need to check it here.

Done.

> > @@ -412,7 +440,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> >     if (!CPP_OPTION (pfile, deps.modules))
> 
> And then this can be if (!write_make_modules_deps) and you don't need to 
> check the flag in the rest of the ifs.

Indeed. Done.

> > @@ -468,11 +496,118 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> >   /* Really we should be opening fp here.  */
> >   
> >   void
> > -deps_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> > +deps_write (const struct cpp_reader *pfile, FILE *fp, unsigned int colmax)
> 
> As with the prototype, this change seems unneeded.

Done.

> > <snip JSON output diff>
> 
> Is there a reason not to use the gcc/json.h interface for JSON output?

This is `libcpp`; is that not a dependency cycle?

--Ben

  parent reply	other threads:[~2023-06-20 19:46 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-25 21:06 [PATCH v5 0/5] P1689R5 support Ben Boeckel
2023-01-25 21:06 ` [PATCH v5 1/5] libcpp: reject codepoints above 0x10FFFF Ben Boeckel
2023-02-13 15:53   ` Jason Merrill
2023-05-12 14:26     ` Ben Boeckel
2023-01-25 21:06 ` [PATCH v5 2/5] libcpp: add a function to determine UTF-8 validity of a C string Ben Boeckel
2023-10-23 15:16   ` David Malcolm
2023-10-23 15:24     ` Jason Merrill
2023-10-23 15:28       ` David Malcolm
2023-01-25 21:06 ` [PATCH v5 3/5] p1689r5: initial support Ben Boeckel
2023-02-14 21:50   ` Jason Merrill
2023-05-12 14:24     ` Ben Boeckel
2023-06-19 21:33       ` Jason Merrill
2023-06-20 16:51         ` Ben Boeckel
2023-06-20 19:46     ` Ben Boeckel [this message]
2023-06-23 18:31       ` Jason Merrill
2023-06-25 17:08         ` Ben Boeckel
2023-01-25 21:06 ` [PATCH v5 4/5] c++modules: report imported CMI files as dependencies Ben Boeckel
2023-02-13 18:33   ` Jason Merrill
2023-05-12 14:26     ` Ben Boeckel
2023-06-22 21:21   ` Jason Merrill
2023-06-23  2:45     ` Ben Boeckel
2023-06-23 12:12       ` Nathan Sidwell
2023-06-25 16:36         ` Ben Boeckel
2023-07-18 20:52           ` Jason Merrill
2023-07-18 21:12             ` Nathan Sidwell
2023-07-19  0:01             ` Ben Boeckel
2023-07-19 21:11               ` Nathan Sidwell
2023-07-20  0:47                 ` Ben Boeckel
2023-07-20 21:00                   ` Nathan Sidwell
2023-07-21 14:57                     ` Ben Boeckel
2023-07-21 20:23                       ` Nathan Sidwell
2023-07-24  0:26                         ` Ben Boeckel
2023-07-28  1:13                           ` Jason Merrill
2023-07-29 14:25                             ` Ben Boeckel
2023-01-25 21:06 ` [PATCH v5 5/5] c++modules: report module mapper files as a dependency Ben Boeckel
2023-06-23 14:44   ` Jason Merrill
2023-06-25 16:42     ` Ben Boeckel
2023-02-02 14:04 ` [PATCH v5 0/5] P1689R5 support Ben Boeckel
2023-02-02 20:24 ` Harald Anlauf
2023-02-03  4:00   ` Ben Boeckel
2023-02-03  4:07 ` Andrew Pinski
2023-02-03  8:58   ` Jonathan Wakely
2023-02-03  9:10     ` Jonathan Wakely
2023-02-03 14:52       ` Ben Boeckel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230620194649.GA186848@farprobe \
    --to=ben.boeckel@kitware.com \
    --cc=brad.king@kitware.com \
    --cc=fortran@gcc.gnu.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=gcc@gcc.gnu.org \
    --cc=jason@redhat.com \
    --cc=nathan@acm.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).