From: Jason Merrill <jason@redhat.com>
To: Ben Boeckel <me@benboeckel.net>, gcc-patches@gcc.gnu.org
Cc: nathan@acm.org, fortran@gcc.gnu.org, gcc@gcc.gnu.org,
brad.king@kitware.com, Ben Boeckel <ben.boeckel@kitware.com>
Subject: Re: [PATCH RESEND 1/1] p1689r5: initial support
Date: Mon, 10 Oct 2022 17:04:09 -0400 [thread overview]
Message-ID: <78b88b1d-b328-a140-3a27-d33a3d96f3b9@redhat.com> (raw)
In-Reply-To: <20221004151200.1275636-2-ben.boeckel@kitware.com>
On 10/4/22 11:12, Ben Boeckel wrote:
> This patch implements support for [P1689R5][] to communicate to a build
> system the C++20 module dependencies to build systems so that they may
> build `.gcm` files in the proper order.
Thanks!
> Support is communicated through the following three new flags:
>
> - `-fdeps-format=` specifies the format for the output. Currently named
> `p1689r5`.
>
> - `-fdeps-file=` specifies the path to the file to write the format to.
Do you expect users to want to emit Makefile (-MM) and P1689
dependencies from the same compilation?
> - `-fdep-output=` specifies the `.o` that will be written for the TU
> that is scanned. This is required so that the build system can
> correlate the dependency output with the actual compilation that will
> occur.
The dependency machinery already needs to be able to figure out the name
of the output file, can't we reuse that instead of specifying it on the
command line?
> CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
> using an experimental feature selection (to allow for future usage
> evolution without committing to how it works today). While it remains
> experimental, docs may be found in CMake's documentation for
> experimental features.
>
> Future work may include using this format for Fortran module
> dependencies as well, however this is still pending work.
>
> [P1689R5]: https://isocpp.org/files/papers/P1689R5.html
> [cmake-experimental]: https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst
>
> TODO:
>
> - header-unit information fields
>
> Header units (including the standard library headers) are 100%
> unsupported right now because the `-E` mechanism wants to import their
> BMIs. A new mode (i.e., something more workable than existing `-E`
> behavior) that mocks up header units as if they were imported purely
> from their path and content would be required.
>
> - non-utf8 paths
>
> The current standard says that paths that are not unambiguously
> represented using UTF-8 are not supported (because these cases are rare
> and the extra complication is not worth it at this time). Future
> versions of the format might have ways of encoding non-UTF-8 paths. For
> now, this patch just doesn't support non-UTF-8 paths (ignoring the
> "unambiguously represetable in UTF-8" case).
>
> - figure out why junk gets placed at the end of the file
>
> Sometimes it seems like the file gets a lot of `NUL` bytes appended to
> it. It happens rarely and seems to be the result of some
> `ftruncate`-style call which results in extra padding in the contents.
> Noting it here as an observation at least.
>
> Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
> ---
> gcc/ChangeLog | 9 ++
> gcc/c-family/ChangeLog | 6 +
> gcc/c-family/c-opts.cc | 40 ++++++-
> gcc/c-family/c.opt | 12 ++
> gcc/cp/ChangeLog | 5 +
> gcc/cp/module.cc | 3 +-
> gcc/doc/invoke.texi | 15 +++
> gcc/fortran/ChangeLog | 5 +
> gcc/fortran/cpp.cc | 4 +-
> gcc/genmatch.cc | 2 +-
> gcc/input.cc | 4 +-
> libcpp/ChangeLog | 11 ++
> libcpp/include/cpplib.h | 12 +-
> libcpp/include/mkdeps.h | 17 ++-
> libcpp/init.cc | 14 ++-
> libcpp/mkdeps.cc | 235 ++++++++++++++++++++++++++++++++++++++--
> 16 files changed, 368 insertions(+), 26 deletions(-)
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 6dded16c0e3..2d61de6adde 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,12 @@
> +2022-09-20 Ben Boeckel <ben.boeckel@kitware.com>
> +
> + * doc/invoke.texi: Document -fdeps-format=, -fdep-file=, and
> + -fdep-output= flags.
> + * genmatch.cc (main): Add new preprocessor parameter used for C++
> + module tracking.
> + * input.cc (test_lexer): Add new preprocessor parameter used for C++
> + module tracking.
> +
> 2022-09-19 Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
>
> * targhooks.cc (default_zero_call_used_regs): Improve sorry
> diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
> index ba3d76dd6cb..569dcd96e8c 100644
> --- a/gcc/c-family/ChangeLog
> +++ b/gcc/c-family/ChangeLog
> @@ -1,3 +1,9 @@
> +2022-09-20 Ben Boeckel <ben.boeckel@kitware.com>
> +
> + * c-opts.cc (c_common_handle_option): Add fdeps_file variable and
> + -fdeps-format=, -fdep-file=, and -fdep-output= parsing.
> + * c.opt: Add -fdeps-format=, -fdep-file=, and -fdep-output= flags.
> +
> 2022-09-15 Richard Biener <rguenther@suse.de>
>
> * c-common.h (build_void_list_node): Remove.
> diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> index babaa2fc157..617d0e93696 100644
> --- a/gcc/c-family/c-opts.cc
> +++ b/gcc/c-family/c-opts.cc
> @@ -77,6 +77,9 @@ static bool verbose;
> /* Dependency output file. */
> static const char *deps_file;
>
> +/* Enhanced dependency output file. */
> +static const char *fdeps_file;
> +
> /* The prefix given by -iprefix, if any. */
> static const char *iprefix;
>
> @@ -360,6 +363,23 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
> deps_file = arg;
> break;
>
> + case OPT_fdep_format_:
> + if (!strcmp (arg, "p1689r5"))
> + cpp_opts->deps.format = DEPS_FMT_P1689R5;
> + else
> + error ("%<-fdep-format=%> unknown format %s", arg);
> + break;
> +
> + case OPT_fdep_file_:
> + deps_seen = true;
> + fdeps_file = arg;
> + break;
> +
> + case OPT_fdep_output_:
> + deps_seen = true;
> + defer_opt (code, arg);
> + break;
> +
> case OPT_MF:
> deps_seen = true;
> deps_file = arg;
> @@ -1279,6 +1299,7 @@ void
> c_common_finish (void)
> {
> FILE *deps_stream = NULL;
> + FILE *fdeps_stream = NULL;
>
> /* Note that we write the dependencies even if there are errors. This is
> useful for handling outdated generated headers that now trigger errors
> @@ -1307,9 +1328,24 @@ c_common_finish (void)
> locations with input_location, which would be incorrect now. */
> override_libcpp_locations = false;
>
> + if (cpp_opts->deps.format != DEPS_FMT_NONE)
> + {
> + if (!fdeps_file)
> + fdeps_stream = out_stream;
> + else if (fdeps_file[0] == '-' && fdeps_file[1] == '\0')
> + fdeps_stream = stdout;
> + else
> + {
> + fdeps_stream = fopen (fdeps_file, "w");
> + if (!fdeps_stream)
> + fatal_error (input_location, "opening dependency file %s: %m",
> + fdeps_file);
> + }
> + }
> +
> /* For performance, avoid tearing down cpplib's internal structures
> with cpp_destroy (). */
> - cpp_finish (parse_in, deps_stream);
> + cpp_finish (parse_in, deps_stream, fdeps_stream);
>
> if (deps_stream && deps_stream != out_stream && deps_stream != stdout
> && (ferror (deps_stream) || fclose (deps_stream)))
> @@ -1381,6 +1417,8 @@ handle_deferred_opts (void)
>
> if (opt->code == OPT_MT || opt->code == OPT_MQ)
> deps_add_target (deps, opt->arg, opt->code == OPT_MQ);
> + else if (opt->code == OPT_fdep_output_)
> + deps_add_output (deps, opt->arg, true);
> }
> }
>
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 1c7f89eeb94..b0b91ef73a3 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -256,6 +256,18 @@ MT
> C ObjC C++ ObjC++ Joined Separate MissingArgError(missing makefile target after %qs)
> -MT <target> Add a target that does not require quoting.
>
> +fdep-format=
> +C ObjC C++ ObjC++ NoDriverArg Joined MissingArgError(missing format after %qs)
> +Format for output dependency information. Supported (\"p1689r5\").
> +
> +fdep-file=
> +C ObjC C++ ObjC++ NoDriverArg Joined MissingArgError(missing output path after %qs)
> +File for output dependency information.
> +
> +fdep-output=
> +C ObjC C++ ObjC++ NoDriverArg Joined MissingArgError(missing path after %qs)
> +-fdep-output=obj.o Output file for the compile step.
> +
> P
> C ObjC C++ ObjC++
> Do not generate #line directives.
> diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
> index dc4ce202c35..eb5fd11bff7 100644
> --- a/gcc/cp/ChangeLog
> +++ b/gcc/cp/ChangeLog
> @@ -1,3 +1,8 @@
> +2022-09-20 Ben Boeckel <ben.boeckel@kitware.com>
> +
> + * module.cc (preprocessed_module): Pass whether the module is
> + exported to dependency tracking.
> +
> 2022-09-17 Patrick Palka <ppalka@redhat.com>
>
> * module.cc (friend_from_decl_list): Don't consider
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> index 9a9ef4e3332..9a2d1da0af1 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -19751,7 +19751,8 @@ preprocessed_module (cpp_reader *reader)
> && (module->is_interface () || module->is_partition ()))
> deps_add_module_target (deps, module->get_flatname (),
> maybe_add_cmi_prefix (module->filename),
> - module->is_header());
> + module->is_header (),
> + module->is_exported ());
> else
> deps_add_module_dep (deps, module->get_flatname ());
> }
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index aa5655764a0..890bb58b056 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -2776,6 +2776,21 @@ is @option{-fpermitted-flt-eval-methods=c11}. The default when in a GNU
> dialect (@option{-std=gnu11} or similar) is
> @option{-fpermitted-flt-eval-methods=ts-18661-3}.
>
> +@item -fdep-file=@var{file}
> +@opindex fdep-file
> +Where to write structured dependency information.
> +
> +@item -fdep-format=@var{format}
> +@opindex fdep-format
> +The format to use for structured dependency information. @samp{p1689r5} is the
> +only supported format right now. Note that when this argument is specified, the
> +output of @samp{-MF} is stripped of some information (namely C++ modules) so
> +that it does not use extended makefile syntax not understood by most tools.
> +
> +@item -fdep-output=@var{file}
> +@opindex fdep-output
> +Analogous to @option{-MT} but for structured dependency information.
> +
> @item -fplan9-extensions
> @opindex fplan9-extensions
> Accept some non-standard constructs used in Plan 9 code.
> diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
> index f5f8ac04286..b0942f876a0 100644
> --- a/gcc/fortran/ChangeLog
> +++ b/gcc/fortran/ChangeLog
> @@ -1,3 +1,8 @@
> +2022-09-20 Ben Boeckel <ben.boeckel@kitware.com>
> +
> + * cpp.cc (gfc_cpp_done): Add new preprocessor parameter used for C++
> + module tracking.
> +
> 2022-09-19 Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>
>
> * libgfortran.h: Declare GFC_FPE_AWAY.
> diff --git a/gcc/fortran/cpp.cc b/gcc/fortran/cpp.cc
> index 364bd0d2a85..0b9df9c02cd 100644
> --- a/gcc/fortran/cpp.cc
> +++ b/gcc/fortran/cpp.cc
> @@ -712,7 +712,7 @@ gfc_cpp_done (void)
> FILE *f = fopen (gfc_cpp_option.deps_filename, "w");
> if (f)
> {
> - cpp_finish (cpp_in, f);
> + cpp_finish (cpp_in, f, NULL);
> fclose (f);
> }
> else
> @@ -721,7 +721,7 @@ gfc_cpp_done (void)
> xstrerror (errno));
> }
> else
> - cpp_finish (cpp_in, stdout);
> + cpp_finish (cpp_in, stdout, NULL);
> }
>
> cpp_undef_all (cpp_in);
> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
> index a0b22c50ae3..b9ddf8ad669 100644
> --- a/gcc/genmatch.cc
> +++ b/gcc/genmatch.cc
> @@ -5246,7 +5246,7 @@ main (int argc, char **argv)
> dt.gen (stdout, gimple);
>
> /* Finalize. */
> - cpp_finish (r, NULL);
> + cpp_finish (r, NULL, NULL);
> cpp_destroy (r);
>
> delete operators;
> diff --git a/gcc/input.cc b/gcc/input.cc
> index 060ca160126..a563f7ca4bc 100644
> --- a/gcc/input.cc
> +++ b/gcc/input.cc
> @@ -2260,7 +2260,7 @@ test_lexer (const line_table_case &case_)
> ASSERT_NE (tok, NULL);
> ASSERT_EQ (tok->type, CPP_EOF);
>
> - cpp_finish (parser, NULL);
> + cpp_finish (parser, NULL, NULL);
> cpp_destroy (parser);
> }
>
> @@ -2291,7 +2291,7 @@ class cpp_reader_ptr
>
> ~cpp_reader_ptr ()
> {
> - cpp_finish (m_ptr, NULL);
> + cpp_finish (m_ptr, NULL, NULL);
> cpp_destroy (m_ptr);
> }
>
> diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog
> index 5984915a49e..52e20886037 100644
> --- a/libcpp/ChangeLog
> +++ b/libcpp/ChangeLog
> @@ -1,3 +1,14 @@
> +2022-09-20 Ben Boeckel <ben.boeckel@kitware.com>
> +
> + * include/cpplib.h: Add cpp_deps_format enum.
> + * include/cpplib.h (cpp_options): Add format field.
> + * include/cpplib.h (cpp_finish): Add dependency stream parameter.
> + * include/mkdeps.h (deps_add_module_target): Add new preprocessor
> + parameter used for C++ module tracking.
> + * init.cc (cpp_finish): Add new preprocessor parameter used for C++
> + module tracking.
> + * mkdeps.cc (mkdeps): Implement P1689R5 output.
> +
> 2022-09-08 Lewis Hyatt <lhyatt@gmail.com>
>
> * line-map.cc (location_adhoc_data_update): Remove reliance on
> diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
> index 2db1e9cbdfb..90787230a9e 100644
> --- a/libcpp/include/cpplib.h
> +++ b/libcpp/include/cpplib.h
> @@ -298,6 +298,9 @@ typedef CPPCHAR_SIGNED_T cppchar_signed_t;
> /* Style of header dependencies to generate. */
> enum cpp_deps_style { DEPS_NONE = 0, DEPS_USER, DEPS_SYSTEM };
>
> +/* Format of header dependencies to generate. */
> +enum cpp_deps_format { DEPS_FMT_NONE = 0, DEPS_FMT_P1689R5 };
Why not add this to cpp_deps_style?
> +
> /* The possible normalization levels, from most restrictive to least. */
> enum cpp_normalize_level {
> /* In NFKC. */
> @@ -581,6 +584,9 @@ struct cpp_options
> /* Style of header dependencies to generate. */
> enum cpp_deps_style style;
>
> + /* Format of header dependencies to generate. */
> + enum cpp_deps_format format;
> +
> /* Assume missing files are generated files. */
> bool missing_files;
>
> @@ -1104,9 +1110,9 @@ extern void cpp_post_options (cpp_reader *);
> extern void cpp_init_iconv (cpp_reader *);
>
> /* Call this to finish preprocessing. If you requested dependency
> - generation, pass an open stream to write the information to,
> - otherwise NULL. It is your responsibility to close the stream. */
> -extern void cpp_finish (cpp_reader *, FILE *deps_stream);
> + generation, pass open stream(s) to write the information to,
> + otherwise NULL. It is your responsibility to close the stream(s). */
> +extern void cpp_finish (cpp_reader *, FILE *deps_stream, FILE *fdeps_stream);
>
> /* Call this to release the handle at the end of preprocessing. Any
> use of the handle after this function returns is invalid. */
> diff --git a/libcpp/include/mkdeps.h b/libcpp/include/mkdeps.h
> index 96d64641b1a..396afa60f25 100644
> --- a/libcpp/include/mkdeps.h
> +++ b/libcpp/include/mkdeps.h
> @@ -53,20 +53,29 @@ extern void deps_add_default_target (class mkdeps *, const char *);
>
> /* Adds a module target. The module name and cmi name are copied. */
> extern void deps_add_module_target (struct mkdeps *, const char *module,
> - const char *cmi, bool is_header);
> + const char *cmi, bool is_header,
> + bool is_exported);
>
> /* Adds a module dependency. The module name is copied. */
> extern void deps_add_module_dep (struct mkdeps *, const char *module);
>
> +/* Add an output. */
> +extern void deps_add_output (struct mkdeps *, const char *, bool);
> +
> /* Add a dependency (appears on the right side of the colon) to the
> deps list. Dependencies will be printed in the order that they
> were entered with this function. By convention, the first
> dependency entered should be the primary source file. */
> extern void deps_add_dep (class mkdeps *, const char *);
>
> -/* Write out a deps buffer to a specified file. The last argument
> - is the number of columns to word-wrap at (0 means don't wrap). */
> -extern void deps_write (const cpp_reader *, FILE *, unsigned int);
> +/* Write out a deps buffer to a specified file. The third argument
> + is the number of columns to word-wrap at (0 means don't wrap).
> + The last argument indicates whether to output extra information
> + (namely modules). */
> +extern void deps_write (const struct cpp_reader *, FILE *, unsigned int, int);
> +
> +/* Write out a deps buffer to a specified file in P1689R5 format. */
> +extern void deps_write_p1689r5 (const struct mkdeps *, FILE *);
>
> /* Write out a deps buffer to a file, in a form that can be read back
> with deps_restore. Returns nonzero on error, in which case the
> diff --git a/libcpp/init.cc b/libcpp/init.cc
> index d3b4f00994b..d575afe23c7 100644
> --- a/libcpp/init.cc
> +++ b/libcpp/init.cc
> @@ -853,7 +853,7 @@ read_original_directory (cpp_reader *pfile)
> Maybe it should also reset state, such that you could call
> cpp_start_read with a new filename to restart processing. */
> void
> -cpp_finish (cpp_reader *pfile, FILE *deps_stream)
> +cpp_finish (struct cpp_reader *pfile, FILE *deps_stream, FILE *fdeps_stream)
> {
> /* Warn about unused macros before popping the final buffer. */
> if (CPP_OPTION (pfile, warn_unused_macros))
> @@ -867,8 +867,16 @@ cpp_finish (cpp_reader *pfile, FILE *deps_stream)
> while (pfile->buffer)
> _cpp_pop_buffer (pfile);
>
> - if (deps_stream)
> - deps_write (pfile, deps_stream, 72);
> + cpp_deps_format deps_format = CPP_OPTION (pfile, deps.format);
> + if (deps_format == DEPS_FMT_P1689R5 && fdeps_stream)
> + deps_write_p1689r5 (pfile->deps, fdeps_stream);
> +
> + if (CPP_OPTION (pfile, deps.style) != DEPS_NONE
> + && deps_stream)
> + {
> + deps_write (pfile, deps_stream,
> + 72, deps_format == DEPS_FMT_NONE);
> + }
>
> /* Report on headers that could use multiple include guards. */
> if (CPP_OPTION (pfile, print_include_names))
> diff --git a/libcpp/mkdeps.cc b/libcpp/mkdeps.cc
> index 30e87d8b4d7..e1253202d9f 100644
> --- a/libcpp/mkdeps.cc
> +++ b/libcpp/mkdeps.cc
> @@ -81,7 +81,8 @@ public:
> };
>
> mkdeps ()
> - : module_name (NULL), cmi_name (NULL), is_header_unit (false), quote_lwm (0)
> + : primary_output (NULL), module_name (NULL), cmi_name (NULL)
> + , is_header_unit (false), is_exported (false), quote_lwm (0)
> {
> }
> ~mkdeps ()
> @@ -90,6 +91,9 @@ public:
>
> for (i = targets.size (); i--;)
> free (const_cast <char *> (targets[i]));
> + free (const_cast <char *> (primary_output));
> + for (i = outputs.size (); i--;)
> + free (const_cast <char *> (outputs[i]));
> for (i = deps.size (); i--;)
> free (const_cast <char *> (deps[i]));
> for (i = vpath.size (); i--;)
> @@ -103,6 +107,8 @@ public:
> public:
> vec<const char *> targets;
> vec<const char *> deps;
> + const char * primary_output;
> + vec<const char *> outputs;
> vec<velt> vpath;
> vec<const char *> modules;
>
> @@ -110,6 +116,7 @@ public:
> const char *module_name;
> const char *cmi_name;
> bool is_header_unit;
> + bool is_exported;
> unsigned short quote_lwm;
> };
>
> @@ -288,6 +295,21 @@ deps_add_default_target (class mkdeps *d, const char *tgt)
> }
> }
>
> +/* Adds an output O. We make a copy, so it need not be a permanent
> + string. */
> +void
> +deps_add_output (struct mkdeps *d, const char *o, bool is_primary)
> +{
> + o = apply_vpath (d, o);
> + if (is_primary)
> + {
> + if (d->primary_output)
> + d->outputs.push (d->primary_output);
> + d->primary_output = xstrdup (o);
> + } else
> + d->outputs.push (xstrdup (o));
> +}
> +
> void
> deps_add_dep (class mkdeps *d, const char *t)
> {
> @@ -325,12 +347,13 @@ deps_add_vpath (class mkdeps *d, const char *vpath)
>
> void
> deps_add_module_target (struct mkdeps *d, const char *m,
> - const char *cmi, bool is_header_unit)
> + const char *cmi, bool is_header_unit, bool is_exported)
> {
> gcc_assert (!d->module_name);
>
> d->module_name = xstrdup (m);
> d->is_header_unit = is_header_unit;
> + d->is_exported = is_exported;
> d->cmi_name = xstrdup (cmi);
> }
>
> @@ -387,7 +410,7 @@ make_write_vec (const mkdeps::vec<const char *> &vec, FILE *fp,
> .PHONY targets for all the dependencies too. */
>
> static void
> -make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> +make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax, int extra)
Instead of adding the "extra" parameter...
> {
> const mkdeps *d = pfile->deps;
>
> @@ -398,7 +421,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> if (d->deps.size ())
> {
> column = make_write_vec (d->targets, fp, 0, colmax, d->quote_lwm);
> - if (CPP_OPTION (pfile, deps.modules) && d->cmi_name)
> + if (extra && CPP_OPTION (pfile, deps.modules) && d->cmi_name)
> column = make_write_name (d->cmi_name, fp, column, colmax);
> fputs (":", fp);
> column++;
> @@ -412,7 +435,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> if (!CPP_OPTION (pfile, deps.modules))
> return;
...how about checking CPP_OPTION for p1689r5 mode here?
>
> - if (d->modules.size ())
> + if (extra && d->modules.size ())
> {
> column = make_write_vec (d->targets, fp, 0, colmax, d->quote_lwm);
> if (d->cmi_name)
> @@ -423,7 +446,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> fputs ("\n", fp);
> }
>
> - if (d->module_name)
> + if (extra && d->module_name)
> {
> if (d->cmi_name)
> {
> @@ -455,7 +478,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> }
> }
>
> - if (d->modules.size ())
> + if (extra && d->modules.size ())
> {
> column = fprintf (fp, "CXX_IMPORTS +=");
> make_write_vec (d->modules, fp, column, colmax, 0, ".c++m");
> @@ -468,9 +491,203 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> /* Really we should be opening fp here. */
>
> void
> -deps_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> +deps_write (const struct cpp_reader *pfile, FILE *fp, unsigned int colmax,
> + int extra)
> +{
> + make_write (pfile, fp, colmax, extra);
> +}
> +
> +static bool
> +is_utf8 (const char *name)
Can we share utf8 parsing code with decode_utf8_char in pretty-print.cc?
> +{
> + int byte_length = 0;
> + int expected_continuations = 0;
> + uint32_t codepoint = 0;
> +
> + for (const char* c = name; *c; c++)
> + {
> + int byte = *c;
> +
> + // Check for a continuation character.
> + if ((byte & 0xc0) == 0x80) // 10xx_xxxx
> + {
> + --expected_continuations;
> +
> + if (expected_continuations < 0)
> + return false;
> +
> + // Add the encoded bits to the codepoint.
> + codepoint = (codepoint << 6) + (byte & ~0x3f);
> +
> + continue;
> + }
> +
> + // Make sure we don't expect more continuation characters
> + // once we don't find one.
> + if (expected_continuations != 0)
> + return false;
> +
> + // Invalid because if they appear, it is an attempt to encode an ASCII
> + // character in 2 bytes.
> + if (byte == 0xc0 || byte == 0xc1) // 1100_0000 || 1100_0001
> + return false;
> +
> + // Invalid because they would encode codepoints greater than allowed
> + // (0x10FFFF).
> + if (byte > 0xf4)
> + return false;
> +
> + // ED-prefixed sequence encoding UTF-16 surrogate halves.
> + if (0xD0FF <= codepoint && codepoint <= 0xDFFF)
> + return false;
> +
> + // 0x10FFFF is the highest valid codepoint for UTF-8.
> + if (0x10FFFF < codepoint)
> + return false;
> +
> + // Overlong encoding of a codepoint.
> + if (byte_length == 2 && codepoint < 0x0800)
> + return false;
> + if (byte_length == 3 && codepoint < 0x10000)
> + return false;
> +
> + // Single byte character.
> + if ((byte & 0x80) == 0x00) // 0xxx_xxxx
> + {
> + codepoint = byte;
> + byte_length = expected_continuations = 0;
> + }
> + // Prefix codes.
> + else if ((byte & 0xe0) == 0xc0) // 110x_xxxx
> + {
> + codepoint = byte & 0x1f;
> + byte_length = expected_continuations = 1;
> + }
> + else if ((byte & 0xf0) == 0xe0) // 1110_xxxx
> + {
> + codepoint = byte & 0x0f;
> + byte_length = expected_continuations = 2;
> + }
> + else if ((byte & 0xf8) == 0xf0) // 1111_0xxx
> + {
> + codepoint = byte & 0x07;
> + byte_length = expected_continuations = 3;
> + }
> + else
> + return false;
> + }
> +
> + if (expected_continuations != 0)
> + return false;
> +
> + return true;
> +}
> +
> +static void
> +p1689r5_write_filepath (const char *name, FILE *fp)
> {
> - make_write (pfile, fp, colmax);
> + if (is_utf8 (name))
> + {
> + fputc ('"', fp);
> + for (const char* c = name; *c; c++)
> + {
> + // Escape control characters.
> + if (ISCNTRL (*c))
> + fprintf (fp, "\\u%04x", *c);
> + // JSON escape characters.
> + else if (*c == '"' || *c == '\\')
> + {
> + fputc ('\\', fp);
> + fputc (*c, fp);
> + }
> + // Everything else.
> + else
> + fputc (*c, fp);
> + }
> + fputc ('"', fp);
> + }
> + else
> + {
> + // TODO: print an error
> + }
> +}
> +
> +static void
> +p1689r5_write_vec (const mkdeps::vec<const char *> &vec, FILE *fp)
> +{
> + for (unsigned ix = 0; ix != vec.size (); ix++)
> + {
> + p1689r5_write_filepath (vec[ix], fp);
> + if (ix < vec.size () - 1)
> + fputc (',', fp);
> + fputc ('\n', fp);
> + }
> +}
> +
> +void
> +deps_write_p1689r5 (const struct mkdeps *d, FILE *fp)
> +{
> + fputs ("{\n", fp);
> +
> + fputs ("\"rules\": [\n", fp);
> + fputs ("{\n", fp);
> +
> + if (d->primary_output)
> + {
> + fputs ("\"primary-output\": ", fp);
> + p1689r5_write_filepath (d->primary_output, fp);
> + fputs (",\n", fp);
> + }
> +
> + if (d->outputs.size ())
> + {
> + fputs ("\"outputs\": [\n", fp);
> + p1689r5_write_vec (d->outputs, fp);
> + fputs ("],\n", fp);
> + }
> +
> + if (d->module_name)
> + {
> + fputs ("\"provides\": [\n", fp);
> + fputs ("{\n", fp);
> +
> + fputs ("\"logical-name\": ", fp);
> + p1689r5_write_filepath (d->module_name, fp);
> + fputs (",\n", fp);
> +
> + fprintf (fp, "\"is-interface\": %s\n", d->is_exported ? "true" : "false");
> +
> + // TODO: header-unit information
> +
> + fputs ("}\n", fp);
> + fputs ("],\n", fp);
> + }
> +
> + fputs ("\"requires\": [\n", fp);
> + for (size_t i = 0; i < d->modules.size (); i++)
> + {
> + if (i != 0)
> + fputs (",\n", fp);
> + fputs ("{\n", fp);
> +
> + fputs ("\"logical-name\": ", fp);
> + p1689r5_write_filepath (d->modules[i], fp);
> + fputs ("\n", fp);
> +
> + // TODO: header-unit information
> +
> + fputs ("}\n", fp);
> + }
> + fputs ("]\n", fp);
> +
> + fputs ("}\n", fp);
> +
> + fputs ("],\n", fp);
> +
> + fputs ("\"version\": 0,\n", fp);
> + fputs ("\"revision\": 0\n", fp);
> +
> + fputs ("}\n", fp);
> }
>
> /* Write out a deps buffer to a file, in a form that can be read back
next prev parent reply other threads:[~2022-10-10 21:04 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-04 15:11 [PATCH RESEND 0/1] RFC: P1689R5 support Ben Boeckel
2022-10-04 15:12 ` [PATCH RESEND 1/1] p1689r5: initial support Ben Boeckel
2022-10-04 19:12 ` Harald Anlauf
2022-10-11 11:30 ` Ben Boeckel
2022-10-10 21:04 ` Jason Merrill [this message]
2022-10-11 11:42 ` Ben Boeckel
2022-10-18 12:18 ` Ben Boeckel
2022-10-20 15:39 ` Jason Merrill
2022-10-20 17:31 ` Ben Boeckel
2022-10-20 18:22 ` Jason Merrill
2022-10-10 20:21 ` [PATCH RESEND 0/1] RFC: P1689R5 support Jason Merrill
2022-10-13 17:08 ` David Malcolm
2022-10-18 12:22 ` Ben Boeckel
2022-10-19 7:21 ` Martin Liška
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=78b88b1d-b328-a140-3a27-d33a3d96f3b9@redhat.com \
--to=jason@redhat.com \
--cc=ben.boeckel@kitware.com \
--cc=brad.king@kitware.com \
--cc=fortran@gcc.gnu.org \
--cc=gcc-patches@gcc.gnu.org \
--cc=gcc@gcc.gnu.org \
--cc=me@benboeckel.net \
--cc=nathan@acm.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).