public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jason Merrill <jason@redhat.com>
To: Ben Boeckel <me@benboeckel.net>, gcc-patches@gcc.gnu.org
Cc: nathan@acm.org, fortran@gcc.gnu.org, gcc@gcc.gnu.org,
	brad.king@kitware.com, Ben Boeckel <ben.boeckel@kitware.com>
Subject: Re: [PATCH RESEND 1/1] p1689r5: initial support
Date: Mon, 10 Oct 2022 17:04:09 -0400	[thread overview]
Message-ID: <78b88b1d-b328-a140-3a27-d33a3d96f3b9@redhat.com> (raw)
In-Reply-To: <20221004151200.1275636-2-ben.boeckel@kitware.com>

On 10/4/22 11:12, Ben Boeckel wrote:
> This patch implements support for [P1689R5][] to communicate to a build
> system the C++20 module dependencies to build systems so that they may
> build `.gcm` files in the proper order.

Thanks!

> Support is communicated through the following three new flags:
> 
> - `-fdeps-format=` specifies the format for the output. Currently named
>    `p1689r5`.
> 
> - `-fdeps-file=` specifies the path to the file to write the format to.

Do you expect users to want to emit Makefile (-MM) and P1689 
dependencies from the same compilation?

> - `-fdep-output=` specifies the `.o` that will be written for the TU
>    that is scanned. This is required so that the build system can
>    correlate the dependency output with the actual compilation that will
>    occur.

The dependency machinery already needs to be able to figure out the name 
of the output file, can't we reuse that instead of specifying it on the 
command line?

> CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
> using an experimental feature selection (to allow for future usage
> evolution without committing to how it works today). While it remains
> experimental, docs may be found in CMake's documentation for
> experimental features.
> 
> Future work may include using this format for Fortran module
> dependencies as well, however this is still pending work.
> 
> [P1689R5]: https://isocpp.org/files/papers/P1689R5.html
> [cmake-experimental]: https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst
> 
> TODO:
> 
> - header-unit information fields
> 
> Header units (including the standard library headers) are 100%
> unsupported right now because the `-E` mechanism wants to import their
> BMIs. A new mode (i.e., something more workable than existing `-E`
> behavior) that mocks up header units as if they were imported purely
> from their path and content would be required.
> 
> - non-utf8 paths
> 
> The current standard says that paths that are not unambiguously
> represented using UTF-8 are not supported (because these cases are rare
> and the extra complication is not worth it at this time). Future
> versions of the format might have ways of encoding non-UTF-8 paths. For
> now, this patch just doesn't support non-UTF-8 paths (ignoring the
> "unambiguously represetable in UTF-8" case).
> 
> - figure out why junk gets placed at the end of the file
> 
> Sometimes it seems like the file gets a lot of `NUL` bytes appended to
> it. It happens rarely and seems to be the result of some
> `ftruncate`-style call which results in extra padding in the contents.
> Noting it here as an observation at least.
> 
> Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
> ---
>   gcc/ChangeLog           |   9 ++
>   gcc/c-family/ChangeLog  |   6 +
>   gcc/c-family/c-opts.cc  |  40 ++++++-
>   gcc/c-family/c.opt      |  12 ++
>   gcc/cp/ChangeLog        |   5 +
>   gcc/cp/module.cc        |   3 +-
>   gcc/doc/invoke.texi     |  15 +++
>   gcc/fortran/ChangeLog   |   5 +
>   gcc/fortran/cpp.cc      |   4 +-
>   gcc/genmatch.cc         |   2 +-
>   gcc/input.cc            |   4 +-
>   libcpp/ChangeLog        |  11 ++
>   libcpp/include/cpplib.h |  12 +-
>   libcpp/include/mkdeps.h |  17 ++-
>   libcpp/init.cc          |  14 ++-
>   libcpp/mkdeps.cc        | 235 ++++++++++++++++++++++++++++++++++++++--
>   16 files changed, 368 insertions(+), 26 deletions(-)
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 6dded16c0e3..2d61de6adde 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,12 @@
> +2022-09-20  Ben Boeckel  <ben.boeckel@kitware.com>
> +
> +	* doc/invoke.texi: Document -fdeps-format=, -fdep-file=, and
> +	-fdep-output= flags.
> +	* genmatch.cc (main): Add new preprocessor parameter used for C++
> +	module tracking.
> +	* input.cc (test_lexer): Add new preprocessor parameter used for C++
> +	module tracking.
> +
>   2022-09-19  Torbjörn SVENSSON  <torbjorn.svensson@foss.st.com>
>   
>   	* targhooks.cc (default_zero_call_used_regs): Improve sorry
> diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
> index ba3d76dd6cb..569dcd96e8c 100644
> --- a/gcc/c-family/ChangeLog
> +++ b/gcc/c-family/ChangeLog
> @@ -1,3 +1,9 @@
> +2022-09-20  Ben Boeckel  <ben.boeckel@kitware.com>
> +
> +	* c-opts.cc (c_common_handle_option): Add fdeps_file variable and
> +	-fdeps-format=, -fdep-file=, and -fdep-output= parsing.
> +	* c.opt: Add -fdeps-format=, -fdep-file=, and -fdep-output= flags.
> +
>   2022-09-15  Richard Biener  <rguenther@suse.de>
>   
>   	* c-common.h (build_void_list_node): Remove.
> diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> index babaa2fc157..617d0e93696 100644
> --- a/gcc/c-family/c-opts.cc
> +++ b/gcc/c-family/c-opts.cc
> @@ -77,6 +77,9 @@ static bool verbose;
>   /* Dependency output file.  */
>   static const char *deps_file;
>   
> +/* Enhanced dependency output file.  */
> +static const char *fdeps_file;
> +
>   /* The prefix given by -iprefix, if any.  */
>   static const char *iprefix;
>   
> @@ -360,6 +363,23 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
>         deps_file = arg;
>         break;
>   
> +    case OPT_fdep_format_:
> +      if (!strcmp (arg, "p1689r5"))
> +	cpp_opts->deps.format = DEPS_FMT_P1689R5;
> +      else
> +	error ("%<-fdep-format=%> unknown format %s", arg);
> +      break;
> +
> +    case OPT_fdep_file_:
> +      deps_seen = true;
> +      fdeps_file = arg;
> +      break;
> +
> +    case OPT_fdep_output_:
> +      deps_seen = true;
> +      defer_opt (code, arg);
> +      break;
> +
>       case OPT_MF:
>         deps_seen = true;
>         deps_file = arg;
> @@ -1279,6 +1299,7 @@ void
>   c_common_finish (void)
>   {
>     FILE *deps_stream = NULL;
> +  FILE *fdeps_stream = NULL;
>   
>     /* Note that we write the dependencies even if there are errors. This is
>        useful for handling outdated generated headers that now trigger errors
> @@ -1307,9 +1328,24 @@ c_common_finish (void)
>        locations with input_location, which would be incorrect now.  */
>     override_libcpp_locations = false;
>   
> +  if (cpp_opts->deps.format != DEPS_FMT_NONE)
> +    {
> +      if (!fdeps_file)
> +	fdeps_stream = out_stream;
> +      else if (fdeps_file[0] == '-' && fdeps_file[1] == '\0')
> +	fdeps_stream = stdout;
> +      else
> +	{
> +	  fdeps_stream = fopen (fdeps_file, "w");
> +	  if (!fdeps_stream)
> +	    fatal_error (input_location, "opening dependency file %s: %m",
> +			 fdeps_file);
> +	}
> +    }
> +
>     /* For performance, avoid tearing down cpplib's internal structures
>        with cpp_destroy ().  */
> -  cpp_finish (parse_in, deps_stream);
> +  cpp_finish (parse_in, deps_stream, fdeps_stream);
>   
>     if (deps_stream && deps_stream != out_stream && deps_stream != stdout
>         && (ferror (deps_stream) || fclose (deps_stream)))
> @@ -1381,6 +1417,8 @@ handle_deferred_opts (void)
>   
>   	if (opt->code == OPT_MT || opt->code == OPT_MQ)
>   	  deps_add_target (deps, opt->arg, opt->code == OPT_MQ);
> +	else if (opt->code == OPT_fdep_output_)
> +	  deps_add_output (deps, opt->arg, true);
>         }
>   }
>   
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 1c7f89eeb94..b0b91ef73a3 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -256,6 +256,18 @@ MT
>   C ObjC C++ ObjC++ Joined Separate MissingArgError(missing makefile target after %qs)
>   -MT <target>	Add a target that does not require quoting.
>   
> +fdep-format=
> +C ObjC C++ ObjC++ NoDriverArg Joined MissingArgError(missing format after %qs)
> +Format for output dependency information.  Supported (\"p1689r5\").
> +
> +fdep-file=
> +C ObjC C++ ObjC++ NoDriverArg Joined MissingArgError(missing output path after %qs)
> +File for output dependency information.
> +
> +fdep-output=
> +C ObjC C++ ObjC++ NoDriverArg Joined MissingArgError(missing path after %qs)
> +-fdep-output=obj.o Output file for the compile step.
> +
>   P
>   C ObjC C++ ObjC++
>   Do not generate #line directives.
> diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
> index dc4ce202c35..eb5fd11bff7 100644
> --- a/gcc/cp/ChangeLog
> +++ b/gcc/cp/ChangeLog
> @@ -1,3 +1,8 @@
> +2022-09-20  Ben Boeckel  <ben.boeckel@kitware.com>
> +
> +	* module.cc (preprocessed_module): Pass whether the module is
> +	exported to dependency tracking.
> +
>   2022-09-17  Patrick Palka  <ppalka@redhat.com>
>   
>   	* module.cc (friend_from_decl_list): Don't consider
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> index 9a9ef4e3332..9a2d1da0af1 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -19751,7 +19751,8 @@ preprocessed_module (cpp_reader *reader)
>   		  && (module->is_interface () || module->is_partition ()))
>   		deps_add_module_target (deps, module->get_flatname (),
>   					maybe_add_cmi_prefix (module->filename),
> -					module->is_header());
> +					module->is_header (),
> +					module->is_exported ());
>   	      else
>   		deps_add_module_dep (deps, module->get_flatname ());
>   	    }
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index aa5655764a0..890bb58b056 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -2776,6 +2776,21 @@ is @option{-fpermitted-flt-eval-methods=c11}.  The default when in a GNU
>   dialect (@option{-std=gnu11} or similar) is
>   @option{-fpermitted-flt-eval-methods=ts-18661-3}.
>   
> +@item -fdep-file=@var{file}
> +@opindex fdep-file
> +Where to write structured dependency information.
> +
> +@item -fdep-format=@var{format}
> +@opindex fdep-format
> +The format to use for structured dependency information. @samp{p1689r5} is the
> +only supported format right now.  Note that when this argument is specified, the
> +output of @samp{-MF} is stripped of some information (namely C++ modules) so
> +that it does not use extended makefile syntax not understood by most tools.
> +
> +@item -fdep-output=@var{file}
> +@opindex fdep-output
> +Analogous to @option{-MT} but for structured dependency information.
> +
>   @item -fplan9-extensions
>   @opindex fplan9-extensions
>   Accept some non-standard constructs used in Plan 9 code.
> diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
> index f5f8ac04286..b0942f876a0 100644
> --- a/gcc/fortran/ChangeLog
> +++ b/gcc/fortran/ChangeLog
> @@ -1,3 +1,8 @@
> +2022-09-20  Ben Boeckel  <ben.boeckel@kitware.com>
> +
> +	* cpp.cc (gfc_cpp_done): Add new preprocessor parameter used for C++
> +	module tracking.
> +
>   2022-09-19  Francois-Xavier Coudert  <fxcoudert@gcc.gnu.org>
>   
>   	* libgfortran.h: Declare GFC_FPE_AWAY.
> diff --git a/gcc/fortran/cpp.cc b/gcc/fortran/cpp.cc
> index 364bd0d2a85..0b9df9c02cd 100644
> --- a/gcc/fortran/cpp.cc
> +++ b/gcc/fortran/cpp.cc
> @@ -712,7 +712,7 @@ gfc_cpp_done (void)
>   	  FILE *f = fopen (gfc_cpp_option.deps_filename, "w");
>   	  if (f)
>   	    {
> -	      cpp_finish (cpp_in, f);
> +	      cpp_finish (cpp_in, f, NULL);
>   	      fclose (f);
>   	    }
>   	  else
> @@ -721,7 +721,7 @@ gfc_cpp_done (void)
>   			     xstrerror (errno));
>   	}
>         else
> -	cpp_finish (cpp_in, stdout);
> +	cpp_finish (cpp_in, stdout, NULL);
>       }
>   
>     cpp_undef_all (cpp_in);
> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
> index a0b22c50ae3..b9ddf8ad669 100644
> --- a/gcc/genmatch.cc
> +++ b/gcc/genmatch.cc
> @@ -5246,7 +5246,7 @@ main (int argc, char **argv)
>     dt.gen (stdout, gimple);
>   
>     /* Finalize.  */
> -  cpp_finish (r, NULL);
> +  cpp_finish (r, NULL, NULL);
>     cpp_destroy (r);
>   
>     delete operators;
> diff --git a/gcc/input.cc b/gcc/input.cc
> index 060ca160126..a563f7ca4bc 100644
> --- a/gcc/input.cc
> +++ b/gcc/input.cc
> @@ -2260,7 +2260,7 @@ test_lexer (const line_table_case &case_)
>     ASSERT_NE (tok, NULL);
>     ASSERT_EQ (tok->type, CPP_EOF);
>   
> -  cpp_finish (parser, NULL);
> +  cpp_finish (parser, NULL, NULL);
>     cpp_destroy (parser);
>   }
>   
> @@ -2291,7 +2291,7 @@ class cpp_reader_ptr
>   
>     ~cpp_reader_ptr ()
>     {
> -    cpp_finish (m_ptr, NULL);
> +    cpp_finish (m_ptr, NULL, NULL);
>       cpp_destroy (m_ptr);
>     }
>   
> diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog
> index 5984915a49e..52e20886037 100644
> --- a/libcpp/ChangeLog
> +++ b/libcpp/ChangeLog
> @@ -1,3 +1,14 @@
> +2022-09-20  Ben Boeckel  <ben.boeckel@kitware.com>
> +
> +	* include/cpplib.h: Add cpp_deps_format enum.
> +	* include/cpplib.h (cpp_options): Add format field.
> +	* include/cpplib.h (cpp_finish): Add dependency stream parameter.
> +	* include/mkdeps.h (deps_add_module_target): Add new preprocessor
> +	parameter used for C++ module tracking.
> +	* init.cc (cpp_finish): Add new preprocessor parameter used for C++
> +	module tracking.
> +	* mkdeps.cc (mkdeps): Implement P1689R5 output.
> +
>   2022-09-08  Lewis Hyatt  <lhyatt@gmail.com>
>   
>   	* line-map.cc (location_adhoc_data_update): Remove reliance on
> diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
> index 2db1e9cbdfb..90787230a9e 100644
> --- a/libcpp/include/cpplib.h
> +++ b/libcpp/include/cpplib.h
> @@ -298,6 +298,9 @@ typedef CPPCHAR_SIGNED_T cppchar_signed_t;
>   /* Style of header dependencies to generate.  */
>   enum cpp_deps_style { DEPS_NONE = 0, DEPS_USER, DEPS_SYSTEM };
>   
> +/* Format of header dependencies to generate.  */
> +enum cpp_deps_format { DEPS_FMT_NONE = 0, DEPS_FMT_P1689R5 };

Why not add this to cpp_deps_style?

> +
>   /* The possible normalization levels, from most restrictive to least.  */
>   enum cpp_normalize_level {
>     /* In NFKC.  */
> @@ -581,6 +584,9 @@ struct cpp_options
>       /* Style of header dependencies to generate.  */
>       enum cpp_deps_style style;
>   
> +    /* Format of header dependencies to generate.  */
> +    enum cpp_deps_format format;
> +
>       /* Assume missing files are generated files.  */
>       bool missing_files;
>   
> @@ -1104,9 +1110,9 @@ extern void cpp_post_options (cpp_reader *);
>   extern void cpp_init_iconv (cpp_reader *);
>   
>   /* Call this to finish preprocessing.  If you requested dependency
> -   generation, pass an open stream to write the information to,
> -   otherwise NULL.  It is your responsibility to close the stream.  */
> -extern void cpp_finish (cpp_reader *, FILE *deps_stream);
> +   generation, pass open stream(s) to write the information to,
> +   otherwise NULL.  It is your responsibility to close the stream(s).  */
> +extern void cpp_finish (cpp_reader *, FILE *deps_stream, FILE *fdeps_stream);
>   
>   /* Call this to release the handle at the end of preprocessing.  Any
>      use of the handle after this function returns is invalid.  */
> diff --git a/libcpp/include/mkdeps.h b/libcpp/include/mkdeps.h
> index 96d64641b1a..396afa60f25 100644
> --- a/libcpp/include/mkdeps.h
> +++ b/libcpp/include/mkdeps.h
> @@ -53,20 +53,29 @@ extern void deps_add_default_target (class mkdeps *, const char *);
>   
>   /* Adds a module target.  The module name and cmi name are copied.  */
>   extern void deps_add_module_target (struct mkdeps *, const char *module,
> -				    const char *cmi, bool is_header);
> +				    const char *cmi, bool is_header,
> +				    bool is_exported);
>   
>   /* Adds a module dependency.  The module name is copied.  */
>   extern void deps_add_module_dep (struct mkdeps *, const char *module);
>   
> +/* Add an output.  */
> +extern void deps_add_output (struct mkdeps *, const char *, bool);
> +
>   /* Add a dependency (appears on the right side of the colon) to the
>      deps list.  Dependencies will be printed in the order that they
>      were entered with this function.  By convention, the first
>      dependency entered should be the primary source file.  */
>   extern void deps_add_dep (class mkdeps *, const char *);
>   
> -/* Write out a deps buffer to a specified file.  The last argument
> -   is the number of columns to word-wrap at (0 means don't wrap).  */
> -extern void deps_write (const cpp_reader *, FILE *, unsigned int);
> +/* Write out a deps buffer to a specified file.  The third argument
> +   is the number of columns to word-wrap at (0 means don't wrap).
> +   The last argument indicates whether to output extra information
> +   (namely modules).  */
> +extern void deps_write (const struct cpp_reader *, FILE *, unsigned int, int);
> +
> +/* Write out a deps buffer to a specified file in P1689R5 format.  */
> +extern void deps_write_p1689r5 (const struct mkdeps *, FILE *);
>   
>   /* Write out a deps buffer to a file, in a form that can be read back
>      with deps_restore.  Returns nonzero on error, in which case the
> diff --git a/libcpp/init.cc b/libcpp/init.cc
> index d3b4f00994b..d575afe23c7 100644
> --- a/libcpp/init.cc
> +++ b/libcpp/init.cc
> @@ -853,7 +853,7 @@ read_original_directory (cpp_reader *pfile)
>      Maybe it should also reset state, such that you could call
>      cpp_start_read with a new filename to restart processing.  */
>   void
> -cpp_finish (cpp_reader *pfile, FILE *deps_stream)
> +cpp_finish (struct cpp_reader *pfile, FILE *deps_stream, FILE *fdeps_stream)
>   {
>     /* Warn about unused macros before popping the final buffer.  */
>     if (CPP_OPTION (pfile, warn_unused_macros))
> @@ -867,8 +867,16 @@ cpp_finish (cpp_reader *pfile, FILE *deps_stream)
>     while (pfile->buffer)
>       _cpp_pop_buffer (pfile);
>   
> -  if (deps_stream)
> -    deps_write (pfile, deps_stream, 72);
> +  cpp_deps_format deps_format = CPP_OPTION (pfile, deps.format);
> +  if (deps_format == DEPS_FMT_P1689R5 && fdeps_stream)
> +    deps_write_p1689r5 (pfile->deps, fdeps_stream);
> +
> +  if (CPP_OPTION (pfile, deps.style) != DEPS_NONE
> +      && deps_stream)
> +    {
> +      deps_write (pfile, deps_stream,
> +		  72, deps_format == DEPS_FMT_NONE);
> +    }
>   
>     /* Report on headers that could use multiple include guards.  */
>     if (CPP_OPTION (pfile, print_include_names))
> diff --git a/libcpp/mkdeps.cc b/libcpp/mkdeps.cc
> index 30e87d8b4d7..e1253202d9f 100644
> --- a/libcpp/mkdeps.cc
> +++ b/libcpp/mkdeps.cc
> @@ -81,7 +81,8 @@ public:
>     };
>   
>     mkdeps ()
> -    : module_name (NULL), cmi_name (NULL), is_header_unit (false), quote_lwm (0)
> +    : primary_output (NULL), module_name (NULL), cmi_name (NULL)
> +    , is_header_unit (false), is_exported (false), quote_lwm (0)
>     {
>     }
>     ~mkdeps ()
> @@ -90,6 +91,9 @@ public:
>   
>       for (i = targets.size (); i--;)
>         free (const_cast <char *> (targets[i]));
> +    free (const_cast <char *> (primary_output));
> +    for (i = outputs.size (); i--;)
> +      free (const_cast <char *> (outputs[i]));
>       for (i = deps.size (); i--;)
>         free (const_cast <char *> (deps[i]));
>       for (i = vpath.size (); i--;)
> @@ -103,6 +107,8 @@ public:
>   public:
>     vec<const char *> targets;
>     vec<const char *> deps;
> +  const char * primary_output;
> +  vec<const char *> outputs;
>     vec<velt> vpath;
>     vec<const char *> modules;
>   
> @@ -110,6 +116,7 @@ public:
>     const char *module_name;
>     const char *cmi_name;
>     bool is_header_unit;
> +  bool is_exported;
>     unsigned short quote_lwm;
>   };
>   
> @@ -288,6 +295,21 @@ deps_add_default_target (class mkdeps *d, const char *tgt)
>       }
>   }
>   
> +/* Adds an output O.  We make a copy, so it need not be a permanent
> +   string.  */
> +void
> +deps_add_output (struct mkdeps *d, const char *o, bool is_primary)
> +{
> +  o = apply_vpath (d, o);
> +  if (is_primary)
> +  {
> +    if (d->primary_output)
> +      d->outputs.push (d->primary_output);
> +    d->primary_output = xstrdup (o);
> +  } else
> +    d->outputs.push (xstrdup (o));
> +}
> +
>   void
>   deps_add_dep (class mkdeps *d, const char *t)
>   {
> @@ -325,12 +347,13 @@ deps_add_vpath (class mkdeps *d, const char *vpath)
>   
>   void
>   deps_add_module_target (struct mkdeps *d, const char *m,
> -			const char *cmi, bool is_header_unit)
> +			const char *cmi, bool is_header_unit, bool is_exported)
>   {
>     gcc_assert (!d->module_name);
>     
>     d->module_name = xstrdup (m);
>     d->is_header_unit = is_header_unit;
> +  d->is_exported = is_exported;
>     d->cmi_name = xstrdup (cmi);
>   }
>   
> @@ -387,7 +410,7 @@ make_write_vec (const mkdeps::vec<const char *> &vec, FILE *fp,
>      .PHONY targets for all the dependencies too.  */
>   
>   static void
> -make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> +make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax, int extra)

Instead of adding the "extra" parameter...

>   {
>     const mkdeps *d = pfile->deps;
>   
> @@ -398,7 +421,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
>     if (d->deps.size ())
>       {
>         column = make_write_vec (d->targets, fp, 0, colmax, d->quote_lwm);
> -      if (CPP_OPTION (pfile, deps.modules) && d->cmi_name)
> +      if (extra && CPP_OPTION (pfile, deps.modules) && d->cmi_name)
>   	column = make_write_name (d->cmi_name, fp, column, colmax);
>         fputs (":", fp);
>         column++;
> @@ -412,7 +435,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
>     if (!CPP_OPTION (pfile, deps.modules))
>       return;

...how about checking CPP_OPTION for p1689r5 mode here?

>   
> -  if (d->modules.size ())
> +  if (extra && d->modules.size ())
>       {
>         column = make_write_vec (d->targets, fp, 0, colmax, d->quote_lwm);
>         if (d->cmi_name)
> @@ -423,7 +446,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
>         fputs ("\n", fp);
>       }
>   
> -  if (d->module_name)
> +  if (extra && d->module_name)
>       {
>         if (d->cmi_name)
>   	{
> @@ -455,7 +478,7 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
>   	}
>       }
>     
> -  if (d->modules.size ())
> +  if (extra && d->modules.size ())
>       {
>         column = fprintf (fp, "CXX_IMPORTS +=");
>         make_write_vec (d->modules, fp, column, colmax, 0, ".c++m");
> @@ -468,9 +491,203 @@ make_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
>   /* Really we should be opening fp here.  */
>   
>   void
> -deps_write (const cpp_reader *pfile, FILE *fp, unsigned int colmax)
> +deps_write (const struct cpp_reader *pfile, FILE *fp, unsigned int colmax,
> +	    int extra)
> +{
> +  make_write (pfile, fp, colmax, extra);
> +}
> +
> +static bool
> +is_utf8 (const char *name)

Can we share utf8 parsing code with decode_utf8_char in pretty-print.cc?

> +{
> +  int byte_length = 0;
> +  int expected_continuations = 0;
> +  uint32_t codepoint = 0;
> +
> +  for (const char* c = name; *c; c++)
> +    {
> +      int byte = *c;
> +
> +      // Check for a continuation character.
> +      if ((byte & 0xc0) == 0x80) // 10xx_xxxx
> +	{
> +	  --expected_continuations;
> +
> +	  if (expected_continuations < 0)
> +	    return false;
> +
> +	  // Add the encoded bits to the codepoint.
> +	  codepoint = (codepoint << 6) + (byte & ~0x3f);
> +
> +	  continue;
> +	}
> +
> +      // Make sure we don't expect more continuation characters
> +      // once we don't find one.
> +      if (expected_continuations != 0)
> +	return false;
> +
> +      // Invalid because if they appear, it is an attempt to encode an ASCII
> +      // character in 2 bytes.
> +      if (byte == 0xc0 || byte == 0xc1) // 1100_0000 || 1100_0001
> +	return false;
> +
> +      // Invalid because they would encode codepoints greater than allowed
> +      // (0x10FFFF).
> +      if (byte > 0xf4)
> +	return false;
> +
> +      // ED-prefixed sequence encoding UTF-16 surrogate halves.
> +      if (0xD0FF <= codepoint && codepoint <= 0xDFFF)
> +	return false;
> +
> +      // 0x10FFFF is the highest valid codepoint for UTF-8.
> +      if (0x10FFFF < codepoint)
> +	return false;
> +
> +      // Overlong encoding of a codepoint.
> +      if (byte_length == 2 && codepoint < 0x0800)
> +	return false;
> +      if (byte_length == 3 && codepoint < 0x10000)
> +	return false;
> +
> +      // Single byte character.
> +      if ((byte & 0x80) == 0x00) // 0xxx_xxxx
> +	{
> +	  codepoint = byte;
> +	  byte_length = expected_continuations = 0;
> +	}
> +      // Prefix codes.
> +      else if ((byte & 0xe0) == 0xc0) // 110x_xxxx
> +	{
> +	  codepoint = byte & 0x1f;
> +	  byte_length = expected_continuations = 1;
> +	}
> +      else if ((byte & 0xf0) == 0xe0) // 1110_xxxx
> +	{
> +	  codepoint = byte & 0x0f;
> +	  byte_length = expected_continuations = 2;
> +	}
> +      else if ((byte & 0xf8) == 0xf0) // 1111_0xxx
> +	{
> +	  codepoint = byte & 0x07;
> +	  byte_length = expected_continuations = 3;
> +	}
> +      else
> +	return false;
> +    }
> +
> +  if (expected_continuations != 0)
> +    return false;
> +
> +  return true;
> +}
> +
> +static void
> +p1689r5_write_filepath (const char *name, FILE *fp)
>   {
> -  make_write (pfile, fp, colmax);
> +  if (is_utf8 (name))
> +    {
> +      fputc ('"', fp);
> +      for (const char* c = name; *c; c++)
> +	{
> +	  // Escape control characters.
> +	  if (ISCNTRL (*c))
> +	    fprintf (fp, "\\u%04x", *c);
> +	  // JSON escape characters.
> +	  else if (*c == '"' || *c == '\\')
> +	    {
> +	      fputc ('\\', fp);
> +	      fputc (*c, fp);
> +	    }
> +	  // Everything else.
> +	  else
> +	    fputc (*c, fp);
> +	}
> +      fputc ('"', fp);
> +    }
> +  else
> +    {
> +      // TODO: print an error
> +    }
> +}
> +
> +static void
> +p1689r5_write_vec (const mkdeps::vec<const char *> &vec, FILE *fp)
> +{
> +  for (unsigned ix = 0; ix != vec.size (); ix++)
> +    {
> +      p1689r5_write_filepath (vec[ix], fp);
> +      if (ix < vec.size () - 1)
> +	fputc (',', fp);
> +      fputc ('\n', fp);
> +    }
> +}
> +
> +void
> +deps_write_p1689r5 (const struct mkdeps *d, FILE *fp)
> +{
> +  fputs ("{\n", fp);
> +
> +  fputs ("\"rules\": [\n", fp);
> +  fputs ("{\n", fp);
> +
> +  if (d->primary_output)
> +    {
> +      fputs ("\"primary-output\": ", fp);
> +      p1689r5_write_filepath (d->primary_output, fp);
> +      fputs (",\n", fp);
> +    }
> +
> +  if (d->outputs.size ())
> +    {
> +      fputs ("\"outputs\": [\n", fp);
> +      p1689r5_write_vec (d->outputs, fp);
> +      fputs ("],\n", fp);
> +    }
> +
> +  if (d->module_name)
> +    {
> +      fputs ("\"provides\": [\n", fp);
> +      fputs ("{\n", fp);
> +
> +      fputs ("\"logical-name\": ", fp);
> +      p1689r5_write_filepath (d->module_name, fp);
> +      fputs (",\n", fp);
> +
> +      fprintf (fp, "\"is-interface\": %s\n", d->is_exported ? "true" : "false");
> +
> +      // TODO: header-unit information
> +
> +      fputs ("}\n", fp);
> +      fputs ("],\n", fp);
> +    }
> +
> +  fputs ("\"requires\": [\n", fp);
> +  for (size_t i = 0; i < d->modules.size (); i++)
> +    {
> +      if (i != 0)
> +	fputs (",\n", fp);
> +      fputs ("{\n", fp);
> +
> +      fputs ("\"logical-name\": ", fp);
> +      p1689r5_write_filepath (d->modules[i], fp);
> +      fputs ("\n", fp);
> +
> +      // TODO: header-unit information
> +
> +      fputs ("}\n", fp);
> +    }
> +  fputs ("]\n", fp);
> +
> +  fputs ("}\n", fp);
> +
> +  fputs ("],\n", fp);
> +
> +  fputs ("\"version\": 0,\n", fp);
> +  fputs ("\"revision\": 0\n", fp);
> +
> +  fputs ("}\n", fp);
>   }
>   
>   /* Write out a deps buffer to a file, in a form that can be read back


  parent reply	other threads:[~2022-10-10 21:04 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-04 15:11 [PATCH RESEND 0/1] RFC: P1689R5 support Ben Boeckel
2022-10-04 15:12 ` [PATCH RESEND 1/1] p1689r5: initial support Ben Boeckel
2022-10-04 19:12   ` Harald Anlauf
2022-10-11 11:30     ` Ben Boeckel
2022-10-10 21:04   ` Jason Merrill [this message]
2022-10-11 11:42     ` Ben Boeckel
2022-10-18 12:18       ` Ben Boeckel
2022-10-20 15:39         ` Jason Merrill
2022-10-20 17:31           ` Ben Boeckel
2022-10-20 18:22             ` Jason Merrill
2022-10-10 20:21 ` [PATCH RESEND 0/1] RFC: P1689R5 support Jason Merrill
2022-10-13 17:08   ` David Malcolm
2022-10-18 12:22     ` Ben Boeckel
2022-10-19  7:21       ` Martin Liška

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=78b88b1d-b328-a140-3a27-d33a3d96f3b9@redhat.com \
    --to=jason@redhat.com \
    --cc=ben.boeckel@kitware.com \
    --cc=brad.king@kitware.com \
    --cc=fortran@gcc.gnu.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=gcc@gcc.gnu.org \
    --cc=me@benboeckel.net \
    --cc=nathan@acm.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).