From: Nathan Sidwell <nathan@acm.org>
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Jeff Law <law@redhat.com>, Sandra Loosemore <sandra@codesourcery.com>
Subject: Modules doc
Date: Fri, 20 Nov 2020 10:19:55 -0500
Message-ID: <9c066876-dfe5-778d-aaa4-dd343afe5d35@acm.org>

[-- Attachment #1: Type: text/plain, Size: 170 bytes --]

Here is an updated C++ modules documentation patch.  I'd be grateful for
review, especially a check that I'm not using too much implementor-speak.

nathan
-- 
Nathan Sidwell

[-- Attachment #2: 13-doc.diff --]
[-- Type: text/x-patch, Size: 21630 bytes --]

diff --git c/gcc/doc/cppopts.texi w/gcc/doc/cppopts.texi
index 7f1849d841f..e5ece92487b 100644
--- c/gcc/doc/cppopts.texi
+++ w/gcc/doc/cppopts.texi
@@ -139,6 +139,10 @@ this useless.
 
 This feature is used in automatic updating of makefiles.
 
+@item -Mno-modules
+@opindex Mno-modules
+Disable dependency generation for compiled module interfaces.
+
 @item -MP
 @opindex MP
 This option instructs CPP to add a phony target for each dependency
diff --git c/gcc/doc/invoke.texi w/gcc/doc/invoke.texi
index 02abac39de8..29ae36861ad 100644
--- c/gcc/doc/invoke.texi
+++ w/gcc/doc/invoke.texi
@@ -172,6 +172,7 @@ listing and explanation of the binary and decimal byte size prefixes.
 * Spec Files::          How to pass switches to sub-processes.
 * Environment Variables:: Env vars that affect GCC.
 * Precompiled Headers:: Compiling a header once, and using it many times.
+* C++ Modules::		Experimental C++20 module system.
 @end menu
 
 @c man begin OPTIONS
@@ -214,14 +215,21 @@ in the following sections.
 -faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new @gol
 -fconstexpr-depth=@var{n}  -fconstexpr-cache-depth=@var{n} @gol
 -fconstexpr-loop-limit=@var{n}  -fconstexpr-ops-limit=@var{n} @gol
+-fmodule-header@r{[}=@var{kind}@r{]} -fmodule-only -fmodules-ts @gol
+-fmodule-implicit-inline @gol
+-fmodule-mapper=@var{specification} @gol
+-fmodule-version-ignore @gol
 -fno-elide-constructors @gol
 -fno-enforce-eh-specs @gol
 -fno-gnu-keywords @gol
 -fno-implicit-templates @gol
 -fno-implicit-inline-templates @gol
--fno-implement-inlines  -fms-extensions @gol
+-fno-implement-inlines  @gol
+-fno-module-lazy @gol
+-fms-extensions @gol
 -fnew-inheriting-ctors @gol
 -fnew-ttp-matching @gol
+-fno-module-lazy @gol
 -fno-nonansi-builtins  -fnothrow-opt  -fno-operator-names @gol
 -fno-optional-diags  -fpermissive @gol
 -fno-pretty-templates @gol
@@ -233,12 +241,14 @@ in the following sections.
 -fvisibility-inlines-hidden @gol
 -fvisibility-ms-compat @gol
 -fext-numeric-literals @gol
+-flang-info-include-translate@r{[}=@var{name}@r{]} @gol
 -Wabi-tag  -Wcatch-value  -Wcatch-value=@var{n} @gol
 -Wno-class-conversion  -Wclass-memaccess @gol
 -Wcomma-subscript  -Wconditionally-supported @gol
 -Wno-conversion-null  -Wctad-maybe-unsupported @gol
 -Wctor-dtor-privacy  -Wno-delete-incomplete @gol
--Wdelete-non-virtual-dtor  -Wdeprecated-copy  -Wdeprecated-copy-dtor @gol
+-Wdelete-non-virtual-dtor  -Wdeprecated-copy -Wdeprecated-copy-dtor @gol
+-Winvalid-imported-macros @gol
 -Wno-deprecated-enum-enum-conversion -Wno-deprecated-enum-float-conversion @gol
 -Weffc++  -Wno-exceptions -Wextra-semi  -Wno-inaccessible-base @gol
 -Wno-inherited-variadic-ctor  -Wno-init-list-lifetime @gol
@@ -599,7 +609,7 @@ Objective-C and Objective-C++ Dialects}.
 -fpreprocessed  -ftabstop=@var{width}  -ftrack-macro-expansion  @gol
 -fwide-exec-charset=@var{charset}  -fworking-directory @gol
 -H  -imacros @var{file}  -include @var{file} @gol
--M  -MD  -MF  -MG  -MM  -MMD  -MP  -MQ  -MT @gol
+-M  -MD  -MF  -MG  -MM  -MMD  -MP  -Mno-modules -MQ  -MT @gol
 -no-integrated-cpp  -P  -pthread  -remap @gol
 -traditional  -traditional-cpp  -trigraphs @gol
 -U@var{macro}  -undef  @gol
@@ -1571,7 +1581,7 @@ name suffix).  This option applies to all following input files until
 the next @option{-x} option.  Possible values for @var{language} are:
 @smallexample
 c  c-header  cpp-output
-c++  c++-header  c++-cpp-output
+c++  c++-header  c++-system-header c++-user-header c++-cpp-output
 objective-c  objective-c-header  objective-c-cpp-output
 objective-c++ objective-c++-header objective-c++-cpp-output
 assembler  assembler-with-cpp
@@ -3056,6 +3066,53 @@ To save space, do not emit out-of-line copies of inline functions
 controlled by @code{#pragma implementation}.  This causes linker
 errors if these functions are not inlined everywhere they are called.
 
+@item -fmodules-ts
+@itemx -fno-modules-ts
+@opindex fmodules-ts
+@opindex fno-modules-ts
+Enable support for C++20 modules.  The @option{-fno-modules-ts} option
+is usually not needed, as that is the default.  Even though this is a
+C++20 feature, it is not currently implicitly enabled by selecting
+that standard version.
+
+@item -fmodule-header
+@itemx -fmodule-header=user
+@itemx -fmodule-header=system
+@opindex fmodule-header
+Compile as a header unit.
+
+@item -fmodule-implicit-inline
+@opindex fmodule-implicit-inline
+Member functions defined in their class definitions are not
+implicitly inline for modular code.  This differs from traditional
+C++ behavior, for good reasons.  However, it may cause
+difficulty when porting code.  This option makes such function
+definitions implicitly inline.  It does however generate an ABI
+incompatibility, so you must use it everywhere or nowhere.  (Such
+definitions outside of a named module remain implicitly inline,
+regardless.)
+
+@item -fno-module-lazy
+@opindex fno-module-lazy
+@opindex fmodule-lazy
+Disable lazy module importing and module mapper creation.
+
+@item -fmodule-mapper=@r{[}@var{hostname}@r{]}:@var{port}@r{[}?@var{ident}@r{]}
+@itemx -fmodule-mapper=|@var{program}@r{[}?@var{ident}@r{]} @var{args...}
+@itemx -fmodule-mapper==@var{socket}@r{[}?@var{ident}@r{]}
+@itemx -fmodule-mapper=<>@r{[}@var{fdinout}@r{]}@r{[}?@var{ident}@r{]}
+@itemx -fmodule-mapper=<@var{fdin}>@var{fdout}@r{[}?@var{ident}@r{]}
+@itemx -fmodule-mapper=@var{file}@r{[}?@var{ident}@r{]}
+@vindex CXX_MODULE_MAPPER @r{environment variable}
+@opindex fmodule-mapper
+An oracle to query for module name to filename mappings.  If
+unspecified the @env{CXX_MODULE_MAPPER} environment variable is used,
+and if that is unset, an in-process default is provided.
+
+@item -fmodule-only
+@opindex fmodule-only
+Only emit the Compiled Module Interface (CMI), inhibiting any object file.
+
 @item -fms-extensions
 @opindex fms-extensions
 Disable Wpedantic warnings about constructs used in MFC, such as implicit
@@ -3303,6 +3360,12 @@ for ISO C++11 onwards (@option{-std=c++11}, ...).
 Do not search for header files in the standard directories specific to
 C++, but do still search the other standard directories.  (This option
 is used when building the C++ library.)
+
+@item -flang-info-include-translate
+@itemx -flang-info-include-translate=@var{header}
+@opindex flang-info-include-translate
+Note include translation events.
+
 @end table
 
 In addition, these warning options have meanings only for C++ programs:
@@ -3460,6 +3523,13 @@ the variable declaration statement.
 
 @end itemize
 
+@item -Winvalid-imported-macros
+@opindex Winvalid-imported-macros
+@opindex Wno-invalid-imported-macros
+Verify that all imported macro definitions are valid at the end of
+compilation.  This is not enabled by default, as it requires
+additional processing to determine.
+
 @item -Wno-literal-suffix @r{(C++ and Objective-C++ only)}
 @opindex Wliteral-suffix
 @opindex Wno-literal-suffix
@@ -16728,6 +16798,11 @@ By default, the dump will contain messages about successful
 optimizations (equivalent to @option{-optimized}) together with
 low-level details about the analysis.
 
+@item -fdump-lang
+@opindex fdump-lang
+Dump language-specific information.  The file name is made by appending
+@file{.lang} to the source file name.
+
 @item -fdump-lang-all
 @itemx -fdump-lang-@var{switch}
 @itemx -fdump-lang-@var{switch}-@var{options}
@@ -16748,6 +16823,14 @@ Enable all language-specific dumps.
 Dump class hierarchy information.  Virtual table information is emitted
 unless '@option{slim}' is specified.  This option is applicable to C++ only.
 
+@item module
+Dump module information.  Options @option{lineno} (locations),
+@option{graph} (reachability), @option{blocks} (clusters),
+@option{uid} (serialization), @option{alias} (mergeable),
+@option{asmname} (Elrond), @option{eh} (mapper) & @option{vops}
+(macros) may provide additional information.  This option is
+applicable to C++ only.
+
 @item raw
 Dump the raw internal tree data.  This option is applicable to C++ only.
 
@@ -32492,3 +32575,275 @@ precompiled header, the actual behavior is a mixture of the
 behavior for the options.  For instance, if you use @option{-g} to
 generate the precompiled header but not when using it, you may or may
 not get debugging information for routines in the precompiled header.
+
+@node C++ Modules
+@section C++ Modules
+@cindex speed of compilation
+
+Modules are a C++20 language feature.  As the name suggests, they
+provide a modular compilation system, intended to provide both
+faster builds and better library isolation.  The ``Merging Modules''
+paper, @uref{https://wg21.link/p1103}, provides the easiest to read set
+of changes to the standard, although it does not capture later
+changes.  That specification is now part of C++20,
+@uref{git@@github.com:cplusplus/draft.git}, and is considered complete
+(there may be defect reports to come).
+
+@emph{G++'s modules support is not complete.}  Other than bugs, the
+missing pieces are:
+
+@table @emph
+
+@item Private Module Fragment
+The Private Module Fragment is recognized, but an error is emitted.
+
+@item Partition definition visibility rules
+Entities may be defined in implementation partitions, and those
+definitions are not available outside of the module.  This is not
+implemented.
+
+@item Textual merging of reachable GM entities
+Entities may be multiply defined across different header-units.
+These must be de-duplicated, and this is implemented across imports,
+or when an import redefines a textually-defined entity.  However the
+reverse is not implemented -- textually redefining an entity that has
+been defined in an imported header-unit.  A redefinition error will be
+emitted.
+
+@item Translation-Unit local referencing rules
+Papers p1815 (@uref{https://wg21.link/p1815}) and p2003
+(@uref{https://wg21.link/p2003}) add limitations on which entities an
+exported region may reference (for instance, the entities an exported
+template definition may reference).  These are not fully implemented.
+
+@end table
+
+Modular compilation is @emph{not} enabled with just the
+@option{-std=c++20} option.  You must explicitly enable it with the
+@option{-fmodules-ts} option.  It is independent of the language
+version selected, although in pre-C++20 versions, it is of course an
+extension.
+
+No new source file suffixes are required or supported.  If you wish to
+use a non-standard suffix (@pxref{Overall Options}), you also need
+to provide a @option{-x c++} option.@footnote{Some users like to
+distinguish module interface files with a new suffix, such as naming
+the source @file{module.cppm}, which involves
+teaching all tools about the new suffix.  A different scheme, such as
+naming @file{module-m.cpp}, would be less invasive.}
+
+Compiling a module interface unit produces an additional output (beyond
+the assembly or object file), called a Compiled Module Interface
+(CMI).  This encodes the exported declarations of the module.
+Importing a module reads in the CMI.  The import graph is a Directed
+Acyclic Graph (DAG).  You must build imports before the importer.
+
+Header files may themselves be compiled to header units, which are a
+transitional ability aiming at faster compilation.  The
+@option{-fmodule-header} option is used to enable this, and implies
+the @option{-fmodules-ts} option.  These CMIs are named by the fully
+resolved underlying header file, and thus may be a complete pathname
+containing subdirectories.  If the header file is found at an absolute
+pathname, the CMI location is still relative to a CMI root directory.
+
+As header files often have no suffix, you commonly have to specify a
+@option{-x} option to tell the compiler the source is a header file.
+You may use @option{-x c++-header}, @option{-x c++-user-header} or
+@option{-x c++-system-header}.  When used in conjunction with
+@option{-fmodules-ts}, these all imply an appropriate
+@option{-fmodule-header} option.  The latter two variants will use the
+user or system include path to search for the file specified.  This
+allows you to, for instance, compile standard library header files as
+header units, without needing to know exactly where they are
+installed.  Specifying the language as one of these variants also
+inhibits output of the object file, as header files have no associated
+object file.
+
+When creating an output CMI any missing directory components are
+created in a manner that is safe for concurrent builds creating
+multiple, different, CMIs within a common subdirectory tree.
+
+CMI contents are written to a temporary file, which is then atomically
+renamed.  Observers will either see old contents (if there is an
+existing file), or complete new contents.  They will not observe the
+CMI during its creation.  This is unlike object file writing, which
+may be observed by an external process.
+
+CMIs are read in lazily, if the host OS provides @code{mmap}
+functionality.  Generally blocks are read when name lookup or template
+instantiation occurs.  To inhibit this, the @option{-fno-module-lazy}
+option may be used.
+
+The @option{-fmodule-only} option disables generation of the
+associated object file for compiling a module interface.  Only the CMI
+is generated.  This option is implied when using the
+@option{-fmodule-header} option.
+
+The @option{--param lazy-modules=@var{n}} parameter controls the limit
+on the number of concurrently open module files during lazy loading.
+Should more modules be imported, an LRU algorithm is used to determine
+which files to close -- until that file is needed again.  This limit
+may be exceeded with deep module dependency hierarchies.  With large
+code bases there may be more imports than the process limit of file
+descriptors.  By default, the limit is a few less than the per-process
+file descriptor hard limit, if that is determinable.@footnote{Where
+applicable the soft limit is incremented as needed towards the hard limit.}
+
+The @option{-flang-info-include-translate} option notes whether
+include translation occurs.  With no argument, all include translation
+is noted.  Otherwise, queries about include translation of a specific
+header file are noted.  The latter form may be repeated.  This option
+may be helpful in determining whether include translation is
+happening: when it is working correctly, it behaves as if it were not
+there at all.
+
+The @option{-Winvalid-imported-macros} option causes all imported macros
+to be resolved at the end of compilation.  Without this, imported
+macros are only resolved when expanded or (re)defined.  This option
+will detect conflicting import definitions for all macros.
+
+The @option{-fmodule-mapper} family of options is described below.
+
+@menu
+* C++ Module Mapper::       Module Mapper
+* C++ Module Preprocessing::  Module Preprocessing
+@end menu
+
+@node C++ Module Mapper
+@subsection Module Mapper
+@cindex C++ Module Mapper
+
+A module mapper provides a server or file that the compiler queries to
+determine the mapping between module names and CMI files.  It is also
+used to build CMIs on demand.  @emph{Mapper functionality is in its
+infancy and is intended for experimentation with build system
+interactions.}
+
+A mapper may be specified with the @option{-fmodule-mapper=@var{val}}
+option or @env{CXX_MODULE_MAPPER} environment variable.  The value may
+have one of the following forms:
+
+@table @gcctabopt
+
+@item @r{[}@var{hostname}@r{]}:@var{port}@r{[}?@var{ident}@r{]}
+An optional hostname and a numeric port number to connect to.  If the
+hostname is omitted, the loopback address is used.  If the hostname
+corresponds to multiple IPv6 addresses, these are tried in turn, until
+one is successful.  If your host lacks IPv6, this form is
+non-functional.  If you must use IPv4, use
+@option{-fmodule-mapper='|ncat @var{ipv4host} @var{port}'}.
+
+@item =@var{socket}@r{[}?@var{ident}@r{]}
+A local domain socket.  If your host lacks local domain sockets, this
+form is non-functional.
+
+@item |@var{program}@r{[}?@var{ident}@r{]} @r{[}@var{args...}@r{]}
+A program to spawn, and communicate with on its stdin/stdout streams.
+Your @env{PATH} environment variable is searched for the program.
+Arguments are separated by space characters (it is not possible for
+one of the arguments delivered to the program to contain a space).  An
+exception is if @var{program} begins with @@.  In that case
+@var{program} (sans @@) is looked for in the compiler's internal
+binary directory.  Thus the sample mapper-server can be specified
+with @code{@@g++-mapper-server}.
+
+@item <>@r{[}?@var{ident}@r{]}
+@itemx <>@var{fdinout}@r{[}?@var{ident}@r{]}
+@itemx <@var{fdin}>@var{fdout}@r{[}?@var{ident}@r{]}
+File descriptors to communicate over.  The first form, @option{<>},
+communicates over stdin and stdout.  The second form specifies a
+bidirectional file descriptor and the last form allows specifying
+two independent descriptors.  Note that other compiler options might
+cause the compiler to read stdin or write stdout.
+
+@item @var{file}@r{[}?@var{ident}@r{]}
+A mapping file consisting of space-separated module-name, filename
+pairs, one per line.  Only the mappings for the direct imports and any
+module export name need be provided.  If other mappings are provided,
+they override those stored in any imported CMI files.  A repository
+root may be specified in the mapping file by using @samp{$root} as the
+module name in the first active line.
+
+@end table
+
+As shown, an optional @var{ident} may suffix the first word of the
+option, indicated by a @samp{?} prefix.  The value is used in the
+initial handshake with the module server, or to specify a prefix on
+mapping file lines.  In the server case, the main source file name is
+used if no @var{ident} is specified.  In the file case, all non-blank
+lines are significant, unless a value is specified, in which case only
+lines beginning with @var{ident} are significant.  The @var{ident}
+must be separated by whitespace from the module name.  Be aware that
+@samp{<}, @samp{>}, @samp{?}, and @samp{|} characters are often
+significant to the shell, and therefore may need quoting.
+
+The mapper is connected to or loaded lazily, when the first module
+mapping is required.  The networking protocols are only supported on
+hosts that provide networking.  If no mapper is specified a default is
+provided.
+
+A project-specific mapper is expected to be provided by the build
+system that invokes the compiler.  It is not expected that a
+general-purpose server is provided for all compilations.  As such, the
+server will know the build configuration, the compiler it invoked, and
+the environment (such as working directory) in which that is
+operating.  As it may parallelize builds, several compilations may
+connect to the same socket.
+
+The default mapper generates CMI files in a @samp{gcm.cache}
+directory.  CMI files have a @samp{.gcm} suffix.  The module unit name
+is used directly to provide the basename.  Header units construct a
+relative path using the underlying header file name.  If the path is
+already relative, a @samp{,} directory is prepended.  Internal
+@samp{..} components are translated to @samp{,,}.  No attempt is made
+to canonicalize these filenames beyond that done by the preprocessor's
+include search algorithm, as in general it is ambiguous when symbolic
+links are present.
+
+The mapper protocol was published as ``A Module Mapper''
+@uref{https://wg21.link/p1184}.  The implementation is provided by
+@command{libcody}, @uref{https://www.github.com/urnathan/libcody},
+which specifies the canonical protocol definition.  A proof of concept
+server implementation embedded in @command{make} was described in
+``Make Me A Module'', @uref{https://wg21.link/p1602}.
+
+@node C++ Module Preprocessing
+@subsection Module Preprocessing
+@cindex C++ Module Preprocessing
+
+Modules affect preprocessing because of header units and include
+translation.  Some uses of the preprocessor as a separate step will
+either not produce a correct output, or require CMIs to be available.
+
+Header units import macros.  These macros can affect later conditional
+inclusion, which therefore can cascade to differing import sets.  When
+preprocessing, it is necessary to load the CMI.  If a header unit is
+unavailable, the preprocessor will issue a warning and continue (when
+not just preprocessing, an error is emitted).  Detecting such imports
+requires preprocessor tokenization of the input stream to phase 4
+(macro expansion).
+
+Include translation converts @code{#include}, @code{#include_next} and
+@code{#import} directives to internal @code{import} declarations.
+Whether a particular directive is translated is controlled by the
+module mapper.  Header unit names are canonicalized during
+preprocessing.
+
+Dependency information can be emitted for macro import, extending the
+functionality of @option{-MD} and @option{-MMD} options.  Detection of
+import declarations also requires phase 4 preprocessing, and thus
+requires full preprocessing (or compilation).
+
+The @option{-M}, @option{-MM} and @option{-E -fdirectives-only} options halt
+preprocessing before phase 4.
+
+The @option{-save-temps} option will use @option{-fdirectives-only}
+for preprocessing, and preserve the macro definitions in the
+preprocessed output.  Usually you will also want to use
+@option{-fdirectives-only} when explicitly preprocessing a header-unit,
+or consuming such preprocessed output:
+
+@smallexample
+g++ -fmodules-ts -E -fdirectives-only my-header.hh -o my-header.ii
+g++ -x c++-header -fmodules-ts -fpreprocessed -fdirectives-only my-header.ii
+@end smallexample
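
As an aid to checking the default mapper naming rules described in the Module Mapper subsection, here is a minimal Python sketch of those rules as the text states them. It is not taken from the compiler sources. The function name `default_cmi_path` and its parameters are hypothetical, and details the text does not spell out (such as how partition names or a leading `./` are handled) are deliberately left out.

```python
def default_cmi_path(name, is_header_unit=False, root="gcm.cache"):
    """Illustrative only: map a module or header-unit name to a CMI path.

    Follows the prose rules: CMIs live under a 'gcm.cache' directory and
    carry a '.gcm' suffix; named modules use their name as the basename;
    header units build a relative path from the header file name.
    """
    if not is_header_unit:
        # Named module: the module unit name is used directly.
        return root + "/" + name + ".gcm"

    parts = name.split("/")
    if name.startswith("/"):
        # Absolute header path: the CMI location stays relative to the root,
        # so drop the empty component produced by the leading slash.
        parts = parts[1:]
    else:
        # Relative header path: a ',' directory is prepended.
        parts = [","] + parts

    # Internal '..' components are translated to ',,'.
    parts = [",," if part == ".." else part for part in parts]
    return root + "/" + "/".join(parts) + ".gcm"


if __name__ == "__main__":
    print(default_cmi_path("foo"))
    print(default_cmi_path("/usr/include/stdio.h", is_header_unit=True))
    print(default_cmi_path("inc/../my.hh", is_header_unit=True))
```

No canonicalization is attempted here, matching the statement that filenames are not canonicalized beyond what the preprocessor's include search already does.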

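The mapping-file form of `-fmodule-mapper` can be illustrated the same way. The sketch below reads the format as the patch text describes it: whitespace-separated module-name and filename pairs, one per line, an optional `$root` entry on the first active line, and an optional ident prefix that selects which lines are significant. The function name `parse_mapping_file` is hypothetical, and rejecting lines that are not exactly a pair is an assumption of this sketch, not documented compiler behavior.

```python
def parse_mapping_file(text, ident=None):
    """Illustrative only: return (root, mapping) for a mapper file's text.

    With no ident, every non-blank line is significant.  With an ident,
    only lines whose first word is that ident are significant, and the
    ident is stripped before the pair is read.
    """
    root = None
    mapping = {}
    seen_active_line = False

    for line in text.splitlines():
        words = line.split()
        if not words:
            continue  # blank lines are never significant
        if ident is not None:
            if words[0] != ident:
                continue  # line belongs to some other ident
            words = words[1:]
        if len(words) != 2:
            # Assumption: treat anything but a name/filename pair as an error.
            raise ValueError("expected 'module-name filename': " + line)

        name, filename = words
        if name == "$root" and not seen_active_line:
            # '$root' on the first active line names the repository root.
            root = filename
        else:
            mapping[name] = filename
        seen_active_line = True

    return root, mapping


if __name__ == "__main__":
    sample = "$root build/cmi\nfoo foo.gcm\n\nbar sub/bar.gcm\n"
    print(parse_mapping_file(sample))
```

Later lines overwrite earlier ones for the same module name here; the text does not say how duplicates are treated, so that choice is only a convenience of the sketch.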