From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=PHC3=CN=kitware.com=ben.boeckel@sourceware.org>
Received: from mail-qt1-x835.google.com (mail-qt1-x835.google.com [IPv6:2607:f8b0:4864:20::835])
	by sourceware.org (Postfix) with ESMTPS id C22C83858D35
	for <fortran@gcc.gnu.org>; Sun, 25 Jun 2023 17:08:21 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C22C83858D35
Authentication-Results: sourceware.org; dmarc=pass (p=reject dis=none) header.from=kitware.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=kitware.com
Received: by mail-qt1-x835.google.com with SMTP id d75a77b69052e-4009cc311d2so5186341cf.1
        for <fortran@gcc.gnu.org>; Sun, 25 Jun 2023 10:08:21 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=kitware.com; s=google; t=1687712901; x=1690304901;
        h=user-agent:in-reply-to:content-transfer-encoding
         :content-disposition:mime-version:references:message-id:subject:cc
         :to:from:date:from:to:cc:subject:date:message-id:reply-to;
        bh=hgolHcOsnX43Hq8WqP+1J4iJsihCTmeDvLrMkW/6zJM=;
        b=JGm1UZxaePeLCs8L9uI7BP5lFxe/6o9WYWPNDiqoKQFXZIZxz+/VYhURTSQ1gT7V3T
         0CYBYGFdMQhJJ+xBAkuKL3b3Q8yJJumQx6f325Al+/zSqh8l9LL4QjBUlr+f2ryvBnop
         A/VHkN+XBDnosQw30pqcD+b2kmE84hfoImOjI=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1687712901; x=1690304901;
        h=user-agent:in-reply-to:content-transfer-encoding
         :content-disposition:mime-version:references:message-id:subject:cc
         :to:from:date:x-gm-message-state:from:to:cc:subject:date:message-id
         :reply-to;
        bh=hgolHcOsnX43Hq8WqP+1J4iJsihCTmeDvLrMkW/6zJM=;
        b=bP7ax8x2xVdFbWedcJgaV3VF7985b5RkoY7hOBOlvclchcUC+PFmPSEBwnXGjN/ePj
         BlL8abQEAvabOX2nE2f0U4ZVwGKOK4nGnPz4OmW1up0bzcMLLKRRyDn9JjtCu4wE1OlA
         0AYzyda6cndurcMHWoBkmpYiY3ks051lwL4/Ut2/BT68saRj5Ab+40Cd5X+tNLgHitu3
         tog+Wm2efbzMtFSpTPATWUSkKERUWZ0UiBx6t1nvajXJfi1+JkTARu1si+p5HHj5QLec
         J5d26kxKoDQWjucVwjRUjweGNNeM/3c8fvkoJ6SFsnaUFydB4sOdmZsCSY8sWqXBCiUG
         kp+w==
X-Gm-Message-State: AC+VfDyAPdgXFv7G/JZM4og5JezeZjXgvTAzef/rade323fFwXEzOvsA
	zREfFwiq9NE0L4MrZXbrnUzjJg==
X-Google-Smtp-Source: ACHHUZ59y1v31k/WcgRv25ZzlxnG4+1qbG2wqvF99UWHsVjxFmeAB9OKynB8eBeU7y9JwrupUYY+Og==
X-Received: by 2002:a05:622a:493:b0:3fd:e870:ef0d with SMTP id p19-20020a05622a049300b003fde870ef0dmr29497323qtx.47.1687712900993;
        Sun, 25 Jun 2023 10:08:20 -0700 (PDT)
Received: from localhost (cpe-142-105-146-128.nycap.res.rr.com. [142.105.146.128])
        by smtp.gmail.com with ESMTPSA id t13-20020ac8760d000000b003f38aabb88asm2053942qtq.20.2023.06.25.10.08.20
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Sun, 25 Jun 2023 10:08:20 -0700 (PDT)
Date: Sun, 25 Jun 2023 13:08:19 -0400
From: Ben Boeckel <ben.boeckel@kitware.com>
To: Jason Merrill <jason@redhat.com>
Cc: gcc-patches@gcc.gnu.org, nathan@acm.org, fortran@gcc.gnu.org,
	gcc@gcc.gnu.org, brad.king@kitware.com
Subject: Re: [PATCH v5 3/5] p1689r5: initial support
Message-ID: <20230625170819.GE270821@farprobe>
References: <20230125210636.2960049-1-ben.boeckel@kitware.com>
 <20230125210636.2960049-4-ben.boeckel@kitware.com>
 <77ee5db6-e45e-5178-4807-5b2fef29e8c7@redhat.com>
 <20230620194649.GA186848@farprobe>
 <e9dc7075-2eae-5dcb-c134-0a7d906d4181@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <e9dc7075-2eae-5dcb-c134-0a7d906d4181@redhat.com>
User-Agent: Mutt/2.2.9 (2022-11-12)
X-Spam-Status: No, score=-5.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <fortran.gcc.gnu.org>

On Fri, Jun 23, 2023 at 14:31:17 -0400, Jason Merrill wrote:
> On 6/20/23 15:46, Ben Boeckel wrote:
> > On Tue, Feb 14, 2023 at 16:50:27 -0500, Jason Merrill wrote:
> >> On 1/25/23 13:06, Ben Boeckel wrote:
> 
> >>> Header units (including the standard library headers) are 100%
> >>> unsupported right now because the `-E` mechanism wants to import their
> >>> BMIs. A new mode (i.e., something more workable than existing `-E`
> >>> behavior) that mocks up header units as if they were imported purely
> >>> from their path and content would be required.
> >>
> >> I notice that the cpp dependency generation tries (in open_file_failed)
> >> to continue after encountering a missing file, is that not sufficient 
> >> for header units?  Or adjustable to be sufficient?
> > 
> > No. Header units can introduce macros which can be used to modify the
> > set of modules that are imported. Included headers are "discovered"
> > dependencies and don't modify the build graph (just add more files that
> > trigger a rebuild) and can be collected during compilation. Module
> > dependencies are needed to get the build correct in the first place in
> > order to:
> > 
> > - order module compilations in the build graph so that imported modules
> >   are ready before anything using them; and
> > - compute the set of flags needed for telling the compiler where
> >   imported modules' CMI files should be located.
> 
> So if the header unit CMI isn't available during dependency generation, 
> would it be better to just #include the header?

It's not so simple: a header unit is preprocessed independently of the
importer's macro state, so a plain `#include` would wrongly expose
`LOCAL_ONLY` to the header in this case:

```
#define LOCAL_ONLY 1
import <some/header.h>; // The preprocessing of this should *not* see
                        // `LOCAL_ONLY`.
```
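
As a rough sketch of the isolation I mean, assuming a toy model where
preprocessor state is nothing more than a macro map (`MacroTable`,
`scan_header_unit`, and `import_header_unit` are hypothetical names for
illustration, not libcpp APIs):

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model: preprocessor state is a macro-name -> replacement map.
using MacroTable = std::map<std::string, std::string>;

// Hypothetical: "preprocess" a header unit. It starts from the
// command-line macros only, never from the importer's current state.
// `header_defines` stands in for the `#define`s inside the header.
MacroTable scan_header_unit(const MacroTable& command_line,
                            const MacroTable& header_defines) {
    MacroTable state = command_line;  // fresh state; importer macros absent
    for (const auto& kv : header_defines)
        state[kv.first] = kv.second;
    return state;
}

// Hypothetical: importing makes the header unit's macros visible to the
// importer; nothing flows in the other direction.
void import_header_unit(MacroTable& importer, const MacroTable& header_unit) {
    for (const auto& kv : header_unit)
        importer[kv.first] = kv.second;
}

// Usage mirroring the snippet above.
void example() {
    MacroTable command_line;               // e.g. from -D flags; empty here
    MacroTable importer = command_line;
    importer["LOCAL_ONLY"] = "1";          // #define LOCAL_ONLY 1

    MacroTable header_defines;
    header_defines["HEADER_MACRO"] = "2";  // made-up macro from the header
    MacroTable unit = scan_header_unit(command_line, header_defines);
    assert(unit.count("LOCAL_ONLY") == 0); // the header never saw it

    import_header_unit(importer, unit);
    assert(importer.count("HEADER_MACRO") == 1);
    assert(importer.count("LOCAL_ONLY") == 1);
}
```

Textual inclusion would instead run the header against `importer`
directly, which is exactly the leak that has to be avoided.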

> > Hmm. But `stdout` is probably fine to use for both though. Basically:
> > 
> >      if (fdeps_stream == out_stream && fdeps_stream != stdout)
> >        make_diagnostic_noise ();
> 
> (fdeps_stream == deps_stream), but sure, that's reasonable.

Done.

> >> So, I take it this is the common use case you have in mind, generating
> >> Make dependencies for the p1689 file?  When are you thinking the Make
> >> dependencies for the .o are generated?  At build time?
> > 
> > Yes. If an included file changes, the scanning should be performed
> > again. The compilation will have its own `-MF` as well (which should
> > point to the same files plus the CMI files it ends up reading).
> > 
> >> I'm a bit surprised you're using .json rather than an extension that
> >> indicates what the information is.
> > 
> > I can change that; the filename doesn't *really* matter (e.g., CMake
> > uses `.ddi` for "dynamic dependency information").
> 
> That works.

Done.

> >>> `-M` is about discovered dependencies: those that you find out while
> >>> doing work. `-fdeps-*` is about ordering dependencies: extracting
> >>> information from file content in order to even order future work around.
> >>
> >> I'm not sure I see the distinction; Makefiles also express ordering
> >> dependencies.  In both cases, you want to find out from the files what
> >> order you will want to process them in when building the project.
> > 
> > Makefiles can express ordering dependencies, but not the `-M` snippets;
> > these are for files that, if changed, should trigger a rebuild. This is
> > fundamentally different than module dependencies which instead indicate
> > which *compiles* (or CMI generation if using a 2-phase setup) need to
> > complete before compilation (or CMI generation) of the scanned TU can be
> > performed. Generally generated headers will be ordered manually in the
> > build system description. However, maintaining that same level for
> > in-source dependency information on a per-source level is a *far* higher
> > burden.
> 
> The main difference I see is that the CMI might not exist yet.  As you 
> say, we don't want to require people to write all the dependencies by 
> hand, but that just means we need to be able to generate the 
> dependencies automatically.  In the Make-only model I'm thinking of, one 
> would collect dependencies on an initial failing build, and then start 
> over from the beginning again with the dependencies we discovered.  It's 
> the same two-phase scan and build, but one that uses the same compile 
> commands for both phases.

It's a potentially unbounded set of phases:

- 2 phases for each tool that gets built and then generates module-using
  code for other tools:

    - scan files for toolA
    - build files for toolA
    - scan files written by toolA (for toolB)
    - build files written by toolA (for toolB)
    - …

- if a referenced module does not exist, knowing when one is "done" is
  difficult (an import cycle would appear like this because while module
  X does exist, it depends on Y which itself claims a dependency on X).
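
To make the "when is one done" problem concrete, here is a minimal
sketch of what a collator has to do with the scan results: schedule
every TU whose imports are available and report whatever is left over.
`Scan`, `Plan`, and `order_compiles` are hypothetical names (not part of
the patch or of CMake), and real P1689 output carries more than these
three fields:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified scan result for one TU.
struct Scan {
    std::string source;                // the TU that was scanned
    std::string provides;              // module it provides ("" if none)
    std::vector<std::string> imports;  // modules it imports
};

struct Plan {
    std::vector<std::string> order;  // sources in a valid compile order
    std::vector<std::string> stuck;  // sources that can never be scheduled
};

// Repeatedly schedule every TU whose imports are all available.
// Anything left over either imports a module nobody provides or sits on
// (or behind) an import cycle; from here the two look the same.
Plan order_compiles(const std::vector<Scan>& scans) {
    Plan plan;
    std::set<std::string> available;  // modules whose CMI would exist
    std::vector<bool> done(scans.size(), false);
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t i = 0; i < scans.size(); ++i) {
            if (done[i]) continue;
            bool ready = true;
            for (const auto& m : scans[i].imports)
                if (!available.count(m)) { ready = false; break; }
            if (!ready) continue;
            done[i] = true;
            progress = true;
            plan.order.push_back(scans[i].source);
            if (!scans[i].provides.empty())
                available.insert(scans[i].provides);
        }
    }
    for (std::size_t i = 0; i < scans.size(); ++i)
        if (!done[i]) plan.stuck.push_back(scans[i].source);
    // Every TU ends up in exactly one of the two lists.
    assert(plan.order.size() + plan.stuck.size() == scans.size());
    return plan;
}
```

With X importing Y and Y importing X, both TUs land in `stuck`, and so
would a TU importing a module that some not-yet-run generator will
write later; telling those apart needs knowledge from outside the scan
results.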

*Something* needs to interpret the information being provided in the
`-fdeps-file=` and communicate back to the build tool what is going on.
This requires:

- some coordination of where each TU will store these files (e.g., CMake
  has a per-target directory for storing such things and the collator
  knows which TUs belong to which targets to tell everything about
  them); and
- some kind of coordination for the target-level dependency graph (i.e.,
  if libA depends-on only libB, a TU from libA using a module "from"
  libB is ok, but using one "from" libC is bad and the same if libB uses
  a module "from" libA).

Therefore any automation will need to have an idea of the overall build
graph and an understanding of "yes, module X exists, but this TU is not
allowed to use it because the graph doesn't allow that edge". I don't
think a "drop in" tool to any arbitrary `Makefile`-using project can
exist as there's just not enough structure in a `Makefile` to really
require such information. `automake` can provide one, however, because
(AFAIK) it *does* have an idea about these kinds of things (though it
may need to be more structured than it is today).
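
A minimal sketch of that edge check, assuming the collator is handed the
target graph and a module-to-target map (`BuildGraph`, `ImportCheck`,
and `check_import` are hypothetical names; this only looks at direct
dependencies, and whether transitive ones should count is a policy
decision I'm leaving out):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical build-graph description handed to the collator.
struct BuildGraph {
    // target -> targets it directly depends on
    std::map<std::string, std::set<std::string>> depends_on;
    // module name -> target whose TU provides it
    std::map<std::string, std::string> module_owner;
};

enum class ImportCheck { Ok, UnknownModule, EdgeNotAllowed };

// May a TU in `importing_target` import `module`?
ImportCheck check_import(const BuildGraph& g,
                         const std::string& importing_target,
                         const std::string& module) {
    auto owner = g.module_owner.find(module);
    if (owner == g.module_owner.end())
        return ImportCheck::UnknownModule;
    if (owner->second == importing_target)
        return ImportCheck::Ok;  // module from the same target
    auto deps = g.depends_on.find(importing_target);
    if (deps != g.depends_on.end() && deps->second.count(owner->second))
        return ImportCheck::Ok;  // the graph has this edge
    return ImportCheck::EdgeNotAllowed;  // module exists, edge does not
}

// Usage: libA depends only on libB.
void example_libs() {
    BuildGraph g;
    g.depends_on["libA"].insert("libB");
    g.module_owner["b"] = "libB";
    g.module_owner["c"] = "libC";
    g.module_owner["a"] = "libA";
    assert(check_import(g, "libA", "b") == ImportCheck::Ok);
    assert(check_import(g, "libA", "c") == ImportCheck::EdgeNotAllowed);
    assert(check_import(g, "libB", "a") == ImportCheck::EdgeNotAllowed);
}
```

The `EdgeNotAllowed` case is where a build tool can say "module c
exists, but libA does not depend on libC" instead of leaving it to the
linker.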

The `libcody` GNU Make support patch doesn't have this structured
information and (AFAIK) will happily let any TU import any module from
any other TU even if the linker is going to be unhappy later. Again, I
think we can give users (and they deserve) better error messages than
"<mangled module initializer symbol> not found".

> Anyway, this isn't an objection to this patch, just another model I also 
> want to support.

The old Intel Fortran "how to build with modules" docs used to have a
"run `make` until it works" instruction step. I think we can do better
these days (yes, it means that pure-`make` is far harder than it used to
be).

> >>> <snip JSON output diff>
> >>
> >> Is there a reason not to use the gcc/json.h interface for JSON output?
> > 
> > This is `libcpp`; is that not a dependency cycle?
> 
> Ah, indeed.  We could move it to libiberty, but it would need 
> significant adjustments to remove its dependencies on other stuff in 
> gcc/.  So maybe just add a TODO comment about that, along with adding 
> comments before the functions.

Done.

--Ben