From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <paulhollinsky@gmail.com>
Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com
 [IPv6:2607:f8b0:4864:20::633])
 by sourceware.org (Postfix) with ESMTPS id 8FB193858D32
 for <gcc-patches@gcc.gnu.org>; Mon, 22 Aug 2022 17:39:55 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 8FB193858D32
Received: by mail-pl1-x633.google.com with SMTP id 20so10561374plo.10
 for <gcc-patches@gcc.gnu.org>; Mon, 22 Aug 2022 10:39:55 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=in-reply-to:content-disposition:mime-version:references:message-id
 :subject:cc:to:from:date:x-gm-message-state:from:to:cc;
 bh=eYZflo2krENRxf1UcdWVvIeX8JNh/t/HDZk1sSWESkQ=;
 b=S46Y1wuF0cEL14wX+YR+vv9lSkh/ZLUMqmJ8XlfwfKTDtwyyWg119iv/zvgF+bJvqB
 AG02AIZ57arcJK2S8AZD2f35chUrmTGT6MyxNTAB8pIvmLY08wRsizljrsnTQEJTQzNC
 LEVcY3sLgElTxoByyy7TRqH1RpQSMFREywO+GzfLPt46cpXR8pf8dvKhIFzecirHaqGj
 MnqJDyubbzzmDM/+wK2cDqyOs395b1S3Hq17IkwBC2vkDoMvAlYYpGwWUKgJkX5cSioW
 dInb1YV5F+YsDGg6f2HCKz9Hi/MmIvuQgN6HZqDBQbgZiHuJP55uul//kjc9wIWIpEBv
 +kEg==
X-Gm-Message-State: ACgBeo0bQVvAf9mHa47YrC4mZjMJVfXWaD2omuAcAFYN8AUq8Xejnpwh
 PUsV2Qj3h01XS1v26S87+Kk=
X-Google-Smtp-Source: AA6agR6aDC4o1YqfqaH6ObZusuwT8KKYK1FxVJgVMCisHzW/Fmq+r3So6W0diX2UU5nk/3YwpuaYeQ==
X-Received: by 2002:a17:902:b217:b0:172:bd6c:814d with SMTP id
 t23-20020a170902b21700b00172bd6c814dmr18532262plr.55.1661189994241; 
 Mon, 22 Aug 2022 10:39:54 -0700 (PDT)
Received: from perseverance.local (76-14-114-239.rk.wavecable.com.
 [76.14.114.239]) by smtp.gmail.com with ESMTPSA id
 a5-20020a170902ecc500b0016efbccf2c0sm5541698plh.56.2022.08.22.10.39.52
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Mon, 22 Aug 2022 10:39:53 -0700 (PDT)
Date: Mon, 22 Aug 2022 10:39:51 -0700
From: Paul Hollinsky <paulhollinsky@gmail.com>
To: Nathan Sidwell <nathan@acm.org>
Cc: gcc-patches@gcc.gnu.org, Jason Merrill <jason@redhat.com>,
 David Malcom <dmalcolm@redhat.com>, Per Bothner <per@bothner.com>,
 Tom Tromey <tom@tromey.com>
Subject: Re: Ping [PATCH V2] libcpp: Optimize #pragma once with a hash table
 [PR58770]
Message-ID: <20220822173951.tm4h5tjtvpo5sb4j@perseverance.local>
References: <20220706230321.48041-1-paulhollinsky@gmail.com>
 <20220801051839.1454-1-paulhollinsky@gmail.com>
 <20220819202734.wvourvrje3x5vfla@perseverance.local>
 <b8bcc0a2-25d9-a6e7-0e37-e58254bf7230@acm.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <b8bcc0a2-25d9-a6e7-0e37-e58254bf7230@acm.org>
X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Aug 2022 17:39:58 -0000

On Mon, Aug 22, 2022 at 09:19:29AM -0400, Nathan Sidwell wrote:
> On 8/19/22 16:27, Paul Hollinsky wrote:
> > Hi all,
> > 
> > Would love some feedback on this patch!
> > 
> > Thanks,
> > Paul
> > 
> > On Mon, Aug 01, 2022 at 05:18:40AM +0000, Paul Hollinsky wrote:
> > > Rather than traversing the all_files linked list for every include,
> > > this factors out the quick idempotency checks (modification time
> > > and size) to be the keys in a hash table so we can find matching
> > > files quickly.
> 
> I'm probably confused, but why is the existing idempotency logic
> insufficiently optimal?  Can it be improved directly?

The idempotency logic breaks down to a couple integer comparisons, I doubt
it can be made much faster. Usually a file will be skipped after the very
first idempotency check. The reason it's suboptimal right now is that it's
O(n^2) behavior -- for each include that's seen we follow the entire linked
list and do those integeger comparisons.

The cost of hashing, performing a lookup against the hash table, and then
traversing the list of results (any hash collisions) is much lower than
traversing the full list of includes. I suspect that has something to do
with cache locality, or lack thereof, when accessing each member of the
linked list. It's likely also just far fewer instructions executed. 

> 
> WRT using modification time as a key.  I don't see how that provides
> sufficient additional randomization to file size?  Groups of headers are
> often installed with the same mtime.

This is a fair point, and I was going to say something here about the
codebases being compiled, but git also doesn't preserve mtime.

The real reason mtime is because that's what was done in the previous
implementation, and I felt it best not to mess with.

But, if there are better attributes that could be used without adding
much overhead we could use those instead. I was personally worried that
reading the file to perform any sort of hash would be expensive.

In any case, thank you very much for the feedback!

--Paul

> 
> > > 
> > > The hash table value type is a linked list, in case more than one
> > > file matches the quick check.
> > > 
> > > The table is only built if a once-only file is seen, so include
> > > guard performance is not affecte
> > > 
> > > My laptop would previously complete Ricardo's benchmark from the
> > > PR in ~1.1s using #pragma once, and ~0.35s using include guards.
> > > 
> > > After this change, both benchmarks now complete in ~0.35s. I did
> > > have to randomize the modification dates on the benchmark headers
> > > so the files did not all end up in the same hash table list, but
> > > that would likely not come up outside of the contrived benchmark.
> > > 
> > > I bootstrapped and ran the testsuite on x86_64 Darwin, as well as
> > > ppc64le and aarch64 Linux.
> > > 
> > > libcpp/ChangeLog:
> > > 
> > > 	PR preprocessor/58770
> > > 	* internal.h: Add hash table for #pragma once
> > > 	* files.cc: Optimize #pragma once with the hash table
> > > 
> > > Signed-off-by: Paul Hollinsky <paulhollinsky@gmail.com>
> > > ---
> > >   libcpp/files.cc   | 116 +++++++++++++++++++++++++++++++++++++++++++---
> > >   libcpp/internal.h |   3 ++
> > >   2 files changed, 112 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/libcpp/files.cc b/libcpp/files.cc
> > > index 24208f7b0f8..d4ffd77578e 100644
> > > --- a/libcpp/files.cc
> > > +++ b/libcpp/files.cc
> > > @@ -167,6 +167,33 @@ struct file_hash_entry_pool
> > >     struct cpp_file_hash_entry pool[FILE_HASH_POOL_SIZE];
> > >   };
> > > +/* A set of attributes designed to quickly identify obviously different files
> > > +   in a hashtable.  Just in case there are collisions, we still maintain a
> > > +   list.  These sub-lists can then be checked for #pragma once rather than
> > > +   interating through all_files.  */
> > > +struct file_quick_idempotency_attrs
> > > +{
> > > +  file_quick_idempotency_attrs(const _cpp_file *f)
> > > +    : mtime(f->st.st_mtime), size(f->st.st_size) {}
> > > +
> > > +  time_t mtime;
> > > +  off_t size;
> > > +
> > > +  static hashval_t hash (/* _cpp_file* */ const void *p);
> > > +};
> > > +
> > > +/* Sub-list of very similar files kept in a hashtable to check for #pragma
> > > +   once.  */
> > > +struct file_sublist
> > > +{
> > > +  _cpp_file *f;
> > > +  file_sublist *next;
> > > +
> > > +  static int eq (/* _cpp_file* */ const void *p,
> > > +		 /* file_sublist* */ const void *q);
> > > +  static void del (/* file_sublist* */ void *p);
> > > +};
> > > +
> > >   static bool open_file (_cpp_file *file);
> > >   static bool pch_open_file (cpp_reader *pfile, _cpp_file *file,
> > >   			   bool *invalid_pch);
> > > @@ -849,17 +876,17 @@ has_unique_contents (cpp_reader *pfile, _cpp_file *file, bool import,
> > >     if (!pfile->seen_once_only)
> > >       return true;
> > > -  /* We may have read the file under a different name.  Look
> > > -     for likely candidates and compare file contents to be sure.  */
> > > -  for (_cpp_file *f = pfile->all_files; f; f = f->next_file)
> > > +  /* We may have read the file under a different name.  We've kept
> > > +     similar looking files in this lists under this hash table, so
> > > +     check those more thoroughly.  */
> > > +  void* ent = htab_find(pfile->pragma_once_files, file);
> > > +  for (file_sublist *e = static_cast<file_sublist *> (ent); e; e = e->next)
> > >       {
> > > +      _cpp_file *f = e->f;
> > >         if (f == file)
> > >   	continue; /* It'sa me!  */
> > > -      if ((import || f->once_only)
> > > -	  && f->err_no == 0
> > > -	  && f->st.st_mtime == file->st.st_mtime
> > > -	  && f->st.st_size == file->st.st_size)
> > > +      if ((import || f->once_only) && f->err_no == 0)
> > >   	{
> > >   	  _cpp_file *ref_file;
> > > @@ -895,6 +922,38 @@ has_unique_contents (cpp_reader *pfile, _cpp_file *file, bool import,
> > >     return true;
> > >   }
> > > +/* Add the given file to the #pragma once table so it can be
> > > +   quickly identified and excluded the next time it's seen.  */
> > > +static void
> > > +update_pragma_once_table (cpp_reader *pfile, _cpp_file *file)
> > > +{
> > > +  void **slot = htab_find_slot (pfile->pragma_once_files, file, INSERT);
> > > +  if (slot)
> > > +    {
> > > +      if (!*slot)
> > > +	*slot = xcalloc (1, sizeof (file_sublist));
> > > +
> > > +      file_sublist *e = static_cast<file_sublist *> (*slot);
> > > +      while (e->f)
> > > +	{
> > > +	  if (!e->next)
> > > +	    {
> > > +	      void *new_sublist = xcalloc(1, sizeof (file_sublist));
> > > +	      e->next = static_cast<file_sublist *> (new_sublist);
> > > +	    }
> > > +	  e = e->next;
> > > +	}
> > > +
> > > +      e->f = file;
> > > +    }
> > > +  else
> > > +    {
> > > +      cpp_error (pfile, CPP_DL_ERROR,
> > > +		 "Unable to create #pragma once table space for %s",
> > > +		 _cpp_get_file_name (file));
> > > +    }
> > > +}
> > > +
> > >   /* Place the file referenced by FILE into a new buffer on the buffer
> > >      stack if possible.  Returns true if a buffer is stacked.  Use LOC
> > >      for any diagnostics.  */
> > > @@ -950,6 +1009,9 @@ _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, include_type type,
> > >         if (!has_unique_contents (pfile, file, type == IT_IMPORT, loc))
> > >   	return false;
> > > +      if (pfile->seen_once_only && file->once_only)
> > > +	update_pragma_once_table (pfile, file);
> > > +
> > >         if (pfile->buffer && file->dir)
> > >   	sysp = MAX (pfile->buffer->sysp, file->dir->sysp);
> > > @@ -1434,6 +1496,41 @@ nonexistent_file_hash_eq (const void *p, const void *q)
> > >     return filename_cmp ((const char *) p, (const char *) q) == 0;
> > >   }
> > > +/* Hasher for the #pragma once hash table.  */
> > > +hashval_t
> > > +file_quick_idempotency_attrs::hash (const void *p)
> > > +{
> > > +  const _cpp_file *f = static_cast<const _cpp_file *> (p);
> > > +  file_quick_idempotency_attrs kh (f);
> > > +  return iterative_hash_object (kh, 0);
> > > +}
> > > +
> > > +/* Equality checker for the #pragma once hash table.  */
> > > +int
> > > +file_sublist::eq (const void *p, const void *q)
> > > +{
> > > +  /* Just check if the file q would be in the list p. Every
> > > +     file in the list should have these attributes the same,
> > > +     so we don't need to traverse.  */
> > > +  const file_sublist *e = static_cast<const file_sublist *> (p);
> > > +  const _cpp_file *f = static_cast<const _cpp_file *> (q);
> > > +  return f->st.st_mtime == e->f->st.st_mtime
> > > +	 && f->st.st_size == e->f->st.st_size;
> > > +}
> > > +
> > > +/* Cleanup for a file sub-list. Does not free the _cpp_file
> > > +   structures within.  */
> > > +void
> > > +file_sublist::del (void *p)
> > > +{
> > > +  file_sublist *e = static_cast<file_sublist *> (p);
> > > +  if (e->next)
> > > +    {
> > > +      file_sublist::del (e->next);
> > > +      free (e->next);
> > > +    }
> > > +}
> > > +
> > >   /* Initialize everything in this source file.  */
> > >   void
> > >   _cpp_init_files (cpp_reader *pfile)
> > > @@ -1442,6 +1539,10 @@ _cpp_init_files (cpp_reader *pfile)
> > >   					NULL, xcalloc, free);
> > >     pfile->dir_hash = htab_create_alloc (127, file_hash_hash, file_hash_eq,
> > >   					NULL, xcalloc, free);
> > > +  pfile->pragma_once_files = htab_create_alloc (127,
> > > +					file_quick_idempotency_attrs::hash,
> > > +					file_sublist::eq, file_sublist::del,
> > > +					xcalloc, free);
> > >     allocate_file_hash_entries (pfile);
> > >     pfile->nonexistent_file_hash = htab_create_alloc (127, htab_hash_string,
> > >   						    nonexistent_file_hash_eq,
> > > @@ -1456,6 +1557,7 @@ _cpp_cleanup_files (cpp_reader *pfile)
> > >   {
> > >     htab_delete (pfile->file_hash);
> > >     htab_delete (pfile->dir_hash);
> > > +  htab_delete (pfile->pragma_once_files);
> > >     htab_delete (pfile->nonexistent_file_hash);
> > >     obstack_free (&pfile->nonexistent_file_ob, 0);
> > >     free_file_hash_entries (pfile);
> > > diff --git a/libcpp/internal.h b/libcpp/internal.h
> > > index badfd1b40da..9c3c46df335 100644
> > > --- a/libcpp/internal.h
> > > +++ b/libcpp/internal.h
> > > @@ -485,6 +485,9 @@ struct cpp_reader
> > >        been used.  */
> > >     bool seen_once_only;
> > > +  /* Optimization for #pragma once.  */
> > > +  struct htab *pragma_once_files;
> > > +
> > >     /* Multiple include optimization.  */
> > >     const cpp_hashnode *mi_cmacro;
> > >     const cpp_hashnode *mi_ind_cmacro;
> > > -- 
> > > 2.34.1
> > > 
> 
> 
> -- 
> Nathan Sidwell