From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk1-x72a.google.com (mail-qk1-x72a.google.com [IPv6:2607:f8b0:4864:20::72a]) by sourceware.org (Postfix) with ESMTPS id 4D62F38582BE for ; Mon, 22 Aug 2022 13:19:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4D62F38582BE Received: by mail-qk1-x72a.google.com with SMTP id f14so7796689qkm.0 for ; Mon, 22 Aug 2022 06:19:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:x-gm-message-state:from:to:cc; bh=0ZapL3C3WppSllbFfqmFtV7WjwkKqiY/6fwrJ3kgC5E=; b=0g3aum6i9dlmSd5NND/M7FfjQINw+w4Hn27sd0YRcMviNyIk++qixzgZtYpreBj1r3 FnXCksY6y99ohzAXS/3YmmjYF+IdKGRS4JtFfcRqj1j6M6EH4UXWLImiqN1R44IsjZau dt7HmbZ2MOucwEXAEaM7pCih5VwDnrJB9VpuDSA8NHFaCFTg7/WjexGx5X5stXN1GSPg 4gPw8oAjrX7lW1Ezty7Q8xcYRLePAcC1cm0UBsewx0+6zEfPjkvhAJEQPrE2JdtBIhZD XJD/ZyFX71onZ8dawhgE52kfjBMOGvFLyXaEnJQnFOXp6CInepQXBr1NRI1JU+LXiUWb XHHQ== X-Gm-Message-State: ACgBeo0btgeTEYt16kDqrug7P+7Yhv44nHL6DoSk2uDiujjDC7KNyK7B N5TO0h9W6ahD73lb+cY6so4= X-Google-Smtp-Source: AA6agR4rGsVtK7X08iK3dWcu05UEQtxAJcPA1rFwtwim0xbFlVXUhA3KzEyrFbHKRdxqVJhth/RhcA== X-Received: by 2002:a05:620a:1b8e:b0:6bb:f85e:6de0 with SMTP id dv14-20020a05620a1b8e00b006bbf85e6de0mr5691050qkb.618.1661174371410; Mon, 22 Aug 2022 06:19:31 -0700 (PDT) Received: from ?IPV6:2620:10d:c0a8:11c9::119a? ([2620:10d:c091:480::f0dc]) by smtp.googlemail.com with ESMTPSA id dm47-20020a05620a1d6f00b006b8fb2a1145sm10727263qkb.124.2022.08.22.06.19.30 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 22 Aug 2022 06:19:30 -0700 (PDT) Sender: Nathan Sidwell Message-ID: Date: Mon, 22 Aug 2022 09:19:29 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.12.0 Subject: Re: Ping [PATCH V2] libcpp: Optimize #pragma once with a hash table [PR58770] Content-Language: en-US To: Paul Hollinsky , gcc-patches@gcc.gnu.org Cc: Jason Merrill , David Malcom , Per Bothner , Tom Tromey References: <20220706230321.48041-1-paulhollinsky@gmail.com> <20220801051839.1454-1-paulhollinsky@gmail.com> <20220819202734.wvourvrje3x5vfla@perseverance.local> From: Nathan Sidwell In-Reply-To: <20220819202734.wvourvrje3x5vfla@perseverance.local> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3038.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Aug 2022 13:19:34 -0000 On 8/19/22 16:27, Paul Hollinsky wrote: > Hi all, > > Would love some feedback on this patch! > > Thanks, > Paul > > On Mon, Aug 01, 2022 at 05:18:40AM +0000, Paul Hollinsky wrote: >> Rather than traversing the all_files linked list for every include, >> this factors out the quick idempotency checks (modification time >> and size) to be the keys in a hash table so we can find matching >> files quickly. I'm probably confused, but why is the existing idempotency logic insufficiently optimal? Can it be improved directly? WRT using modification time as a key. I don't see how that provides sufficient additional randomization to file size? Groups of headers are often installed with the same mtime. >> >> The hash table value type is a linked list, in case more than one >> file matches the quick check. >> >> The table is only built if a once-only file is seen, so include >> guard performance is not affecte >> >> My laptop would previously complete Ricardo's benchmark from the >> PR in ~1.1s using #pragma once, and ~0.35s using include guards. >> >> After this change, both benchmarks now complete in ~0.35s. I did >> have to randomize the modification dates on the benchmark headers >> so the files did not all end up in the same hash table list, but >> that would likely not come up outside of the contrived benchmark. >> >> I bootstrapped and ran the testsuite on x86_64 Darwin, as well as >> ppc64le and aarch64 Linux. >> >> libcpp/ChangeLog: >> >> PR preprocessor/58770 >> * internal.h: Add hash table for #pragma once >> * files.cc: Optimize #pragma once with the hash table >> >> Signed-off-by: Paul Hollinsky >> --- >> libcpp/files.cc | 116 +++++++++++++++++++++++++++++++++++++++++++--- >> libcpp/internal.h | 3 ++ >> 2 files changed, 112 insertions(+), 7 deletions(-) >> >> diff --git a/libcpp/files.cc b/libcpp/files.cc >> index 24208f7b0f8..d4ffd77578e 100644 >> --- a/libcpp/files.cc >> +++ b/libcpp/files.cc >> @@ -167,6 +167,33 @@ struct file_hash_entry_pool >> struct cpp_file_hash_entry pool[FILE_HASH_POOL_SIZE]; >> }; >> >> +/* A set of attributes designed to quickly identify obviously different files >> + in a hashtable. Just in case there are collisions, we still maintain a >> + list. These sub-lists can then be checked for #pragma once rather than >> + interating through all_files. */ >> +struct file_quick_idempotency_attrs >> +{ >> + file_quick_idempotency_attrs(const _cpp_file *f) >> + : mtime(f->st.st_mtime), size(f->st.st_size) {} >> + >> + time_t mtime; >> + off_t size; >> + >> + static hashval_t hash (/* _cpp_file* */ const void *p); >> +}; >> + >> +/* Sub-list of very similar files kept in a hashtable to check for #pragma >> + once. */ >> +struct file_sublist >> +{ >> + _cpp_file *f; >> + file_sublist *next; >> + >> + static int eq (/* _cpp_file* */ const void *p, >> + /* file_sublist* */ const void *q); >> + static void del (/* file_sublist* */ void *p); >> +}; >> + >> static bool open_file (_cpp_file *file); >> static bool pch_open_file (cpp_reader *pfile, _cpp_file *file, >> bool *invalid_pch); >> @@ -849,17 +876,17 @@ has_unique_contents (cpp_reader *pfile, _cpp_file *file, bool import, >> if (!pfile->seen_once_only) >> return true; >> >> - /* We may have read the file under a different name. Look >> - for likely candidates and compare file contents to be sure. */ >> - for (_cpp_file *f = pfile->all_files; f; f = f->next_file) >> + /* We may have read the file under a different name. We've kept >> + similar looking files in this lists under this hash table, so >> + check those more thoroughly. */ >> + void* ent = htab_find(pfile->pragma_once_files, file); >> + for (file_sublist *e = static_cast (ent); e; e = e->next) >> { >> + _cpp_file *f = e->f; >> if (f == file) >> continue; /* It'sa me! */ >> >> - if ((import || f->once_only) >> - && f->err_no == 0 >> - && f->st.st_mtime == file->st.st_mtime >> - && f->st.st_size == file->st.st_size) >> + if ((import || f->once_only) && f->err_no == 0) >> { >> _cpp_file *ref_file; >> >> @@ -895,6 +922,38 @@ has_unique_contents (cpp_reader *pfile, _cpp_file *file, bool import, >> return true; >> } >> >> +/* Add the given file to the #pragma once table so it can be >> + quickly identified and excluded the next time it's seen. */ >> +static void >> +update_pragma_once_table (cpp_reader *pfile, _cpp_file *file) >> +{ >> + void **slot = htab_find_slot (pfile->pragma_once_files, file, INSERT); >> + if (slot) >> + { >> + if (!*slot) >> + *slot = xcalloc (1, sizeof (file_sublist)); >> + >> + file_sublist *e = static_cast (*slot); >> + while (e->f) >> + { >> + if (!e->next) >> + { >> + void *new_sublist = xcalloc(1, sizeof (file_sublist)); >> + e->next = static_cast (new_sublist); >> + } >> + e = e->next; >> + } >> + >> + e->f = file; >> + } >> + else >> + { >> + cpp_error (pfile, CPP_DL_ERROR, >> + "Unable to create #pragma once table space for %s", >> + _cpp_get_file_name (file)); >> + } >> +} >> + >> /* Place the file referenced by FILE into a new buffer on the buffer >> stack if possible. Returns true if a buffer is stacked. Use LOC >> for any diagnostics. */ >> @@ -950,6 +1009,9 @@ _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, include_type type, >> if (!has_unique_contents (pfile, file, type == IT_IMPORT, loc)) >> return false; >> >> + if (pfile->seen_once_only && file->once_only) >> + update_pragma_once_table (pfile, file); >> + >> if (pfile->buffer && file->dir) >> sysp = MAX (pfile->buffer->sysp, file->dir->sysp); >> >> @@ -1434,6 +1496,41 @@ nonexistent_file_hash_eq (const void *p, const void *q) >> return filename_cmp ((const char *) p, (const char *) q) == 0; >> } >> >> +/* Hasher for the #pragma once hash table. */ >> +hashval_t >> +file_quick_idempotency_attrs::hash (const void *p) >> +{ >> + const _cpp_file *f = static_cast (p); >> + file_quick_idempotency_attrs kh (f); >> + return iterative_hash_object (kh, 0); >> +} >> + >> +/* Equality checker for the #pragma once hash table. */ >> +int >> +file_sublist::eq (const void *p, const void *q) >> +{ >> + /* Just check if the file q would be in the list p. Every >> + file in the list should have these attributes the same, >> + so we don't need to traverse. */ >> + const file_sublist *e = static_cast (p); >> + const _cpp_file *f = static_cast (q); >> + return f->st.st_mtime == e->f->st.st_mtime >> + && f->st.st_size == e->f->st.st_size; >> +} >> + >> +/* Cleanup for a file sub-list. Does not free the _cpp_file >> + structures within. */ >> +void >> +file_sublist::del (void *p) >> +{ >> + file_sublist *e = static_cast (p); >> + if (e->next) >> + { >> + file_sublist::del (e->next); >> + free (e->next); >> + } >> +} >> + >> /* Initialize everything in this source file. */ >> void >> _cpp_init_files (cpp_reader *pfile) >> @@ -1442,6 +1539,10 @@ _cpp_init_files (cpp_reader *pfile) >> NULL, xcalloc, free); >> pfile->dir_hash = htab_create_alloc (127, file_hash_hash, file_hash_eq, >> NULL, xcalloc, free); >> + pfile->pragma_once_files = htab_create_alloc (127, >> + file_quick_idempotency_attrs::hash, >> + file_sublist::eq, file_sublist::del, >> + xcalloc, free); >> allocate_file_hash_entries (pfile); >> pfile->nonexistent_file_hash = htab_create_alloc (127, htab_hash_string, >> nonexistent_file_hash_eq, >> @@ -1456,6 +1557,7 @@ _cpp_cleanup_files (cpp_reader *pfile) >> { >> htab_delete (pfile->file_hash); >> htab_delete (pfile->dir_hash); >> + htab_delete (pfile->pragma_once_files); >> htab_delete (pfile->nonexistent_file_hash); >> obstack_free (&pfile->nonexistent_file_ob, 0); >> free_file_hash_entries (pfile); >> diff --git a/libcpp/internal.h b/libcpp/internal.h >> index badfd1b40da..9c3c46df335 100644 >> --- a/libcpp/internal.h >> +++ b/libcpp/internal.h >> @@ -485,6 +485,9 @@ struct cpp_reader >> been used. */ >> bool seen_once_only; >> >> + /* Optimization for #pragma once. */ >> + struct htab *pragma_once_files; >> + >> /* Multiple include optimization. */ >> const cpp_hashnode *mi_cmacro; >> const cpp_hashnode *mi_ind_cmacro; >> -- >> 2.34.1 >> -- Nathan Sidwell