public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
From: "matthijs at stdin dot nl" <gcc-bugzilla@gcc.gnu.org> To: gcc-bugs@gcc.gnu.org Subject: [Bug preprocessor/80753] __has_include and __has_include_next taints subsequent I/O errors Date: Wed, 23 Sep 2020 16:00:31 +0000 [thread overview] Message-ID: <bug-80753-4-ke2XL7TPw2@http.gcc.gnu.org/bugzilla/> (raw) In-Reply-To: <bug-80753-4@http.gcc.gnu.org/bugzilla/> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80753 --- Comment #4 from Matthijs Kooijman <matthijs at stdin dot nl> --- I looked a bit at the source, and it seems the the problem is (not surprisingly) that `__has_include` causes the header filename to be put into the cache, and an error message is only generated when the entire include path has been processed without resolving the entry from the cache (essentially this means that an error is only triggered when the entry is put into the cache, *except* when it happens as part of `__has_include`). Relevant function is _cpp_find_file: https://github.com/gcc-mirror/gcc/blob/da13b7737662da11f8fefb28eaf4ed7c50c51767/libcpp/files.c#L506 This is called with kind = _cpp_FFK_HAS_INCLUDE for `__has_include` which prevents an error here: https://github.com/gcc-mirror/gcc/blob/da13b7737662da11f8fefb28eaf4ed7c50c51767/libcpp/files.c#L591-L592 And the cache entry is immediately returned on subsequent calls here: https://github.com/gcc-mirror/gcc/blob/da13b7737662da11f8fefb28eaf4ed7c50c51767/libcpp/files.c#L523-L526 It seems there are continuous lookups in the cache and it took me a while to realize how the cache actually works (I initially thought that maybe files that were *not* found) were not actually put in the cache, but AFAIU the `pfile->file_hash` cache works like this: - Hash slots are indexed by filename - Each slot contains a linked list of entries. - Each entry contains the directory lookups start from. This can be the start_dir, but also the first quote_include or bracket_include dir. Iow, the cache entry is only valid for a lookup that starts at that directory, or has progressed to that directory, to allow a "" include to prime the cache for a <> include as well. - Once a file is found, or the search path is exhausted, the result is stored in the cache for start_dir and for "" and <> includes if the start dir for those has been passed. Anyway, this means that a failed lookup based from __has_include() will cause the failed result to be put into the cache for one or more directories without emitting an error and always be returned for subsequent includes without emitting an error. An obvious fix would be to simply not put the result in the cache for _cpp_FFK_HAS_INCLUDE, but that would be a missed cache opportunity. An alternative would be to add a boolean "error_emitted" to each _cpp_file* in the cache (cannot add it to the cache entry itself, since the same _cpp_file* might end up in different cache entries), that defaults to false and is set to true by open_file_failed. When kind is not _cpp_FFK_HAS_INCLUDE, and returning a cache entry that has "error_emitted" set to false and has an errno call open_file_failed to emit an error. This would require that open_file_failed to emit the right output in this case (i.e. the cached _cpp_file* should not rely on the context in which it was generated), but AFAICS this would be the case. I'm not quite familiar with building and patching gcc (and also really need to get this yak hair back to my actual work), so I probably won't be providing a patch here. Maybe my above analysis enables someone else to do so? :-)
next prev parent reply other threads:[~2020-09-23 16:00 UTC|newest] Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <bug-80753-4@http.gcc.gnu.org/bugzilla/> 2020-09-23 14:24 ` matthijs at stdin dot nl 2020-09-23 16:00 ` matthijs at stdin dot nl [this message] 2023-06-12 17:15 ` pinskia at gcc dot gnu.org 2023-06-13 9:35 ` jakub at gcc dot gnu.org 2023-06-15 12:17 ` cvs-commit at gcc dot gnu.org 2023-08-01 10:43 ` jakub at gcc dot gnu.org
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=bug-80753-4-ke2XL7TPw2@http.gcc.gnu.org/bugzilla/ \ --to=gcc-bugzilla@gcc.gnu.org \ --cc=gcc-bugs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).