* [Bug preprocessor/109485] New: Feature request: More efficient include path handling
@ 2023-04-12 11:35 dani.borg at outlook dot com
From: dani.borg at outlook dot com @ 2023-04-12 11:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109485

            Bug ID: 109485
           Summary: Feature request: More efficient include path handling
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: preprocessor
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dani.borg at outlook dot com
  Target Milestone: ---

When preprocessing, the method used to look up include paths is inefficient and
causes more file system calls than needed.
The current method simply tries each include path in order for every unique
#include directive, so the number of file system probes grows roughly as the
number of #include directives times the number of include paths. Basically
O(n*n) in complexity when both grow together.
The method scales poorly as the number of include paths increases, which can
cause high system load and long build times.

A smarter approach, which clang appears to be using, is to keep track of which
include paths don't contain the directory named in the include directive. File
system queries for paths that cannot exist can then be avoided.
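
As a rough illustration of the idea (not clang's or gcc's actual
implementation), the two strategies can be sketched in Python. The names
find_naive and DirCachingLookup, and the probes list used to record file system
queries, are hypothetical and exist only for this sketch. It also assumes the
directory layout does not change while the cache is alive.

```python
import os


def find_naive(include_dirs, name, probes):
    """Try every include directory in order: one file probe per directory."""
    for inc in include_dirs:
        candidate = os.path.join(inc, name)
        probes.append(candidate)  # record the file system query
        if os.path.isfile(candidate):
            return candidate
    return None


class DirCachingLookup:
    """Remember whether <include dir>/<subdirectory> exists (hypothetical helper)."""

    def __init__(self, include_dirs):
        self.include_dirs = list(include_dirs)
        self.dir_exists = {}  # directory path -> bool, filled lazily

    def find(self, name, probes):
        subdir = os.path.dirname(name)  # "b" for "b/a.h", "" for "a.h"
        for inc in self.include_dirs:
            parent = os.path.join(inc, subdir) if subdir else inc
            if parent not in self.dir_exists:
                probes.append(parent)  # one directory probe, then cached
                self.dir_exists[parent] = os.path.isdir(parent)
            if not self.dir_exists[parent]:
                continue  # directory is absent, so the header cannot be there
            candidate = os.path.join(inc, name)
            probes.append(candidate)
            if os.path.isfile(candidate):
                return candidate
        return None
```

With two include paths a and b, where only b/b exists and holds three headers,
find_naive records six probes for the three includes, while DirCachingLookup
records five (two directory checks plus three file checks). The saving grows
with the number of headers that share a directory prefix.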

To give a concrete example that compares gcc and clang, the following can be
run in bash on a Linux system. The example below only shows the relevant output
from strace.

#prepare - create 2 include paths, 3 headers and a source file including the headers
mkdir -p a b/b && touch b/b/a.h b/b/b.h b/b/c.h && echo -e '#include "b/a.h"\n#include "b/b.h"\n#include "b/c.h"' > a.c

#gcc
strace -f -e open,stat gcc -Ia -Ib -nostdinc a.c -E -o/dev/null
  [pid 12] open("a/b/a.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory)
  [pid 12] open("b/b/a.h", O_RDONLY|O_NOCTTY) = 4
  [pid 12] open("a/b/b.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory)
  [pid 12] open("b/b/b.h", O_RDONLY|O_NOCTTY) = 4
  [pid 12] open("a/b/c.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory)
  [pid 12] open("b/b/c.h", O_RDONLY|O_NOCTTY) = 4
  [pid 12] +++ exited with 0 +++

#clang
strace -f -e open,stat clang -Ia -Ib -nostdinc a.c -E -o/dev/null
  stat("a/b", 0x7ffd8d11ac20)             = -1 ENOENT (No such file or directory)
  stat("b/b", {st_mode=S_IFDIR|0755, st_size=55, ...}) = 0
  open("b/b/a.h", O_RDONLY|O_CLOEXEC)     = 3
  open("b/b/b.h", O_RDONLY|O_CLOEXEC)     = 3
  open("b/b/c.h", O_RDONLY|O_CLOEXEC)     = 3
  +++ exited with 0 +++

Note how clang first checks whether the directory part of the path ("a/b")
exists before testing the full path. Because that stat fails, all
open("a/b/...") calls can be avoided.
