* [Bug c/61922] New: Recursive #include overruns Win32 MAX_PATH due to lack of path canonization
@ 2014-07-26 17:16 ilya.konstantinov at gmail dot com
  2014-07-27 17:24 ` [Bug preprocessor/61922] " ilya.konstantinov at gmail dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: ilya.konstantinov at gmail dot com @ 2014-07-26 17:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61922

            Bug ID: 61922
           Summary: Recursive #include overruns Win32 MAX_PATH due to lack
                    of path canonization
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ilya.konstantinov at gmail dot com

Created attachment 33190
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33190&action=edit
Minimal testcase

When #include "..." is relative to the including file's directory, clang
performs simple path concatenation.

FOR EXAMPLE, if foo/foo.h includes "../bar/bar.h", clang will ask the OS to
open "foo/../bar/bar.h" -- i.e. it will not canonize the path to "bar/bar.h".

Normally, the OS handles this under the hood. However, Win32 CreateFile only
accepts paths of up to 260 (a.k.a. MAX_PATH) characters.

With sufficiently long and contrived chains of relative #includes, which
actually occurred in a real-life project of mine, the concatenated path can
overrun MAX_PATH and result in an erroneous "File not found".
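
[Editor's sketch] The growth described above can be illustrated with a few
lines of C. This is only a model of "simple path concatenation", not GCC's
actual include-resolution code; the helper name concat_include_path is
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not GCC's real code): build the path of an included
   file by taking the directory part of the including file and appending the
   #include operand verbatim, without collapsing "..".
   Returns the length the full result needs, like snprintf. */
int concat_include_path(const char *includer, const char *operand,
                        char *out, size_t out_size)
{
    const char *slash;
    int dir_len;

    assert(includer != NULL && operand != NULL && out != NULL && out_size > 0);
    slash = strrchr(includer, '/');
    /* Directory part including its trailing '/', or empty if there is none. */
    dir_len = slash ? (int)(slash - includer) + 1 : 0;
    return snprintf(out, out_size, "%.*s%s", dir_len, includer, operand);
}
```

Feeding each result back in as the next includer adds one "dir/../" pair per
level, so the string keeps growing even when the file it names sits only one
directory away.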

TESTCASE: I'm attaching a reproducing testcase. On Microsoft Visual C++, it
compiles correctly. clang has the same issue:
http://llvm.org/bugs/show_bug.cgi?id=20440



* [Bug preprocessor/61922] Recursive #include overruns Win32 MAX_PATH due to lack of path canonization
From: ilya.konstantinov at gmail dot com @ 2014-07-27 17:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61922

--- Comment #1 from Ilya Konstantinov <ilya.konstantinov at gmail dot com> ---
Of course, I meant to say:

When #include "..." is relative to the including file's directory, **gcc**
performs simple path concatenation.



* [Bug preprocessor/61922] Recursive #include overruns Win32 MAX_PATH due to lack of path canonization
From: pinskia at gcc dot gnu.org @ 2014-07-27 17:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61922

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The issue is complex due to symbolic links. Also, this sounds like a bug in the
MSVC runtime library if you are using mingw. What happens on Cygwin for
open/fopen?



* [Bug preprocessor/61922] Recursive #include overruns Win32 MAX_PATH due to lack of path canonization
From: ilya.konstantinov at gmail dot com @ 2014-07-27 17:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61922

--- Comment #3 from Ilya Konstantinov <ilya.konstantinov at gmail dot com> ---
Both mingw and cygwin will eventually call CreateFile, there's no way around
that. The 260 characters limitation is on CreateFile.

Re symlinks - see discussion on the llvm issue. I'm not proposing textual
canonization since it will not respect symlinks. However, applying
GetFullPathName each time will be much better than the current state.

Here's an example:

> cd c:\dev\test\
> gcc test.c

test.c:
#include "foo/foo.h"

   BEFORE: CreateFile("foo/foo.h")

   AFTER:  GetFullPathName("foo/foo.h") --> CreateFile("c:\dev\foo\foo.h")

foo\foo.h:
#include "../bar/bar.h"

   BEFORE: CreateFile("foo/../bar/bar.h")

   AFTER:  GetFullPathName("c:\dev\foo\../bar/bar.h") -->
CreateFile("c:\dev\bar\bar.h")

bar\bar.h:
#include "../baz/baz.h"

   BEFORE: CreateFile("foo/../bar/../baz/baz.h")

   AFTER:  GetFullPathName("c:\dev\bar\../baz/baz.h") -->
CreateFile("c:\dev\baz\baz.h")
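
[Editor's sketch] The AFTER steps above amount to normalizing each path right
before it is opened. A minimal C sketch follows, assuming the ANSI variant
GetFullPathNameA from <windows.h>. The wrapper name normalize_before_open is
hypothetical, and the non-Windows branch is only a placeholder that copies the
path unchanged so the example builds elsewhere.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical wrapper: normalize `path` into `out` before opening it.
   Returns 0 on success, -1 if the call fails or the result does not fit. */
int normalize_before_open(const char *path, char *out, size_t out_size)
{
    assert(path != NULL && out != NULL && out_size > 0);
#ifdef _WIN32
    {
        /* GetFullPathNameA returns the number of characters written (without
           the terminating NUL), the required buffer size if the buffer is
           too small, or 0 on failure. */
        DWORD n = GetFullPathNameA(path, (DWORD)out_size, out, NULL);
        return (n == 0 || n >= out_size) ? -1 : 0;
    }
#else
    {
        /* Placeholder for non-Windows builds: copy the path unchanged. */
        int n = snprintf(out, out_size, "%s", path);
        return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
    }
#endif
}
```

Because every include would start from an already-normalized directory, the
string handed to the API at each step stays close to the length of the real
file path instead of accumulating "../" pairs.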



* [Bug preprocessor/61922] Recursive #include overruns Win32 MAX_PATH due to lack of path canonization
From: pinskia at gcc dot gnu.org @ 2014-07-27 17:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61922

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Except gcc never calls CreateFile directly. Please file a bug with Cygwin and
mingw instead.



* [Bug preprocessor/61922] Recursive #include overruns Win32 MAX_PATH due to lack of path canonization
From: ilya.konstantinov at gmail dot com @ 2014-07-27 18:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61922

--- Comment #5 from Ilya Konstantinov <ilya.konstantinov at gmail dot com> ---
cygwin and mingw can document a limitation such as:

fopen will truncate paths to MAX_PATH.

Once cygwin or mingw receives a monstrous path, there's little it can do.
CreateFile won't accept it, and neither will GetFullPathName. The only thing
left is manually canonizing it, which is tedious and error-prone -- cygwin/mingw
will have to implement its own directory traversal, textually processing the
path, and respecting ".", ".." and symlinks along the way.

Meanwhile, gcc can make an effort not to pass a monstrous path in the first
place, by canonizing it after every #include.



* [Bug preprocessor/61922] Recursive #include overruns Win32 MAX_PATH due to lack of path canonization
From: ilya.konstantinov at gmail dot com @ 2014-07-28  1:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61922

--- Comment #6 from Ilya Konstantinov <ilya.konstantinov at gmail dot com> ---
OK, I've done some more research:

1) It turns out Windows doesn't handle parent directories from symlinks the way
*nix does, but rather just follows them out of the symlink. Same for junctions
a.k.a. mount points. Thus, "textual canonization" is actually possible.

2) This issue is not present on cygwin gcc. cygwin implements its own path
canonization (on each 'open'). In fact, it has to, since it uses NtCreateFile
which performs no path processing of its own. [1] As a result, gcc on cygwin
should support canonical paths up to 32K, and (pre-canonization) paths of any
size. Furthermore, cygwin preserves the POSIX semantics -- i.e. "symlink/.."
leads out of the symlink's target.

3) My gcc build (from the Android NDK) is a mingw build, so it uses MSVCRT
fopen, which is a thin wrapper around CreateFile. Since we're not going to
change MSVCRT, should it be a MINGW issue?

[1] see https://cygwin.com/viewvc/src/winsup/cygwin/path.cc?view=markup
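
[Editor's sketch] Point 1 says that ".." can be resolved textually on Windows.
A minimal C sketch of such a textual pass is below. The function name
collapse_dot_segments is hypothetical, drive prefixes and UNC roots get no
special treatment, and output separators are normalized to '/'. It is not the
cygwin implementation, and it would give wrong answers wherever POSIX
"symlink/.." semantics apply.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int is_sep(char c) { return c == '/' || c == '\\'; }

/* Hypothetical helper: textually collapse "." and ".." components.
   Accepts '/' and '\\' as separators and emits '/'.
   Returns 0 on success, -1 if `out` is too small. */
int collapse_dot_segments(const char *in, char *out, size_t out_size)
{
    size_t len = 0;   /* bytes written so far */
    size_t base = 0;  /* ".." never pops below this offset */
    int rooted;

    assert(in != NULL && out != NULL && out_size > 0);
    rooted = is_sep(in[0]);
    if (rooted) {
        if (out_size < 2) return -1;
        out[len++] = '/';
        base = 1;
    }
    while (*in != '\0') {
        const char *start;
        size_t n;
        int dotdot;

        while (is_sep(*in)) in++;
        start = in;
        while (*in != '\0' && !is_sep(*in)) in++;
        n = (size_t)(in - start);
        if (n == 0 || (n == 1 && start[0] == '.'))
            continue;  /* empty component or "." */
        dotdot = (n == 2 && start[0] == '.' && start[1] == '.');
        if (dotdot && len > base) {
            /* Drop the last kept component and the separator before it. */
            while (len > base && out[len - 1] != '/') len--;
            if (len > base) len--;
            continue;
        }
        if (dotdot && rooted)
            continue;  /* ".." at the root stays at the root */
        /* Append a normal component, or a leading ".." of a relative path. */
        if (len + n + 2 > out_size) return -1;
        if (len > 0 && out[len - 1] != '/') out[len++] = '/';
        memcpy(out + len, start, n);
        len += n;
        if (dotdot) base = len;  /* a kept ".." must not be popped later */
    }
    if (len == 0) {
        if (out_size < 2) return -1;
        out[len++] = '.';
    }
    out[len] = '\0';
    return 0;
}
```

Applied after every #include, a pass like this would keep the path from the
example in comment #3 at "baz/baz.h" rather than "foo/../bar/../baz/baz.h".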

